-
Hello everyone,
I'm trying to create 20 kg wheels. They work with the open and split differentials, but they are very unstable with the viscous differential and even more unstable with the locked one.
T…
-
**Is your feature request related to a problem? Please describe.**
I currently run the encoder ONNX model to get the features, then prepare inputs such as `input_ids` and pass them to a separate decoder ONNX model multiple times.…
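For context, the encoder-once / decoder-many pattern described above looks roughly like this. To keep the sketch self-contained and runnable, the two ONNX sessions are stood in by plain Python callables (the real code would call `onnxruntime.InferenceSession.run` where the comments indicate); all names and the fake token logic are illustrative only:

```python
# Sketch of the encoder-once / decoder-many loop described above.
# The ONNX sessions are replaced by stand-in functions so this runs anywhere.

def encoder_run(input_ids):
    # Real code: encoder_sess.run(None, {"input_ids": input_ids})[0]
    return [float(t) for t in input_ids]  # fake "encoder features"

def decoder_run(input_ids, encoder_features):
    # Real code: decoder_sess.run(
    #     None, {"input_ids": input_ids, "encoder_hidden_states": encoder_features})[0]
    # Fake logic: just emit the next id modulo a tiny vocab.
    return (input_ids[-1] + 1) % 5

EOS_ID = 0

def generate(prompt_ids, max_new_tokens=10):
    features = encoder_run(prompt_ids)       # run the encoder once
    input_ids = list(prompt_ids)
    for _ in range(max_new_tokens):          # run the decoder repeatedly
        next_id = decoder_run(input_ids, features)
        input_ids.append(next_id)
        if next_id == EOS_ID:
            break
    return input_ids

print(generate([1, 2, 3]))  # → [1, 2, 3, 4, 0]
```

Merging the two graphs into one session (or reusing past key/value states) is the usual way to avoid paying the decoder-call overhead on every step.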
-
We have loads of spectroscopy models but few good tutorials. We need to improve the documentation with some examples for:
- electron impact excitation and recombination
- Stark pressure broadening
- …
-
#WIP
## Benchmark with [faster-whisper-large-v3-turbo-ct2](https://huggingface.co/deepdml/faster-whisper-large-v3-turbo-ct2)
For reference, here are the time and memory usage required to tr…
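Measurements like these can be taken with the standard library alone; a minimal sketch, with the actual transcription call stood in by a placeholder (the real call would be `WhisperModel.transcribe` from faster-whisper):

```python
import time
import tracemalloc

def transcribe_stub():
    # Placeholder for the real faster-whisper call, e.g.:
    #   segments, info = model.transcribe("audio.wav")
    return "hello world"

tracemalloc.start()
t0 = time.perf_counter()
result = transcribe_stub()
elapsed = time.perf_counter() - t0
_, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"time: {elapsed:.3f} s, peak traced memory: {peak_bytes / 1e6:.1f} MB")
```

Note that `tracemalloc` only sees Python-level allocations; GPU or native (CTranslate2) memory would need an external tool such as `nvidia-smi`.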
-
### Describe the issue as clearly as possible:
Encountered while working on PR [531](https://github.com/outlines-dev/outlines/pull/531)
When generating several samples for a prompt with the tran…
-
Hi there,
I was trying to run the model. After trying:
```
$ python transformer.py --predict uspto-50k/patents_test100.csv.can --model models/base-retrosynthesis.h5
```
I got “list index out of r…
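For what it's worth, a `list index out of range` in CSV handling typically means a row had fewer comma-separated fields than the code indexes into; a minimal illustration (not the actual parsing code from this repo, and the rows are made up):

```python
# Illustrative rows: the real input would be lines from the uspto-50k CSV.
rows = [
    "CCO>>CCBr,1",   # two fields: reaction SMILES and a label
    "CCO",           # malformed row: only one field
]

def safe_label(line):
    """Return the second comma-separated field, or None if the row is short."""
    fields = line.split(",")
    return fields[1] if len(fields) > 1 else None

for line in rows:
    print(safe_label(line))  # prints "1", then "None" instead of raising IndexError
```

Checking the test CSV for short or empty rows is usually the quickest way to confirm this kind of failure.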
-
**The bug**
Loading and prompting the transformer model `openbmb/MiniCPM-Llama3-V-2_5` does not work.
It tries to load the model, but according to nvtop nothing is allocated on my GPU. No error is …
-
## Motivation
In the current technological landscape, Generative AI (GenAI) workloads and models have gained widespread attention and popularity. Large Language Models (LLMs) have emerged as the dom…
-
Hello, I want to deploy a quantized llama-3-8b model using Triton Inference Server. I followed the steps below:
1. Create a container from the nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3 base image.
3.…
-
Thanks again for releasing all the Closed Book QA T5 checkpoints. The models are now fully ported to Hugging Face, and the models smaller than xl/3b can also be used on the inference API: https://huggingface.co/…