-
Hi,
Thanks again for your work!
I have a question regarding running inference without the CRF. In Table 4 of your paper, you report a 2-3 mIoU point decrease when disabling the CRF. Could you please p…
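For context on what the ablation toggles, here is a minimal sketch of DenseCRF post-processing applied to softmax outputs, using the common `pydensecrf` package; this is not necessarily the paper's exact setup, and the kernel parameters below are placeholder values:

```python
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

H, W, n_labels = 240, 320, 21
probs = np.random.rand(n_labels, H, W).astype(np.float32)     # dummy softmax scores
probs /= probs.sum(axis=0, keepdims=True)
image = np.random.randint(0, 255, (H, W, 3), dtype=np.uint8)  # dummy RGB input

d = dcrf.DenseCRF2D(W, H, n_labels)
d.setUnaryEnergy(unary_from_softmax(probs))
d.addPairwiseGaussian(sxy=3, compat=3)  # smoothness kernel
d.addPairwiseBilateral(sxy=80, srgb=13, rgbim=np.ascontiguousarray(image), compat=10)
refined = np.argmax(d.inference(5), axis=0).reshape(H, W)

# "Without CRF" simply means taking the argmax over probs directly:
raw = np.argmax(probs, axis=0)
```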
-
**server:** inf2.8xlarge
**vllm version**: 0.6.3.post2.dev77+g2394962d.neuron215
_Description_
Hello! I am trying to run the code below (the code was taken [here](https://docs.vllm.ai/en/v0.4.1/…
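Since the snippet is cut off, here is the standard offline-inference example from the vLLM quickstart that the link points to, reconstructed from the public docs (the model name is a stand-in; the Neuron build may need different model/device settings):

```python
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The capital of France is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="facebook/opt-125m")  # stand-in model
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(f"Prompt: {output.prompt!r}, Generated: {output.outputs[0].text!r}")
```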
-
Hi team,
Great paper! I'm looking forward to the code release for a personal project and for writing a blog post on LearnOpenCV.
-
Currently, `TimeSeriesDataSet` has the option to set the `predict_mode` flag to True; this allows using the whole sequence except the last portion, which is held out for testing purposes and will be predicted b…
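For readers who haven't used the flag, this is how `predict_mode` is typically enabled through `from_dataset(..., predict=True)`; the data below is a toy single-series frame, purely illustrative:

```python
import pandas as pd
from pytorch_forecasting import TimeSeriesDataSet

# toy single-series frame
data = pd.DataFrame({
    "time_idx": list(range(30)),
    "value": [float(i) for i in range(30)],
    "series": ["a"] * 30,
})

training = TimeSeriesDataSet(
    data,
    time_idx="time_idx",
    target="value",
    group_ids=["series"],
    max_encoder_length=10,
    max_prediction_length=5,
    time_varying_unknown_reals=["value"],
)

# predict=True turns on predict_mode: each series contributes a single
# sample whose decoder target is its final max_prediction_length steps
prediction_set = TimeSeriesDataSet.from_dataset(
    training, data, predict=True, stop_randomization=True
)
```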
-
## Issue encountered
Currently, inference with open models on my Mac is quite slow, since vllm does not support mps.
## Solution/Feature
Llama.cpp does support mps and would significantly spe…
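For comparison, reaching the Apple GPU today already works through the `llama-cpp-python` bindings, which use llama.cpp's Metal backend; a minimal sketch (the model path is a placeholder):

```python
from llama_cpp import Llama

# n_gpu_layers=-1 offloads all layers; on Apple Silicon builds of llama.cpp
# this runs on the GPU via Metal instead of falling back to the CPU
llm = Llama(model_path="./model.gguf", n_gpu_layers=-1)  # placeholder path

out = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(out["choices"][0]["text"])
```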
-
Config:
Windows 10 with an RTX 4090
All requirements installed, including the flash-attn build.
Server:
```
(venv) D:\PythonProjects\hertz-dev>python inference_server.py
Using device: cuda
Loaded tokeniz…
```
-
Could you add an easy-to-run inference script to the repository? I'm having trouble running edge extraction on an arbitrary input image using the checkpoint you provide in the README.
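For anyone else blocked on this, here is a rough skeleton of the shape such a script could take; the Sobel filter is only a runnable stand-in for the repo's network, since I'm not assuming anything about its model class or checkpoint layout:

```python
import torch
import torch.nn.functional as F
from PIL import Image
import torchvision.transforms.functional as TF

def sobel_edges(x: torch.Tensor) -> torch.Tensor:
    """Stand-in 'model': gradient magnitude via fixed Sobel kernels."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    ky = kx.t().contiguous()
    gray = x.mean(dim=1, keepdim=True)                   # (N, 1, H, W)
    gx = F.conv2d(gray, kx.view(1, 1, 3, 3), padding=1)
    gy = F.conv2d(gray, ky.view(1, 1, 3, 3), padding=1)
    return (gx.pow(2) + gy.pow(2)).sqrt()

# the real script would instead build the repo's model and call
# model.load_state_dict(torch.load("checkpoint.pth", map_location="cpu"))
img = TF.to_tensor(Image.open("input.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    edges = sobel_edges(img)
TF.to_pil_image((edges / edges.max()).squeeze(0)).save("edges.png")
```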
-
Hi everyone,
I am fine-tuning the base model using the bash script provided here (`peft/finetune.sh`) with these parameters:
```
python3 finetune.py \
--model-name="google/timesfm-1.0-200m" \
…
```
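For reference, this is how the base checkpoint being fine-tuned here is loaded and queried according to the timesfm README; the exact signatures reflect my understanding of the v1.0 API, so treat them as assumptions:

```python
import numpy as np
import timesfm

# published hyperparameters for the 200m checkpoint
tfm = timesfm.TimesFm(
    context_len=512,
    horizon_len=128,
    input_patch_len=32,
    output_patch_len=128,
    num_layers=20,
    model_dims=1280,
    backend="cpu",
)
tfm.load_from_checkpoint(repo_id="google/timesfm-1.0-200m")

# forecast a toy sine wave; freq=0 marks high-frequency data
inputs = [np.sin(np.linspace(0, 20, 100))]
point_forecast, quantile_forecast = tfm.forecast(inputs, freq=[0])
```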
-
I was trying out the model with 439 characters and saw an average latency of 5-6 seconds on the LibriTTS dataset. Is there a way we can reduce the latency (the decoder takes the most time)?
Also, I finetuned the m…
-
I suppose this could be improved by offloading the work to the GPU with TornadoVM.