-
How do I perform batch inference with swift? It isn't mentioned anywhere in the docs, and I cannot find it in the code either.
-
Hi, is it possible to batch inference requests the way LLM servers do? For example, could I provide 10 transcripts and batch the requests to increase total throughput?
-
Hi!
I'm evaluating the model on a relatively large dataset (single question, single answer). I was able to fine-tune the Bunny-1.1-Llama-3-8B-V model using one of the scripts provided. What is the …
-
Hi authors,
I want to test the performance of Mistral7B on the test dataset. Is it only possible to do single-sample inference (with model.generate(...))? Are there any methods to accelerate t…
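Until a library ships native batch support, callers can often recover most of the throughput win by batching on their side: chunk the evaluation set, left-pad each chunk of token-id sequences to a common length, and pass the whole batch to the model in one call. A minimal, library-agnostic sketch of those two pieces (the helper names `batched` and `pad_batch` and the pad id are illustrative, not any framework's API):

```python
def batched(items, batch_size):
    """Yield successive chunks of `items`, each of length <= batch_size."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


def pad_batch(sequences, pad_id=0):
    """Left-pad variable-length token-id lists to the batch's max length.

    Left padding keeps each sequence's last real token at the final
    position, which is where decoder-only models continue generation.
    Returns the padded ids and a 0/1 attention mask of the same shape.
    """
    max_len = max(len(seq) for seq in sequences)
    padded = [[pad_id] * (max_len - len(seq)) + seq for seq in sequences]
    mask = [[0] * (max_len - len(seq)) + [1] * len(seq) for seq in sequences]
    return padded, mask


if __name__ == "__main__":
    prompts = [[5, 6], [1, 2, 3, 4], [9]]
    for chunk in batched(prompts, 2):
        ids, mask = pad_batch(chunk)
        # Each (ids, mask) pair is rectangular and ready to stack
        # into a single tensor for one forward/generate call.
        print(ids, mask)
```

The same shape of code works whether the actual call is a Hugging Face-style `model.generate`, an MLX loop, or a llama.cpp batch: the framework-specific part is only the final model call.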
-
Great job! Are there any tips on setting up bounding boxes to perform batch inference?
-
Would be nice to have batch inference support similar to [`mlx_parallm`](https://github.com/willccbb/mlx_parallm), happy to try and add soon. @Blaizzy can you assign this to me?
-
### System Info
x86-64
4 A10
0.9.0
### Who can help?
_No response_
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supported tas…
-
- [x] Use `llama_decode` instead of deprecated `llama_eval` in `Llama` class
- [ ] Implement batched inference support for `generate` and `create_completion` methods in `Llama` class
- [ ] Add suppo…
-
Hi there! Great work!
Is it possible to run batched inference?
Thanks!
-
I wonder, does this support batch inference?
I read the eval code, and it seems each run only evaluates a single video.