-
How do I perform batch inference with swift? I don't see it mentioned anywhere in the docs, and I cannot find it in the code either.
-
Hello, do you know how long, on average, the individual batch processing runs take, i.e., the descriptions? I'm able to generate the output successfully using llama locally, but it's taking a while, as it seems…
-
Hi!
I'm evaluating the model on a relatively large dataset (single question, single answer). I was able to fine-tune the Bunny-1.1-Llama-3-8B-V model using one of the scripts provided. What is the …
-
Hello, I encountered an issue when using the unsloth library for batch inference with the LLaMA 3.1 8B Instruct model. When there is a significant difference in input lengths, the output for the shorte…
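(For context: this symptom is commonly caused by right-padding; decoder-only models generally need left padding for batched generation so that new tokens continue directly from the prompt. Below is a minimal sketch using plain Hugging Face transformers rather than unsloth; the model id, prompts, and generation settings are illustrative placeholders.)
```python
# Minimal sketch of left-padded batched generation with Hugging Face
# transformers (not unsloth-specific); model id and prompts are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.padding_side = "left"  # key point: left-pad for decoder-only batching
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompts = [
    "Short prompt.",
    "A much longer prompt that forces the shorter one in the batch to be padded.",
]
batch = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)

# The attention mask tells the model to ignore the left-side pad tokens.
out = model.generate(**batch, max_new_tokens=64, pad_token_id=tokenizer.pad_token_id)
new_tokens = out[:, batch["input_ids"].shape[1]:]  # strip the prompt part
print(tokenizer.batch_decode(new_tokens, skip_special_tokens=True))
```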
-
While we support batched inference like other constrained decoding libraries, the current implementation can be parallelized further. In particular, we can mask logits in batch and run several `kbnf` …
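(A rough sketch of that idea: one engine per batch row computes its allowed token set and masks the corresponding logits row; since rows are independent, the loop itself can be parallelized across threads. The `allowed_token_ids()` accessor below is hypothetical, not kbnf's actual API.)
```python
# Illustrative batched logit masking for constrained decoding; the engine
# interface (allowed_token_ids) is hypothetical, not kbnf's actual API.
import torch

def mask_batch_logits(engines, logits):
    """Mask a (batch, vocab) logits tensor, one engine per sequence."""
    masked = torch.full_like(logits, float("-inf"))
    for i, engine in enumerate(engines):
        allowed = engine.allowed_token_ids()  # hypothetical per-row token set
        masked[i, allowed] = logits[i, allowed]
    return masked
```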
-
Hello, I'm trying to run the project locally using Docker on a 5-page PDF.
I basically ran:
```
$ git clone https://github.com/huridocs/pdf-document-layout-analysis
$ cd pdf-document-layout-an…
```
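(Once the container is up, one way to exercise it on a PDF is an HTTP request like the sketch below; the port (5060) and the multipart field name ("file") are assumptions based on the project's documented usage and should be checked against the repository README.)
```python
# Minimal sketch for calling the layout-analysis service; port and form
# field name are assumptions, verify against the project README.
import requests

with open("document.pdf", "rb") as f:
    resp = requests.post("http://localhost:5060", files={"file": f})

resp.raise_for_status()
print(resp.json())  # detected layout segments
```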
-
Hi, is it possible to do batch inference like LLMs do? For example, provide 10 transcripts and batch the requests to increase total throughput?
-
## Feature Request: Support for Batch Inference Processing Deployments with the Terraform Module
### Overview
It would be great if the module could allow for deployment of batch inference processing…
-
### This issue is for a: (mark with an `x`)
```
- [ ] bug report -> please search issues before submitting
- [x] feature request
- [x] documentation issue or request
- [ ] regression (a behavior …
```
-
Hello, thank you for Florence2 support!
How can we run Florence2 inference batched? If it supports batched training, can it also support batched inference?
Thank you :)
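(In case it helps, a minimal sketch of batched Florence2 inference through the standard Hugging Face checkpoints; the checkpoint name, task prompt, and image paths are placeholders, and the exact preprocessing may differ from this project's own pipeline.)
```python
# Minimal sketch of batched Florence2 inference via the Hugging Face
# checkpoints; checkpoint, task prompt, and image paths are placeholders.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-base"  # assumed checkpoint
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype=torch.float16
).to("cuda")

images = [Image.open("a.jpg"), Image.open("b.jpg")]
prompts = ["<CAPTION>"] * len(images)  # one task prompt per image

inputs = processor(text=prompts, images=images, return_tensors="pt").to(
    "cuda", torch.float16
)
ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=64,
)
print(processor.batch_decode(ids, skip_special_tokens=True))
```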