-
Batch inference doesn't seem to be working. Would you mind providing an example of batch inference for `model.predict`? It seems to work only with a batch size of 1.
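For reference, here is a minimal sketch of what batched prediction usually looks like with a Keras-style `model.predict`; the model architecture and input shapes below are placeholders, not this repo's actual interface:

```python
import numpy as np
import tensorflow as tf

# Hypothetical model; stands in for whatever model.predict is called on here.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Stack N preprocessed images into one array: shape (N, 224, 224, 3).
images = np.random.rand(16, 224, 224, 3).astype("float32")

# predict() accepts the whole batch; batch_size only controls the internal
# mini-batch size, so one call covers all 16 images.
probs = model.predict(images, batch_size=8)
print(probs.shape)  # (16, 10)
```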
-
Thanks for your interesting work and for sharing the code.
In the README, you only provide examples of how to generate captions for one image at a time (batch size = 1). Could you (@Yushi-Hu) expl…
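Not knowing this repo's exact API, here is a generic sketch of batched caption generation using Hugging Face's BLIP, just to illustrate the shape of a batch > 1 call; the checkpoint name and image paths are assumptions, not this project's code:

```python
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

# Placeholder paths; the processor pads and batches all images at once.
images = [Image.open(p).convert("RGB") for p in ["a.jpg", "b.jpg", "c.jpg"]]
inputs = processor(images=images, return_tensors="pt")

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=30)
captions = processor.batch_decode(out, skip_special_tokens=True)
```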
-
How do I run inference with multiple batches?
Is there a multi-batch inference example using the C API?
The docs are unclear and obscure: what is the meaning of the model input height, and how should `h_stride` be set? pro…
-
Hi,
I need to perform batch inference for my use-case. I followed this thread [here](https://github.com/facebookresearch/detectron2/issues/282#issuecomment-562386367) that extends the `DefaultPredi…
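For context, the usual workaround discussed in that thread is to bypass `DefaultPredictor` and call the underlying model on a list of per-image input dicts. A rough sketch, where the config path and `image_tensors` are placeholders:

```python
import torch
from detectron2.config import get_cfg
from detectron2.modeling import build_model
from detectron2.checkpoint import DetectionCheckpointer

cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")  # placeholder config
model = build_model(cfg)
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)
model.eval()

# detectron2 models take a list of per-image dicts, so a batch is just a
# longer list; each image tensor is CHW in the format given by cfg.INPUT.FORMAT.
inputs = [{"image": img, "height": img.shape[1], "width": img.shape[2]}
          for img in image_tensors]  # image_tensors: your preprocessed batch
with torch.no_grad():
    outputs = model(inputs)  # one result dict per input image
```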
-
Do you have code for batch-processing images? I want to use my own dataset for batch inference. Looking forward to your reply.
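In case it helps frame the question, batching a custom image dataset typically goes through a `DataLoader`. A minimal sketch, where the folder layout, transform, and `model` are assumptions rather than this repo's code:

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder("my_dataset/", transform=transform)  # placeholder path
loader = DataLoader(dataset, batch_size=32, num_workers=4)

model.eval()  # `model` stands in for whatever network is being queried
with torch.no_grad():
    for images, _ in loader:
        preds = model(images)  # forward pass over the whole batch
```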
-
Thanks for your hard work. I tried to conduct batch inference but encountered some errors. My code looks like:
prompts = tokenizer(test_dataset, return_tensors='pt', padding=True, truncation=True)
…
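For what it's worth, batched `generate` with a decoder-only Hugging Face model usually requires a pad token and left padding; a hedged sketch of a setup that commonly works, where the checkpoint name is a placeholder, not this project's model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
tokenizer.padding_side = "left"            # generate() expects left padding

prompts = tokenizer(["first prompt", "a second, longer prompt"],
                    return_tensors="pt", padding=True, truncation=True)
out = model.generate(**prompts, max_new_tokens=20,
                     pad_token_id=tokenizer.pad_token_id)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```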
-
Nice work! In the paper I saw batched results, but examples like https://github.com/Infini-AI-Lab/TriForce/blob/main/test/on_chip.py only use batch size = 1. Does the code support batched sp…
-
### System Info
GPU Name: NVIDIA A800
TensorRT-LLM: 0.10.0
Nvidia Driver: 535.129.03
OS: Ubuntu 22.04
triton-inference-server backend: tensorrtllm_backend
### Who can help?
_No response_
### I…
-
When I run multi-batch inference, it raises an error:
RuntimeError: output with shape [1, 32, 84, 128] doesn't match the broadcast shape [2, 32, 84, 128]
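This error pattern usually means a tensor preallocated for batch size 1 (often a KV cache or a position buffer) is receiving a batch-2 write. A minimal hypothetical repro of the same broadcast failure:

```python
import torch

# A cache allocated for batch size 1...
cache = torch.zeros(1, 32, 84, 128)
# ...cannot absorb a batch-2 tensor: copy_ tries to broadcast the source
# to the destination's shape and fails with the same RuntimeError.
new_state = torch.randn(2, 32, 84, 128)
cache.copy_(new_state)
# RuntimeError: output with shape [1, 32, 84, 128] doesn't match the
# broadcast shape [2, 32, 84, 128]
```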
-
I'm just confused that I can't find the cond and uncond quantization part you describe in your paper anywhere in the code. Could you give me a hand?