vlms Search Results - Githubissues

250 results for vlms

roboflow/notebooks #294

Potential bug in mAP computation of Florence-2 fine-tuning n…

### Search before asking - [X] I have searched the Roboflow Notebooks [issues](https://github.com/roboflow-ai/notebooks/issues) and found no similar bug report. ### Notebook name [Fine-tuning Flor…

patel-zeel updated 1 month ago
6
TRI-ML/vlm-evaluation #4

Evaluation hangs with accelerate over multiple gpus.

Thank you for the incredible set of repositories (this one and prismatic-vlms), it has been a great joy using them. Very well-designed, configurable, and easy to use for researchers. I'm running in…

tyleryzhu updated 4 months ago
3
jefferyZhan/Griffon #5

Any timeline of data release

I am very glad that someone has finally realized that keeping an image resolution of 224 or 336 is not enough to build strong VLMs for complicated vision tasks such as detection/counting 😄 Do you h…

HaisongDing updated 5 months ago
3
X-PLUG/mPLUG-DocOwl #44

Question about how to eval mPLUG-PaperOwl or other VLMs on M…

Hello, I would like to ask how to test on the M-Paper dataset? For example, for the task Multimodal Diagram Analysis, its input needs to be Context + Diagrams + Outline, and the question instructions, s…

sky-fly97 updated 5 months ago
1
scene-verse/SceneVerse #9

How to get multi-view images

Hi! Thanks for your great work! I am curious about how to get multi-view object images in the "Object Caption" step of your annotation pipeline. It seems that only a 3D point cloud and object bounding…

ZJHTerry18 updated 2 weeks ago
3
open-compass/VLMEvalKit #190

best to have docker image

1. all datasets are downloaded 2. all requirements are installed 3. all dependency repos are prepared

max-yue updated 2 days ago
2
stevenyangyj/Emma-Alfworld #4

Details about VLM baselines

Thanks for sharing your great work! I have a few questions about your work, especially regarding the baselines. 1. Did you fine-tune the VLMs reported in Table 1? I got confused because Section 3.…

lazyLuizi updated 3 months ago
1
huggingface/transformers #31096

Caching Past Key values of any length for Vision LLM's

### Feature request Allowing passing past key values during the forward pass of more than one token similar to the text large language models. ### Motivation According to the documentation [here](…

saikoneru updated 3 months ago
2
huggingface/transformers #32042

`AutoModel` class for `image-text-to-text` models

### Feature request It would be nice to get a standard `AutoModel` class for `image-text-to-text` models (since @molbap is standardizing the processor) ### Motivation @NielsRogge noticed tha…

merveenoyan updated 2 months ago
3
Gene-Weaver/VoucherVision #37

Document process for addition of new OCR engine / model

I can see that there are multiple issues of the form "add X as a new OCR engine": - #17 - #18 - #19 - #36 ... therefore would it be sensible to document the steps and / or rearchitect such that t…

nickynicolson updated 1 week ago
1
