-
In the [MLM form](https://github.com/wherobots/mlm-form) each model has a geospatial footprint. Right now what this footprint represents is fairly loosely defined, so I expect it to be confusing to use for search and di…
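To make the ambiguity concrete, here is a minimal sketch with hypothetical field names (not the form's actual schema) of the two common ways a footprint can be encoded, a full GeoJSON geometry versus a coarse bbox, which describe quite different things for search:

```python
# Hypothetical field names; the actual MLM schema may differ.
model_footprint = {
    # Region the model is valid for, as a GeoJSON Polygon (WGS84 lon/lat).
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-122.5, 37.5], [-122.5, 38.0],
            [-121.5, 38.0], [-121.5, 37.5],
            [-122.5, 37.5],  # ring closed back to the first vertex
        ]],
    },
    # Coarse bounding box: [west, south, east, north].
    "bbox": [-122.5, 37.5, -121.5, 38.0],
}
```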
-
I'm trying to use the `--segment`/libav feature in [this PR](https://github.com/raspberrypi/libcamera-apps/pull/537), but it doesn't seem to do anything. I built from the repo based on [these instruct…
-
Hello! Thank you for an amazing blog post. I have been trying to figure out how to fine-tune InstructPix2Pix, and it gave me a lot of insight!
In the README, the DATASET_ID and OUTPUT_DIR for the low-le…
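As a quick sanity check (the values below are placeholders, not the ones from the README), I verify that whatever I put in DATASET_ID actually resolves on the Hub before launching a long training run:

```python
import os
from datasets import load_dataset

DATASET_ID = "your-username/your-low-level-dataset"  # placeholder
OUTPUT_DIR = "./instruct-pix2pix-low-level"          # placeholder

# Confirm the output directory is writable and the dataset id loads.
os.makedirs(OUTPUT_DIR, exist_ok=True)
ds = load_dataset(DATASET_ID, split="train")
print(len(ds), ds.column_names)
```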
-
Hello,
I was wondering if there is a way to run inference on custom image(s) without ground-truth (GT) annotations. For example, if I wanted to generate a scene graph using your pre-trained models from a custom imag…
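To frame the question, this is roughly what I mean; `load_sgg_model` below is a hypothetical stand-in, since I could not find the repo's inference entry point:

```python
import torch
from PIL import Image
import torchvision.transforms as T

preprocess = T.Compose([T.Resize((600, 600)), T.ToTensor()])
image = preprocess(Image.open("custom.jpg").convert("RGB")).unsqueeze(0)

model = load_sgg_model("pretrained_checkpoint.pth")  # hypothetical loader
model.eval()
with torch.no_grad():
    # This is the part I'm unsure about: can the evaluation path run
    # with the ground-truth annotations simply absent?
    scene_graph = model(image, targets=None)
```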
-
Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model
paper page: https://huggingface.co/papers/2312.12423
The ability of large language models (LLMs)…
-
### 🐛 Describe the bug
I am trying to instruction-tune Qwen2.5-14B-Instruct with [Liger Kernel](https://github.com/linkedin/Liger-Kernel).
I know that the Liger Kernel is supported in the dev ve…
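For reference, this is roughly how I am applying it (a sketch based on my understanding of the Liger Kernel monkey-patching API; the exact entry points may differ between the release and dev builds):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from liger_kernel.transformers import apply_liger_kernel_to_qwen2

# The patch must run before the model is instantiated so the fused
# kernels replace the corresponding Hugging Face modules.
apply_liger_kernel_to_qwen2()

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-14B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-14B-Instruct")
```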
-
### Question
Hi,
I want to fine-tune the model specifically on the VQAv2 dataset and compare it to a zero-shot result. However, as I understand it, the LLaVA model has already been fine-tuned…
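For the zero-shot side of the comparison, this is what I had in mind (a sketch using the llava-hf checkpoint from the Hub; the prompt format is my reading of the model card):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# A VQAv2-style question against a local image.
prompt = "USER: <image>\nWhat color is the bus? ASSISTANT:"
image = Image.open("vqa_example.jpg")

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(output[0], skip_special_tokens=True))
```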
-
Hey BLIP-2 team,
Thanks for your great work! I've been trying to reproduce the BLIP-2 COCO ITM fine-tuning using the resources in your repo:
1. [train.py](https://github.com/salesforce/LAVIS/blob…
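For context, zero-shot ITM scoring with the released checkpoint works for me along these lines (adapted from the LAVIS examples; the `model_type` is my assumption, and the fine-tuning step is where I'm stuck):

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"
model, vis_processors, text_processors = load_model_and_preprocess(
    name="blip2_image_text_matching", model_type="coco", is_eval=True, device=device
)

raw_image = Image.open("coco_example.jpg").convert("RGB")
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
text = text_processors["eval"]("a person riding a bike down a street")

# The "itm" head returns two-way logits (no-match / match).
itm_logits = model({"image": image, "text_input": text}, match_head="itm")
itm_score = torch.nn.functional.softmax(itm_logits, dim=1)[:, 1].item()
print(f"ITM match probability: {itm_score:.3f}")
```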
-
Hi,
I'm trying to reproduce the results reported in "InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning", but I'm facing difficulty reproducing the InstructBLIP …
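For reference, I am loading the model through LAVIS like this (a sketch based on the repo's examples; using the Vicuna-7B variant is my assumption about which checkpoint the paper's numbers come from):

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_vicuna_instruct", model_type="vicuna7b", is_eval=True, device=device
)

raw_image = Image.open("example.jpg").convert("RGB")
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)

# Instruction-style prompt, as in the LAVIS InstructBLIP examples.
answer = model.generate({"image": image, "prompt": "What is unusual about this image?"})
print(answer)
```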
-
### Is your feature request related to a problem? Please describe
I'd like to reduce slow queries as much as possible. Often, slow queries correlate with slow I/O, but the kernel file cache isn't hel…
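To make the I/O correlation concrete, here is a small Linux-only sketch (the path is a placeholder) that times the same sequential read cold, after asking the kernel to drop the file's cached pages, and then warm:

```python
import os
import time

PATH = "/var/lib/db/segment.dat"  # placeholder data file

def timed_read(path: str) -> float:
    """Sequentially read the whole file in 1 MiB chunks and time it."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(1 << 20):
            pass
    return time.perf_counter() - start

# Cold read: advise the kernel to evict this file's cached pages first.
fd = os.open(PATH, os.O_RDONLY)
os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
os.close(fd)
print(f"cold: {timed_read(PATH):.3f}s")

# Warm read: the pages are now resident in the page cache.
print(f"warm: {timed_read(PATH):.3f}s")
```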