-
The configure line used in the current Ubuntu and Debian packages of ffmpeg-damnvid makes the binary unredistributable:
```
./configure --enable-memalign-hack --enable-libxvid --enable-libx264
--enable…
```
-
It could be useful to get bounding box coordinates from Document Information Extraction task predictions.
On a conventional pipeline:
![Screenshot from 2022-09-05 06-33-35](https://user-images.gi…
-
# Description
The idea of this project is to build a quantum image processing filter that illustrates the potential quantum computers have for processing large amounts of image data. We'll b…
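A common building block in quantum image processing is amplitude encoding, where pixel values are normalized into the amplitudes of a quantum state. As a hedged illustration (the project's actual encoding scheme is not stated here, and the function name is an assumption), a minimal classical sketch of that step:

```python
import numpy as np

def amplitude_encode(image):
    """Flatten an image and L2-normalize it so the pixel values can
    serve as amplitudes of a quantum state vector (sum of squares = 1).

    Hypothetical helper for illustration; not from the project itself.
    """
    flat = np.asarray(image, dtype=float).ravel()
    norm = np.linalg.norm(flat)
    if norm == 0:
        raise ValueError("image must contain at least one nonzero pixel")
    return flat / norm

# A 2x2 image becomes a 4-amplitude state; 4 amplitudes need 2 qubits.
state = amplitude_encode([[0, 3], [4, 0]])
# squared amplitudes sum to 1, as required of a quantum state
```

This exponential compression (2^n amplitudes on n qubits) is exactly why quantum filters are interesting for large image data.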
-
There are two functions in VALOR/optim/misc.py, one is build_optimizer and the other is build_optimizer_for_VQA. Is the second one specifically for the VQA task, while the first one is for other tasks…
-
**Problem Statement:**
Currently, the user has to model quantum workflows manually, which is very complex and error-prone, especially when including advanced techniques such as warm-starting or circ…
-
Hi! Thank you for your very helpful work!
When I run `pmc_vqa.sh`, `llava.eval.model_vqa_science.py` is missing.
Looking forward to your reply! Thank you!
-
```
File "lib/python3.9/site-packages/transformers/models/idefics2/modeling_idefics2.py", line 190, in forward
    position_ids[batch_idx][p_attn_mask.view(-1).cpu()] = pos_ids
RuntimeError: shape mis…
```
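This class of error can be reproduced independently of Idefics2: boolean-mask assignment in PyTorch raises a RuntimeError when the value tensor's length differs from the number of `True` positions in the mask, which is a plausible (assumed, not confirmed) cause of the shape mismatch above:

```python
import torch

# Minimal reproduction of the error class: assigning through a boolean
# mask with a value tensor that does not match the selected positions.
position_ids = torch.zeros(4, dtype=torch.long)
attn_mask = torch.tensor([True, False, True, False])  # selects 2 positions

error = None
try:
    # 3 values cannot be scattered into the 2 selected positions
    position_ids[attn_mask] = torch.arange(3)
except RuntimeError as e:
    error = e  # PyTorch reports a shape mismatch here

print(error)
```

In the real traceback this would mean `pos_ids` has a different number of elements than `p_attn_mask` has `True` entries for that batch index.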
-
Hi there,
I'm trying to set up the public datasets for evaluation listed in Table 9, but got different train/test sizes for some datasets:
1. Facial Emotion Recognition 2013
Dataset I found on [Kaggl…
-
Hello, I load the pre-trained llava-llama3 SFT weights and fine-tune with LoRA, but I get an error when merging the weights:
**scripts:**
Training:
```
deepspeed --master_port=$((RANDOM + 10000)) --inclu…
```
-
In ./2.6/ppocr/postprocess/vqa_token_ser_layoutlm_postprocess.py:
```
def _infer(self, preds, segment_offset_ids, ocr_infos):
    results = []
    for pred, segment_offset_id, ocr_info…
```