tokenpacker Search Results

CircleRadon/TokenPacker #11

Unable to reproduce benchmark results on LLaVA-v1.5-7b

Hi, thank you for your nice work. I am having trouble replicating the benchmark results, as follows: - "paper-Tokenpacker" is the paper's result - "open-checkpoint" uses the checkpoint of…

ZQ1102118381 updated 1 week ago

CircleRadon/TokenPacker #10

load model issue (about vision tower)

In sunshine-lwt/TokenPacker-7b-144token, using openai/clip-vit-large-patch14-336, the following appears when I load the model: ![image](https://github.com/user-attachments/assets/c2bcf2fe-5745-44c7-910a-2713cc7e4a53) …

Eric-is-good updated 3 weeks ago

swordlidev/Efficient-Multimodal-LLMs-Survey #4

Excellent survey! Adding a new work.

Thanks for the excellent survey! Would you like to include a new work: TokenPacker: Efficient Visual Projector for Multimodal LLM. Paper: https://arxiv.org/abs/2407.02392 Code: https://github.com/…

LiWentomng updated 1 month ago

CircleRadon/TokenPacker #12

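Failed to reproduce the HD model

I am trying to reproduce TokenPacker-HD-7b-9patch-144token on the Mini-Gemini pretraining and instruction fine-tuning datasets, but I have not been able to get satisfactory results. The loss curve I obtained in the pretrain stage is as follows: ![image](https://github.com/user-attachments/assets/9ccb2333-ccb3-4edd-94c7-52b97c148d7c) inst…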

yangte3518880 updated 1 week ago

CircleRadon/TokenPacker #9

really like mini-gemini

Birdylx updated 1 month ago

CircleRadon/TokenPacker #3

report bugs

https://github.com/CircleRadon/TokenPacker/blob/305ce146ec8b6d8b5ec4959f6cac699e7c8b9ed4/llava/train/train.py#L320 There is a comment on Line 321 which makes the indented block wrong. Maybe there s…

Gaffey updated 1 month ago

CircleRadon/TokenPacker #6

Thanks for your great work! I’m quite curious about the performance comparison between TokenPacker and Average Pooling, because in my experience, the Pooling method converges faster and achieves bet…

johncaged updated 1 month ago

CircleRadon/TokenPacker #8

Comparison with TextHawk

Hello, I’m the author of TextHawk. I have noticed that TokenPacker shares the same starting points as TextHawk, including token compression and multi-level features. It's great to see more people gett…

yuyq96 updated 1 month ago

Yangyi-Chen/SOLO #3

[question] any plan to add multi images reasoning? or even m…

Hi authors of this great project! Fuyu-8 is great for its flexibility to accept any aspect ratio or resolution, which is great for UI understanding, but we don't know how it was trained. But now we have SOLO! …

eisneim updated 1 month ago

CircleRadon/TokenPacker #1

Training dataset

Hi authors, were the experimental results reported in Table 1 obtained by training on miniGemini's training sample data? Or did you only use CC3M and the 656K SFT data, consistent with LLaVA-1.5? Thanks.

Yxxxb updated 2 months ago

10 results for tokenpacker