Closed SeanCraven314 closed 2 months ago
Thanks for your commit. I added the lines you changed. I think line 282 is redundant, and I still encounter the keyword tensor shape error.

```
['./demo_images/av.png']
Loading checkpoint shards: 100%|██████████| 4/4 [03:26<00:00, 51.72s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:128001 for open-end generation.
input: --conv-mode is vicuna_v1, using vicuna_v1
torch.Size([1, 3, 384, 384])
Traceback (most recent call last):
  File "/home/deping.zhang/code/llm/VILA/run_vila.py", line 153, in
```
Hi, thanks for your great work.
As in issue #39, I encountered the same error: a small tensor-dimension mismatch. I added some logic to perform broadcasting, which solved the issue for me.
I haven't spent much time on this, and it hasn't been tested against all the model-weight permutations. I'm happy to do that if needed!
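For context, the kind of fix I mean is ordinary NumPy-style broadcasting (PyTorch follows the same rules). This is just a minimal sketch with hypothetical shapes, not the actual change in my branch:

```python
import numpy as np

# Hypothetical shapes: a small size-3 tensor meeting an image batch of
# shape (1, 3, 384, 384), as printed in the log above.
images = np.zeros((1, 3, 384, 384))
small = np.ones(3)

# Broadcasting aligns trailing dimensions, so a bare (3,) would collide
# with the width axis. Reshape so the size-3 axis lines up with the
# channel axis, then broadcasting replicates it over batch/height/width.
result = images + small.reshape(1, 3, 1, 1)
print(result.shape)  # (1, 3, 384, 384)
```

In the patch I do the equivalent reshape on the mismatched tensor before the op that was raising the shape error.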
Regards,
Sean