Open mrd opened 1 month ago
+1
Thanks for your question. We uploaded them here:
https://huggingface.co/lmms-lab/llava-onevision-projectors/tree/main
Great, it gets a little further. One quick question: what are the expected CUDA memory requirements for fine-tuning? I am trying the 7B model on 4x A100s with 40GB each and it runs out of memory. Should that be sufficient (meaning I should look at tweaking some parameters), or should I distribute across even more GPUs because it is simply not enough?
(Note: I am able to get the Qwen2-0.5B-based model up and fine-tuning with this setup - script attached below with example parameters filled out. It ran successfully on 1x A100 (using about 24GB in my case), and I also tried 2x and 4x; all ran within a few minutes with MAX_STEPS=25.)
Yes, I think you need at least 8x 80GB GPUs to run 7B fine-tuning.
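A rough back-of-the-envelope sketch supports this (assuming full fine-tuning with Adam in mixed precision: bf16 weights and gradients, plus fp32 master weights and two fp32 Adam moments; activations come on top of this, and the byte counts are my assumption, not from the repo):

```python
# Rough per-replica memory estimate for full fine-tuning of a 7B model
# with Adam in mixed precision. Activations are extra on top of this.
params = 7e9
bytes_per_param = 2 + 2 + 4 + 4 + 4  # bf16 weights + grads, fp32 master + Adam m + Adam v
total_gb = params * bytes_per_param / 1e9
print(f"~{total_gb:.0f} GB of training state")  # ~112 GB
```

Without optimizer-state sharding (e.g. DeepSpeed ZeRO), this state is replicated on every GPU, which would explain why 4x 40GB runs out of memory even though the aggregate capacity looks close.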
Can you guys help by uploading the pre-trained adapter file mm_projector.bin for the Llama-3.1-8B model? Or, if it's already available, could one of you share the path, please?
Note that the output of my fine-tuned 0.5B model is garbage and I haven't been able to find a solution yet. But perhaps this should now be rolled into #155, which seems to be a similar issue with fine-tuned models.
Hi, related to this issue, I'm having trouble loading the pretrained mm projector from here. The error occurs on line 108 of the LLaVA-NeXT/llava/model/llava_arch.py file:
mm_projector_weights = torch.load(pretrain_mm_mlp_adapter, map_location="cpu")
which triggered:
File "...python3.10/site-packages/torch/serialization.py", line 1246, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
I double-checked that I have the correct Python (3.10) and torch (2.1.2) versions in my environment. What could be the issue?
Would appreciate any help/pointers -- thanks!
I also hit an error when this line runs: KeyError: "filename 'storages' not found"
Issue resolved -- it turned out that the mm_projector.bin downloaded through hf transformers was just a Git LFS pointer file, and downloading the actual file resolved it.
I am attempting to run the finetune_onevision.sh script. I've gotten many things sorted out, but I am stumped by the --pretrain_mm_mlp_adapter argument. The default value provided in the script, after expanding the environment variables, is
./checkpoints/projectors/llavanext-openai_clip-vit-large-patch14-336-Qwen_Qwen2-7B-Instruct-mlp2x_gelu-pretrain_blip558k_plain/mm_projector.bin
I made sure that directory exists, but I do not know where to find mm_projector.bin for the newest LLaVA. I have found an issue and discussion regarding this parameter for the previous version of LLaVA, e.g. https://huggingface.co/liuhaotian/llava-v1.5-13b/blob/main/mm_projector.bin. I have also looked for some kind of extract_projector script, but that does not seem to exist. This seems rather important, and I cannot find any documentation about it at all, apart from the aforementioned GitHub issues for LLaVA 1.5, even after scouring the web with Google and DuckDuckGo.
I am currently attempting to use the mm_projector.bin downloaded from the link above, from the LLaVA 1.5 liuhaotian archive. Update: this has resulted in a series of size/shape mismatch errors (not surprisingly, really), e.g.
size mismatch for 0.weight: copying a param with shape torch.Size([5120, 1024]) from checkpoint, the shape in current model is torch.Size([3584, 1152]).
Please advise.