lyuchenyang / Macaw-LLM

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
Apache License 2.0

How to get the Whisper, CLIP, and LLaMA models used by Macaw? #20

Open chatsci opened 1 year ago

chatsci commented 1 year ago

I used the following code to get the pretrained models:

from transformers import CLIPModel, LlamaModel, WhisperForConditionalGeneration

# Download the pretrained CLIP, Whisper, and LLaMA checkpoints from the Hugging Face Hub
clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch16")
whisper_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-base")
llama7b_model = LlamaModel.from_pretrained("decapoda-research/llama-7b-hf")

# Save local copies under trained_models/ for Macaw-LLM to load
clip_model.save_pretrained('trained_models/clip_model/')
whisper_model.save_pretrained('trained_models/whisper_model/')
llama7b_model.save_pretrained('trained_models/llama7b_model/')

Is this correct?
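
For reference, a quick way to sanity-check the saved copies is to reload them from the local directories and print a few config fields. This is just a minimal sketch using the same paths as above; the printed fields are only for eyeballing that the expected checkpoints were written:

from transformers import CLIPModel, LlamaModel, WhisperForConditionalGeneration

# Reload the models from the local directories written by save_pretrained
clip_model = CLIPModel.from_pretrained('trained_models/clip_model/')
whisper_model = WhisperForConditionalGeneration.from_pretrained('trained_models/whisper_model/')
llama7b_model = LlamaModel.from_pretrained('trained_models/llama7b_model/')

# Eyeball a few config values (CLIP ViT-B/16 projection_dim is 512, Whisper base d_model is 512,
# LLaMA-7B hidden_size is 4096 with 32 attention heads)
print(clip_model.config.projection_dim)
print(whisper_model.config.d_model)
print(llama7b_model.config.hidden_size, llama7b_model.config.num_attention_heads)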

BinZhu-ece commented 1 year ago

I also want to know how to get the Whisper, CLIP, and LLaMA models used by Macaw.

BinZhu-ece commented 1 year ago


Hello, have you run through this code? I encountered the following error:

assert self.head_dim * num_heads == self.embed_dim, "embed_dim must be divisible by num_heads"
AssertionError: embed_dim must be divisible by num_heads
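
For context, that assertion lives in the attention layer and fires when the hidden size is not evenly divisible by the configured number of attention heads. A minimal sketch of the same check, using LLaMA-7B's usual values (hidden_size=4096, num_attention_heads=32) and an arbitrary mismatched head count for contrast:

# Simplified version of the divisibility check behind the AssertionError
def check_heads(embed_dim, num_heads):
    head_dim = embed_dim // num_heads
    assert head_dim * num_heads == embed_dim, "embed_dim must be divisible by num_heads"
    return head_dim

print(check_heads(4096, 32))   # LLaMA-7B: 4096 / 32 = 128, passes
check_heads(4096, 220)         # mismatched head count: raises the AssertionError above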

satvikgarg27 commented 9 months ago

Hi,


Have you managed to run Macaw-LLM?

Arbor334 commented 1 month ago

I have the same issue. I found that in the run_clm_lls.py file, attention_heads defaults to 220. How did you solve it, please?
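
If that default actually ends up in the LLaMA config, it would explain the assertion above: 220 does not divide LLaMA-7B's hidden size of 4096, while the checkpoint's own value of 32 does. As a sketch, one way to confirm the correct value is to read it straight from the saved checkpoint rather than relying on a hard-coded default (paths as in the snippet earlier in this thread):

from transformers import LlamaConfig

# Read hidden_size and num_attention_heads from the saved LLaMA-7B checkpoint
config = LlamaConfig.from_pretrained('trained_models/llama7b_model/')
print(config.hidden_size, config.num_attention_heads)  # expected: 4096 32

# 220 heads fail the divisibility requirement, 32 heads do not
print(4096 % 220)  # 136 -> not divisible, triggers the AssertionError
print(4096 % 32)   # 0   -> divisible, head_dim = 4096 // 32 = 128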

Cece1031 commented 2 weeks ago

Me too. Any updates?