For this input example I described above:
{ "instruction": "Is the woman already in the room?", "input": "", "output": "Yes ahe is already in the room", "image": null, "audio": null, "video": "7UPGT.mp4" },
To debug, I printed its model input (the data_item variable fed into Macaw) and saved it in the following file: https://drive.google.com/file/d/10kJvUA5zvs6PdejTfWu04e102A2ZL13C/view?usp=sharing
Do you think my input is correct, i.e., the same as what your model expects?
It seems I found the problem: I need to remove the pad_token_id and eos_token_id from data_item["input_ids"]. Thanks.
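For anyone hitting the same issue, this is roughly what the fix looks like (a minimal sketch; it assumes data_item["input_ids"] is a plain Python list and uses the same tokenizer to identify the pad/eos ids):

```python
# Sketch of the fix: drop trailing eos/pad token ids from the prompt
# before feeding it to the model (assumes input_ids is a Python list).
special_ids = {tokenizer.eos_token_id, tokenizer.pad_token_id}
input_ids = data_item["input_ids"]
while input_ids and input_ids[-1] in special_ids:
    input_ids.pop()
data_item["input_ids"] = input_ids
```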
Hello, I'm also trying to load a pre-trained model, but I couldn't find any relevant information about this in the README file. Could you please share your experience or provide guidance on how to load the pre-trained model? Thanks a lot!
Hello, I tried to load the pre-trained model you provided and run the following example from the AVSD data:
Basically, to prepare the Whisper, CLIP, and LLaMA models, I used the following:
if name == "main": clip_config = CLIPConfig.from_pretrained('pretrained_models/clip_model/') whisper_config = WhisperConfig.from_pretrained('pretrained_models/whisper_model/') llm_config = AutoConfig.from_pretrained('pretrained_models/llama7b_model/') tokenizer = get_tokenizer("pretrained_models/macaw/", tokenizer_cls=LlamaTokenizer) llm_config.vocab_size = len(tokenizer) print("llm_config: ", llm_config)
I ran the model with:
Then I tested the above AVSD example. What I get is:
```
input_texts: ['Below is an instruction that describes a task, with or without input. Write a response that appropriately completes the request.\n\n### Instruction:\nIs the woman already in the room?\n\n### Response:\n\n']
generated_texts: ['\n\n']
```
As you can see, the output is nonsense. I tried some other examples, and also pure text input, but the results are not satisfying either. May I ask what might be wrong?
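One thing I still plan to check, given the earlier comment in this thread about pad/eos tokens, is whether my tokenized prompt already ends with an EOS or PAD id, which could explain the empty generation. A rough sketch of that check (assuming the same tokenizer and a tokenized prompt called input_ids):

```python
# Rough debug check: if the prompt ids already end with EOS/PAD,
# the model may stop generating immediately.
print("last prompt ids:", list(input_ids)[-5:])
print("eos_token_id:", tokenizer.eos_token_id,
      "pad_token_id:", tokenizer.pad_token_id)
```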