Closed: tlc4418 closed this issue 2 months ago.
It might be a problem with the `config.json` file. I just deleted it on HuggingFace. Would you like to delete the `config.json` file and try again?
I released two versions of the final checkpoints. The Google Drive version does not have a `config.json` file, while the HuggingFace version has one. I did not receive any issues with the GDrive version of the checkpoint, so it's probable that deleting `config.json` solves the problem.
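If you have already downloaded the HuggingFace checkpoint locally, deleting the stale `config.json` is just a file removal. A minimal sketch (the directory below is simulated for illustration; point `ckpt_dir` at your actual checkpoint path):

```python
import os
import tempfile

# Simulated checkpoint directory for illustration only; replace ckpt_dir
# with the real path of your downloaded HuggingFace checkpoint.
ckpt_dir = tempfile.mkdtemp()
open(os.path.join(ckpt_dir, "config.json"), "w").close()

# Remove the stale config.json so the loader does not pick it up.
cfg = os.path.join(ckpt_dir, "config.json")
if os.path.exists(cfg):
    os.remove(cfg)

print(os.path.exists(cfg))  # False
```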
By the way, we don't support the HuggingFace `.from_pretrained()` method. Please use `trainer.load()` to load the model, as defined in this file.
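As a rough sketch of the difference (the `Trainer` class and its argument below are stand-ins invented for illustration; the real class and signature are defined in the loading file referenced above):

```python
# Hypothetical minimal stand-in for the repo's trainer. The real
# trainer.load() restores the agent's weights from a save path; this
# stub only records the path to illustrate the calling pattern.
class Trainer:
    def __init__(self):
        self.loaded_from = None

    def load(self, save_path):
        # Real implementation would restore model weights from save_path.
        self.loaded_from = save_path

# Load via the trainer instead of calling AutoModel.from_pretrained().
trainer = Trainer()
trainer.load("ckpts/general-off2on-digirl")
print(trainer.loaded_from)
```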
Hi @BiEchi, thanks for your inspiring work. Currently I am confused about how to reproduce your results. When I evaluate with the AutoUI-base model, it goes well. But when I downloaded the `JackBAI/general-off2on-digirl` checkpoint from HuggingFace, changed the "policy model" in `scripts/config/main/default.yaml` to its path, and copied the tokenizing and generating configs from AutoUI without other adjustments, I got a dimension mismatch when loading the model, as mentioned in issue #11. Doing as suggested in that issue and adding the information to the `config.json` fixed the size-mismatch problem and I managed to load the model, but I got untranslatable actions, as issue #14 and this issue #16 mention. And if I remove the `config.json` as you say, I get a "does not appear to have a file named config.json" error from the `T5ForMultimodalGeneration.from_pretrained` function.
For us to reproduce the results, could you detail the steps to change the settings and resolve all these weird issues (like what to do with the config, and which loading method to use)? That would be really helpful. Thanks!
@StarWalkin Thanks for following up on this. Does the error only occur if you use the HuggingFace checkpoint, or does it still occur when you use the checkpoint from the Google Drive link?
@StarWalkin Also, the policy model in the `default.yaml` config should still be AutoUI instead of the path to the checkpoint. The path of the checkpoint should be specified in the `save_path` item in `eval_only.yaml`.
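In other words, the split looks roughly like this (the key names and values below are assumptions based on this thread; check the actual files in `scripts/config/main/` for the real keys):

```yaml
# scripts/config/main/default.yaml -- the policy model stays the AutoUI base
policy_lm: "cooelf/Auto-UI-Base"   # assumed key/value; do NOT point this at the checkpoint

# scripts/config/main/eval_only.yaml -- the checkpoint path goes here instead
save_path: "/path/to/unzipped/ckpts"   # assumed key; directory of the downloaded checkpoint
```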
Please let me know whether it works, I'll add a patch to README if it works.
@BiEchi Thanks for your response. I think some of the people troubled by issue #11, issue #12, and issue #16 may have made the same mistake as me. Here is how I worked it out:
1. Download the `ckpts.zip` file from Google Drive and unzip it.
2. Set `save_path` in `eval_only.yaml` to the path of the checkpoint directory.
3. Go to the `scripts` directory and run `python run.py --config-path config/main --config-name eval_only`.
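The steps above can be sketched as shell commands (the paths are placeholders for wherever you unzipped the checkpoint; step 2 is a hand edit, shown as a comment):

```shell
# 1. Unzip the checkpoint downloaded from Google Drive (placeholder path)
unzip ckpts.zip -d ./ckpts

# 2. Edit scripts/config/main/eval_only.yaml so save_path points at ./ckpts
#    (done by hand in an editor)

# 3. Run evaluation from the scripts directory
cd scripts
python run.py --config-path config/main --config-name eval_only
```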
Currently it's running successfully. I haven't gotten the final success rate yet, but I think this is the right way to do it.
Thanks a lot for confirming! I would also suggest running the HuggingFace version after you get the success rates using the Google Drive version. I'll reply to every issue after your confirmation.
Just modified the README. I'll leave this issue open for one more week in case you've got any updates. @StarWalkin
Closing as the problem is now solved.
Hi, I had the same issue as @mousewu in issue #11, where there were dimension errors when loading this checkpoint.
Doing as suggested in that issue and adding the information to the `config.json` fixed this problem and I managed to load the model. However, I now get the following warnings about both unused and missing weights:
When I try to perform inference with this model, the output is gibberish, as pointed out in issue #14. I believe this nonsensical output might be linked to the warnings above, perhaps due to a mismatch between the model architecture defined in the code and the model checkpoint you provide on HuggingFace? Would you be able to shed some light on this and explain how best to load your final model checkpoint for inference?
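One way to see where such a mismatch lies is to diff the checkpoint's parameter names against the ones the code's architecture expects. A minimal illustration with made-up key names (with a real checkpoint, the two sets would come from `torch.load(ckpt_path).keys()` and `model.state_dict().keys()`):

```python
# Illustrative key sets only; the names below are invented. In practice:
#   model_keys = set(model.state_dict().keys())
#   ckpt_keys  = set(torch.load(ckpt_path, map_location="cpu").keys())
model_keys = {"shared.weight", "lm_head.weight", "img_projection.weight"}
ckpt_keys = {"shared.weight", "lm_head.weight", "extra_adapter.weight"}

missing = sorted(model_keys - ckpt_keys)     # weights the code expects but the ckpt lacks
unexpected = sorted(ckpt_keys - model_keys)  # weights in the ckpt the code never uses

print("missing:", missing)
print("unexpected:", unexpected)
```

A non-empty result on either side usually explains both the "missing/unused weights" warnings and the gibberish output, since the mismatched modules stay randomly initialized.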
Thank you