Closed: LiamTTT closed this issue 1 year ago
Hi, thanks for your interest. I'm not quite clear about what "resized token embedding" means. Could you point to the corresponding code with a link?
Oh, sorry. I mean: why is 'clip_embedding_tokens' or 'blip_embedding_tokens' different when the dataset changes?
https://github.com/Flash-321/ARLDM/blob/94725e13e7c790ec1025fd5d485771becc367f02/config.yaml#L32-L56 https://github.com/Flash-321/ARLDM/blob/94725e13e7c790ec1025fd5d485771becc367f02/main.py#L119-L121
Thanks for your comment! This is because we added some new tokens for the characters in these two datasets.
https://github.com/Flash-321/ARLDM/blob/34b30703a2caeeb2364bdfb161345027217785c6/config.yaml#L35
https://github.com/Flash-321/ARLDM/blob/34b30703a2caeeb2364bdfb161345027217785c6/config.yaml#L42
https://github.com/Flash-321/ARLDM/blob/34b30703a2caeeb2364bdfb161345027217785c6/datasets/flintstones.py#L36-L39
https://github.com/Flash-321/ARLDM/blob/34b30703a2caeeb2364bdfb161345027217785c6/datasets/pororo.py#L37-L40
As a result, the vocab size of the tokenizer changes, and we need to resize the token embeddings so that the embedding layer can still encode the sentences. The numbers can be obtained by printing `len(clip_tokenizer)` and `len(blip_tokenizer)` after adding those new tokens.
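The mechanism can be sketched in plain Python with toy stand-ins (this is not the actual ARLDM code; a Hugging Face `transformers` workflow would use `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`, and the character names below are only illustrative):

```python
# Toy sketch: adding new character tokens grows the tokenizer's vocab,
# so the embedding table must be resized to match len(tokenizer).

class ToyTokenizer:
    def __init__(self, vocab):
        self.vocab = {tok: i for i, tok in enumerate(vocab)}

    def add_tokens(self, new_tokens):
        for tok in new_tokens:
            if tok not in self.vocab:
                self.vocab[tok] = len(self.vocab)

    def __len__(self):
        return len(self.vocab)


class ToyEmbedding:
    """Stands in for nn.Embedding: one weight row per vocab entry."""
    def __init__(self, num_embeddings, dim):
        self.dim = dim
        self.weight = [[0.0] * dim for _ in range(num_embeddings)]

    def resize(self, new_num_embeddings):
        # Keep existing rows, append freshly initialized rows for new tokens.
        while len(self.weight) < new_num_embeddings:
            self.weight.append([0.0] * self.dim)


tokenizer = ToyTokenizer(["a", "boy", "runs"])
embedding = ToyEmbedding(len(tokenizer), dim=4)

# Character tokens in the style of the FlintstonesSV dataset (illustrative names).
tokenizer.add_tokens(["fred", "barney", "wilma", "betty", "pebbles", "dino", "slate"])
embedding.resize(len(tokenizer))

print(len(tokenizer))         # vocab size after adding tokens
print(len(embedding.weight))  # embedding rows now match the new vocab size
```

Without the resize step, any sentence containing one of the new tokens would index past the end of the embedding table.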
Got it! Thanks! This is fantastic work! Looking forward to more work from you.
Thanks! Feel free to open an issue if you have any further questions!
Hi!
I am reproducing this work, and I noticed that the token embedding is resized when training on the PororoSV or FlintstonesSV datasets. My question is:
BTW, thanks for open-sourcing the code! Looking forward to your reply :)