Closed by bimsarapathiraja 1 month ago
Hi,
Thank you for your interest in our work. Because of how the contrastive loss is calculated, having enough positive and negative pairs is a high priority. Among the changes you made to the config file, I suggest changing "tokens_per_iter = 2", which is too low for the loss function we designed. For the contrastive loss to be effective, the positive and negative sums need enough pairs to pass meaningful gradients. I will still investigate the issue in more detail and will provide a fix if it turns out to have a different cause.
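To give a rough sense of why a small tokens_per_iter starves the loss, here is a minimal sketch that only counts pairs. It is not the loss used in the repository. The batch layout is an assumption: each sampled token is taken to contribute `views_per_token` embeddings, with same-token embeddings forming positive pairs and different-token embeddings forming negative pairs. `count_pairs` and `views_per_token` are hypothetical names used for illustration.

```python
from itertools import combinations


def count_pairs(tokens_per_iter, views_per_token=2):
    """Count positive and negative pairs in one hypothetical batch.

    Assumption (not taken from the repository): every sampled token
    contributes `views_per_token` embeddings to the batch.
    """
    # One label per embedding; embeddings of the same token share a label.
    labels = [t for t in range(tokens_per_iter) for _ in range(views_per_token)]

    positives = 0
    negatives = 0
    for a, b in combinations(labels, 2):
        if a == b:
            positives += 1  # same token: positive pair
        else:
            negatives += 1  # different tokens: negative pair
    return positives, negatives


if __name__ == "__main__":
    for n in (2, 4, 8):
        pos, neg = count_pairs(n)
        print(f"tokens_per_iter={n}: {pos} positive pairs, {neg} negative pairs")
```

Under these assumptions, the number of negative pairs grows roughly quadratically with tokens_per_iter, so a value of 2 leaves only a handful of terms in the negative sum.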
Thanks, Yusuf
Hey, I ran with the exact same configs and still hit the same problem, so I installed all the libraries listed in your .yaml file at the exact same versions. Previously I had used one of my existing envs, which could run train.py without any problem except for the issue mentioned above.
I was able to successfully run the code without any problem using the env you have provided.
I ran the given code with the same accelerator config and base_config files, apart from some minor changes. I created a folder with 100 images and used it as the data directory.
The only config entries I modified are:
data dir, placeholder_token_count, tokens_per_iter and xformers
My base_config file is as follows:
I get the following error in the third iteration of the first epoch.
What could be the reason, and how can I avoid this error?