Zasder3 / train-CLIP

A PyTorch Lightning solution to training OpenAI's CLIP from scratch.

Dataset structure #19

Open tarunn2799 opened 3 years ago

tarunn2799 commented 3 years ago

Hi, I'm having a little trouble understanding the dataset structure I should follow in order to train with this package. Is it one parent folder, with one subfolder containing images and one subfolder containing their text files? If so, what should these subfolders be named?

rom1504 commented 3 years ago

https://github.com/Zasder3/train-CLIP#training-with-our-datamodule- Any folder name should work; the file names should be the same.

tarunn2799 commented 3 years ago

Hey, so all images and text files should be in one single folder?

rom1504 commented 3 years ago

No, any subfolder

tarunn2799 commented 3 years ago

Does this work: data/images/p1.jpg and data/text/p1.txt?

rom1504 commented 3 years ago

Yes
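
In other words, any two subfolders work as long as each image's file name (minus the extension) matches its caption file's name. Below is a minimal sketch, assuming that pairing rule (this is not the repo's own dataloader code), to sanity-check a dataset laid out as in the example above:

```python
from pathlib import Path

# Hypothetical layout, mirroring the example in this thread:
#   data/images/p1.jpg  <->  data/text/p1.txt
root = Path("data")
images = {p.stem: p for p in (root / "images").glob("*.jpg")}
texts = {p.stem: p for p in (root / "text").glob("*.txt")}

# Every image should have a caption file with the same stem, and vice versa.
matched = images.keys() & texts.keys()
print(f"{len(matched)} matched pairs")
print("images without captions:", sorted(images.keys() - texts.keys()))
print("captions without images:", sorted(texts.keys() - images.keys()))
```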


tarunn2799 commented 3 years ago

Hi, I prepared my dataset in that structure and ran the command below:

python train.py --model_name RN50 --folder /data/depop/data_org/clip/data/ --batch_size 512 --gpus 1

I'm getting an AssertionError from the cosine_annealing_warmup package on the line assert warmup_steps < first_cycle_steps.

What's happening here? Please help me out.

tarunn2799 commented 3 years ago

Okay, so in models/wrapper.py, is warmup_steps hardcoded to 2000? My dataset is currently much too small for num_training_steps to be bigger than 2000.
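
For context, here is a hedged back-of-the-envelope sketch of why the assertion trips on a small dataset (the exact computation in models/wrapper.py may differ): the scheduler's first cycle is sized from the total number of training steps, so a 2000-step warmup cannot fit inside a run with fewer than 2000 steps.

```python
# Hypothetical numbers for a small dataset; the exact formula in models/wrapper.py may differ.
num_samples = 1_000
batch_size = 512
max_epochs = 20

steps_per_epoch = max(1, num_samples // batch_size)   # 1
num_training_steps = steps_per_epoch * max_epochs     # 20
warmup_steps = 2000                                   # reportedly hard-coded in models/wrapper.py

# The scheduler's first cycle is derived from the total training steps, and the
# cosine_annealing_warmup package asserts warmup_steps < first_cycle_steps.
if warmup_steps >= num_training_steps:
    print(f"assert warmup_steps < first_cycle_steps would fail: "
          f"{warmup_steps} >= {num_training_steps}")
```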

singularity014 commented 2 years ago

Hi, does the .txt file here contain the text caption? Let's say I have to create my own pairs of images and captions; could you please tell me if my assumption below is correct?

So if I want to fine-tune the CLIP model on pairs of images and captions, this structure would work?

rom1504 commented 2 years ago

Yes. I'm surprised how much this is confusing people.

singularity014 commented 2 years ago

> Yes. I'm surprised how much this is confusing people.

Actually, creating a file per caption (or label) didn't make much sense to me, hence the question.

bk-201jk commented 2 years ago

@tarunn2799 Hi, I would like to know whether this problem has been solved.

> Okay, so in models/wrapper.py, is warmup_steps hardcoded to 2000? My dataset is currently much too small for num_training_steps to be bigger than 2000.

Thanks for your time.

iremonur commented 2 years ago

> @tarunn2799 Hi, I would like to know whether this problem has been solved.

Hi @bk-201jk, I faced the same issue and solved it thanks to @ymzhu19eee in issue #20.
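
For later readers: whatever the exact change discussed in issue #20, the assertion only requires the warmup to fit inside the first cycle. A hedged sketch of one sufficient sizing rule (illustrative names, not the repo's exact code):

```python
# Illustrative sizing rule; adapt to however models/wrapper.py builds its scheduler.
def pick_warmup_steps(num_training_steps: int, default: int = 2000) -> int:
    """Cap warmup at roughly 10% of training, and always below the total step count."""
    return min(default, max(1, num_training_steps // 10))

print(pick_warmup_steps(60))       # 6    -> safe for a tiny smoke test over many epochs
print(pick_warmup_steps(500_000))  # 2000 -> unchanged for large runs
```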

bk-201jk commented 2 years ago

@iremonur Thank you very much! I would like to know how many photos are in your dataset, and how you set up your directory structure. What is in the .txt file, or is its content the title? I would appreciate it if I could see a sample from your dataset!

iremonur commented 2 years ago

I'm planning to prepare a 100k dataset (image-text pairs) for fine-tuning, but first I wanted to see if the code would work by running it with only 3 image-text pairs. The folder structure is as follows:

train-CLIP/data/img/1.png
train-CLIP/data/caption/1.txt

And one of the texts: "There is a car on the road."
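
As a concrete illustration of that layout, here is a hedged sketch that generates a toy 3-pair dataset in the same structure (placeholder images and captions; Pillow is assumed to be installed):

```python
from pathlib import Path
from PIL import Image

root = Path("train-CLIP/data")
captions = {
    1: "There is a car on the road.",          # example caption from this thread
    2: "A placeholder caption for image 2.",   # made-up placeholder
    3: "A placeholder caption for image 3.",   # made-up placeholder
}

(root / "img").mkdir(parents=True, exist_ok=True)
(root / "caption").mkdir(parents=True, exist_ok=True)

for idx, caption in captions.items():
    # Solid-color dummy image; replace with real images for actual fine-tuning.
    Image.new("RGB", (224, 224), color=(idx * 40, 120, 200)).save(root / "img" / f"{idx}.png")
    (root / "caption" / f"{idx}.txt").write_text(caption)
```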

bk-201jk commented 2 years ago

@iremonur Thank you very much. If you can run the code with only 3 image-text pairs, please tell me. Thanks again!!