Open shijiangming1 opened 1 month ago
Thank you for your reply. I am very much looking forward to the official way of splitting the dataset into training and test.
Hello @shijiangming1,
Have a look at this. There you will find a proposed way of splitting the PatternCom dataset based on the examples. The idea is, for example, for the attribute shape and the category swimming pools, to use 80% of the oval, rectangular, etc. examples for training and the remaining 10% and 10% for validation and testing. This is closely related to the way the CIRR dataset (from computer vision) was split.
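To make the idea concrete, here is a minimal sketch of such a per-value 80/10/10 split. It is not the script from GDrive, only an illustration. The column names `attribute`, `category` and `value`, and the function name `split_rows`, are assumptions made for this example, so adapt them to the actual CSV headers.

```python
import random
from collections import defaultdict


def split_rows(rows, keys=("attribute", "category", "value"),
               fractions=(0.8, 0.1, 0.1), seed=0):
    """Split rows into train/val/test, separately inside each group.

    `rows` is a list of dicts (e.g. from csv.DictReader). The key names
    are hypothetical and must match the columns of your own CSV files.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group rows by (attribute, category, value), e.g.
    # ("shape", "swimming pools", "oval").
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row)

    splits = {"train": [], "val": [], "test": []}
    for key in sorted(groups):
        members = list(groups[key])
        rng.shuffle(members)
        n_train = round(len(members) * fractions[0])
        n_val = round(len(members) * fractions[1])
        splits["train"] += members[:n_train]
        splits["val"] += members[n_train:n_train + n_val]
        # Whatever is left goes to the test split.
        splits["test"] += members[n_train + n_val:]
    return splits


if __name__ == "__main__":
    demo = [
        {"attribute": "shape", "category": "swimming pools",
         "value": "oval", "image": f"oval_{i}.png"}
        for i in range(20)
    ]
    result = split_rows(demo)
    print({name: len(part) for name, part in result.items()})
```

Splitting inside each group, instead of over the whole CSV at once, keeps every attribute value represented in all three splits, as long as the group has enough examples.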
Moreover, if this split is not suitable for your specific application, please feel free to propose another one and let us know about it; we will add it here on our GitHub repo as well. I have also included a Python script in GDrive to split the CSVs; modify it as needed.
Hope this helped. Keep me posted and go on with the great work you're doing on our dataset :)
Hello @shijiangming1,
In the paper, the PatternCom dataset is used for zero-shot composed image retrieval, i.e. only for testing. Our method is training-free and built on top of pretrained VLMs, so no training split was needed.
However, the dataset can easily be used for training a new method, if that is what you have in mind. I will come back as soon as possible with a proposed official way of splitting the dataset into training and test sets.
Thanks a lot 🙏🏽