YueYANG1996 / LaBo

CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification
https://arxiv.org/abs/2211.11158

Custom Data train #3

Closed Sidd1609 closed 1 year ago

Sidd1609 commented 1 year ago

How can I use this model with a custom dataset? I have a few classes that I would like to train the model on, and I already have the concepts JSON file, generated with GPT-3.

YueYANG1996 commented 1 year ago

Yes, that's possible. Here is what you need to do:

  1. Put your data in the datasets/ folder with the following structure:

    /datasets/{Name of your dataset}/
    - concepts/
    - splits/
    - images/

    You can refer to the other datasets as templates.

  2. Create the config files in the folder /cfg/asso_opt/{Name of your dataset}/

    {Name of your dataset}_base.py
    {Name of your dataset}_{number of shots}_fac.py

    You can follow the template here: https://github.com/YueYANG1996/LaBo/tree/main/cfg/asso_opt/food
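The layout from the two steps above can be sketched as a small helper. This is only an illustration: `make_dataset_skeleton`, the dataset name `my_dataset`, and the shot label `1shot` are hypothetical, and the exact config file names should be checked against the food template linked above. The helper creates the folders and returns the config paths you would still need to write yourself.

```python
from pathlib import Path
import tempfile


def make_dataset_skeleton(root, dataset_name, shot_labels):
    """Create datasets/<name>/{concepts,splits,images} and cfg/asso_opt/<name>/.

    Returns the dataset directory and the config file paths that are expected
    to exist (the files themselves are NOT created here).
    """
    root = Path(root)
    dataset_dir = root / "datasets" / dataset_name
    for sub in ("concepts", "splits", "images"):
        (dataset_dir / sub).mkdir(parents=True, exist_ok=True)

    cfg_dir = root / "cfg" / "asso_opt" / dataset_name
    cfg_dir.mkdir(parents=True, exist_ok=True)

    # Assumed naming: <name>_base.py plus one <name>_<shots>_fac.py per setting.
    config_names = [f"{dataset_name}_base.py"]
    config_names += [f"{dataset_name}_{label}_fac.py" for label in shot_labels]
    return dataset_dir, [cfg_dir / name for name in config_names]


# Demo in a temporary directory so nothing is written into a real checkout.
with tempfile.TemporaryDirectory() as tmp:
    dataset_dir, configs = make_dataset_skeleton(tmp, "my_dataset", ["1shot"])
    print(sorted(p.name for p in dataset_dir.iterdir()))
    print([p.name for p in configs])
```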

Sidd1609 commented 1 year ago

@YueYANG1996 thank you so much. As a follow-up, I would also like to modify the dataloaders for my dataset and change some layers in the bottleneck model. How would I go about that?

YueYANG1996 commented 1 year ago

You can modify the dataloader here: https://github.com/YueYANG1996/LaBo/blob/main/data.py

The model is here: https://github.com/YueYANG1996/LaBo/blob/main/models/asso_opt/asso_opt.py

Sidd1609 commented 1 year ago

@YueYANG1996 thank you so much. One final question: is it possible to control the dimensions of the class-concept weight matrix?

YueYANG1996 commented 1 year ago

Yes, we define the dimensions of the matrix here: https://github.com/YueYANG1996/LaBo/blob/cd0d84bcca1ebc55ea0ee58a056e9c9e3b0cf380/models/asso_opt/asso_opt.py#L12 The shape is (number of classes, number of concepts).
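Because the shape is (number of classes, number of concepts), changing the matrix dimensions comes down to changing how many classes or how many concepts are used. A minimal sketch of that relationship, written in plain Python rather than the repository's own code: `init_association_matrix` and `class_scores` are hypothetical names, and the random initialisation is an assumption made only so the example runs.

```python
import random


def init_association_matrix(num_classes, num_concepts, seed=0):
    """Build a (num_classes x num_concepts) matrix of small random weights."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(num_concepts)]
            for _ in range(num_classes)]


def class_scores(concept_scores, weights):
    """Map one image's concept scores (length num_concepts) to class scores."""
    for row in weights:
        if len(row) != len(concept_scores):
            raise ValueError("concept_scores length must equal num_concepts")
    # Each class score is the weighted sum of that image's concept scores.
    return [sum(w * s for w, s in zip(row, concept_scores)) for row in weights]


weights = init_association_matrix(num_classes=3, num_concepts=5)
scores = class_scores([0.2, 0.1, 0.4, 0.0, 0.3], weights)
print(len(weights), len(weights[0]), len(scores))
```

Each row holds one class's weights over all concepts, so adding a concept adds a column and adding a class adds a row.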

Sidd1609 commented 1 year ago

Hi @YueYANG1996, one more question. I need the dataset to return an additional value, so I edited the __getitem__ function of DotProductDataset in data.py. Where will I receive this value in main.py?

Another question, regarding submodular optimization: is it possible to use the submodular optimization module as standalone code on a list of concepts that I already have?

YueYANG1996 commented 1 year ago

If you modify the __getitem__ function, you will receive the value in the training/validation loop: https://github.com/YueYANG1996/LaBo/blob/main/models/asso_opt/asso_opt.py#L97
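To illustrate how an extra return value travels from the dataset to the training step, here is a dependency-free sketch. It is not the repository's code: `DotProductDatasetSketch`, `iterate_batches`, and `training_step` are hypothetical stand-ins. `iterate_batches` imitates, in simplified form, how a PyTorch `DataLoader` groups each position of the returned tuple into a batch.

```python
class DotProductDatasetSketch:
    """Stand-in dataset whose __getitem__ returns one extra value per item."""

    def __init__(self, features, labels, extras):
        self.features = features
        self.labels = labels
        self.extras = extras  # the additional value you want to pass along

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # Original items would be (feature, label); the extra goes last.
        return self.features[idx], self.labels[idx], self.extras[idx]


def iterate_batches(dataset, batch_size):
    """Simplified loader: groups items field by field, like default collation."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        items = [dataset[i] for i in range(start, stop)]
        yield tuple(list(field) for field in zip(*items))


def training_step(batch):
    """The step must unpack one more element than before."""
    features, labels, extras = batch
    return len(features), extras


dataset = DotProductDatasetSketch([[0.1], [0.2], [0.3]], [0, 1, 0], ["a", "b", "c"])
for batch in iterate_batches(dataset, batch_size=2):
    print(training_step(batch))
```

The practical point is that every place that unpacks a batch (training, validation, and test steps) has to be updated to accept the extra element.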

The submodular function can be used separately. You can refer to this script, which contains the functions we used for submodular optimization: https://github.com/YueYANG1996/LaBo/blob/main/models/select_concept/select_algo.py

You just need to provide the required inputs listed here: https://github.com/YueYANG1996/LaBo/blob/main/models/select_concept/select_algo.py#L127 It will then return the selected concepts.
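For readers who want a feel for what standalone submodular selection over a concept list looks like, here is a generic greedy sketch. It does not reproduce the functions or signatures in select_algo.py. The name `greedy_select`, its arguments, and the scoring (a per-concept relevance term plus a facility-location style coverage term weighted by `alpha`) are all assumptions chosen for illustration.

```python
def greedy_select(relevance, similarity, k, alpha=1.0):
    """Greedily pick k concept indices.

    relevance[j]     : non-negative score of concept j on its own
    similarity[i][j] : non-negative similarity between concepts i and j
    The objective is sum of relevance over the selected set plus
    alpha * sum_i max_{j in selected} similarity[i][j].
    """
    n = len(relevance)
    selected = []
    best_cover = [0.0] * n  # how well each concept is covered so far
    for _ in range(min(k, n)):
        best_gain, best_j = None, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal coverage gain from adding concept j.
            cover_gain = sum(max(similarity[i][j] - best_cover[i], 0.0)
                             for i in range(n))
            gain = relevance[j] + alpha * cover_gain
            if best_gain is None or gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best_cover = [max(best_cover[i], similarity[i][best_j]) for i in range(n)]
    return selected


# Concepts 0 and 1 are near duplicates; concept 2 is different.
relevance = [1.0, 0.9, 0.5]
similarity = [
    [1.0, 0.95, 0.0],
    [0.95, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
print(greedy_select(relevance, similarity, k=2))
```

In this toy input the second pick is concept 2 rather than the higher-scoring near-duplicate concept 1, because the coverage term rewards concepts that add something new to the selected set.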

charchit7 commented 1 year ago

Hi @YueYANG1996, I am getting an error while installing from requirements.txt:

    ERROR: Invalid requirement: '_openmp_mutex=4.5=2_kmp_llvm' (from line 5 of requirements.txt)
    Hint: = is not a valid operator. Did you mean == ?

Can you please help?

YueYANG1996 commented 1 year ago

Use conda create --name <env> --file requirements.txt to create the environment.
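The error above arises because entries such as `_openmp_mutex=4.5=2_kmp_llvm` follow conda's `name=version=build` form, which pip does not parse. A small sketch for checking which kind of file you have before choosing the installer; `looks_like_conda_export` and its regular expression are hypothetical helpers, not part of pip, conda, or the repository.

```python
import re

# name=version=build, with no doubled '=' and no whitespace inside a field.
CONDA_EXPORT_LINE = re.compile(r"^[A-Za-z0-9_.\-]+=[^=\s]+=[^=\s]+$")


def looks_like_conda_export(lines):
    """Return True if every non-comment, non-blank line is name=version=build."""
    specs = [ln.strip() for ln in lines
             if ln.strip() and not ln.lstrip().startswith("#")]
    return bool(specs) and all(CONDA_EXPORT_LINE.match(s) for s in specs)


conda_style = ["# platform: linux-64", "_openmp_mutex=4.5=2_kmp_llvm"]
pip_style = ["numpy==1.23.0", "requests>=2.0"]
print(looks_like_conda_export(conda_style))
print(looks_like_conda_export(pip_style))
```

If the check reports a conda-style file, create the environment with conda as described above instead of running pip install -r.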

charchit7 commented 1 year ago

Thanks! @YueYANG1996