lisadunlap / LADS

Official Implementation of LADS (Latent Augmentation using Domain descriptionS)

base.yaml is missing from the mentioned path #15

Closed dacian7 closed 1 year ago

dacian7 commented 1 year ago

To run the baseline, the following command is suggested:

python main.py --config configs/Waterbirds/base.yaml

But base.yaml is missing from the dataset folders in configs/. Does base.yaml correspond to the LP baseline?

And could you please fix it? Thanks!

lisadunlap commented 1 year ago

You are correct that the base.yaml file corresponds to the MLP baseline. I changed the naming structure, so it is now configs/Waterbirds/mlp.yaml. I updated the docs to reflect this; let me know if you run into any more problems!
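
In other words, the baseline command is now:

python main.py --config configs/Waterbirds/mlp.yaml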

dacian7 commented 1 year ago

> You are correct that the base.yaml file corresponds to the MLP baseline. I changed the naming structure, so it is now configs/Waterbirds/mlp.yaml. I updated the docs to reflect this; let me know if you run into any more problems!

@lisadunlap Thanks for your reply!

I still have some questions about running the LADS model on DomainNet.

As you mentioned, the experiments are on a specific subset of DomainNet. If I want to run, for instance, "real -> sketch" in DomainNet, what should I do?

If I run 'python main.py --config configs/DomainNet/lads.yaml', why are there multiple TEXT_PROMPTS and NEUTRAL_TEXT_PROMPTS from multiple domains in lads.yaml? From my understanding, NEUTRAL_TEXT_PROMPTS (source) should be 'real' and TEXT_PROMPTS (target) should be 'sketch', is that correct?

And in data_helpers.py, we have:

```python
elif dataset_name == "DomainNet":
    ...
elif dataset_name == "DomainNetMini":
    ...
```

As you just mentioned, we will actually never set dataset_name == "DomainNet" in 'configs/DomainNet/lads.yaml', right?

Many thanks!

lisadunlap commented 1 year ago

Correct! DomainNet is the full dataset, where you can choose the source and target domain, and DomainNetMini is the split we use in the paper (embeddings are in the embeddings folder). I know it's a bit confusing; I will try to make that clearer in the README!
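
Roughly speaking, to run something like real -> sketch you would point the config at the full DomainNet dataset and set the source and target domains there. The snippet below is only a sketch of the idea; the key names are placeholders, so check the actual files under configs/DomainNet/ for the real schema:

```yaml
# Placeholder key names for illustration only -- see configs/DomainNet/ for the actual schema.
DATASET: DomainNet          # full dataset (all domains and classes)
SOURCE_DOMAIN: real         # domain the labeled training data comes from
TARGET_DOMAINS: [sketch]    # domain(s) to augment towards via the text prompts
```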

lisadunlap commented 1 year ago

As for NEUTRAL_TEXT_PROMPTS and TEXT_PROMPTS: since DomainNetMini has the source domain of sketch and target domains of clipart, painting, and real, our NEUTRAL_TEXT_PROMPTS is going to be sketch (I provided a few different prompts that mean sketch, but you could put just one; we simply average the text embeddings if there is more than one), and TEXT_PROMPTS should contain the other three domains.
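
Concretely, the DomainNetMini prompt section of configs/DomainNet/lads.yaml looks roughly like this (the exact prompt strings below are illustrative, not copied from the shipped config):

```yaml
# Illustrative prompt strings -- see configs/DomainNet/lads.yaml for the exact values.
NEUTRAL_TEXT_PROMPTS:        # source domain (sketch); multiple phrasings are averaged
  - "a sketch of a"
  - "a pencil drawing of a"
TEXT_PROMPTS:                # target domains: clipart, painting, real
  - "a clipart image of a"
  - "a painting of a"
  - "a photo of a"
```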

dacian7 commented 1 year ago

> As for NEUTRAL_TEXT_PROMPTS and TEXT_PROMPTS: since DomainNetMini has the source domain of sketch and target domains of clipart, painting, and real, our NEUTRAL_TEXT_PROMPTS is going to be sketch (I provided a few different prompts that mean sketch, but you could put just one; we simply average the text embeddings if there is more than one), and TEXT_PROMPTS should contain the other three domains.

@lisadunlap Thanks for your clarification! I just tried 'python main.py --config configs/DomainNet/lads.yaml'. While the domain consistency accuracy is 84.3, which is close to the 85.44 reported in the paper, the class consistency accuracy is only 52.3, much lower than the reported 87.26. May I ask for some help? I did not modify any hyperparameter settings.

Thanks!

dacian7 commented 1 year ago

I also ran 'python main.py --config configs/DomainNet/mlp.yaml', and the result is similar to LADS: the domain consistency accuracy is close to the reported value, while the class consistency accuracy is much lower.

lisadunlap commented 1 year ago

Ah yes, I changed the way we report class consistency to look at only the augmented embeddings for LADS, but it shouldn't change the MLP metric. I can look into it today; a bug could have crept in during the refactor. Is class consistency the only metric that's off?

dacian7 commented 1 year ago

> Ah yes, I changed the way we report class consistency to look at only the augmented embeddings for LADS, but it shouldn't change the MLP metric. I can look into it today; a bug could have crept in during the refactor. Is class consistency the only metric that's off?

@lisadunlap

Thanks for your reply!

Actually, the overall test accuracy is much lower (by more than 15%) on DomainNet, so I checked the class consistency accuracy and found it was also off. The same is true for the MLP baseline.

However, I just tried the CUB dataset and found the results were close to those reported in the paper.

Is it possible that there is something wrong with the DomainNet dataset? I downloaded it from here: http://ai.bu.edu/M3SDA/. Have you made any modifications to DomainNet?

Note that I did run the DomainNetMini experiment, but on line 53 of domain_net.py I changed 'domainnet_sentry_split' to 'domainnet', because 'domainnet_sentry_split' is not provided in the downloaded dataset. Does that make a difference?

Thanks!

lisadunlap commented 1 year ago

Ah, it seems I uploaded the wrong embedding file for DomainNet. I uploaded a new file and it seems to work on my end in terms of accuracy and class consistency; let me know if that solves your problem!

dacian7 commented 1 year ago

> Ah, it seems I uploaded the wrong embedding file for DomainNet. I uploaded a new file and it seems to work on my end in terms of accuracy and class consistency; let me know if that solves your problem!

@lisadunlap Thanks for your update. I do not think the problem is caused by the embedding file, as I generated my own DomainNet embeddings. I also just tried your updated embeddings, but I got an error (see the attached image).

It seems that your uploaded embeddings contain only part of the classes in DomainNet? As I mentioned earlier, I did run the DomainNetMini experiment, but on line 53 of domain_net.py I changed 'domainnet_sentry_split' to 'domainnet', because 'domainnet_sentry_split' is not provided in the downloaded dataset. Does that make a difference?

Error: [screenshot attached]

lisadunlap commented 1 year ago

Ohhhhhh I see, you probably have the wrong split then, let me upload the sentry_splits. I believe they got rid of some of the examples in the mini version. Is your domainnet on all the classes then?

dacian7 commented 1 year ago

> Ohhhhhh I see, you probably have the wrong split then, let me upload the sentry_splits. I believe they got rid of some of the examples in the mini version. Is your domainnet on all the classes then?

@lisadunlap Yes! My domainnet is on all classes. Then that is probably the issue. BTW, why did you get rid of some of the classes?

lisadunlap commented 1 year ago

Okay, then that's definitely it, sorry for the mixup! And yeah, the specific split we use is taken from another paper; they use the 40 most common classes and the 4 most common domains and get rid of a lot of the mislabels that were in the original dataset. I have been meaning to try the entire DomainNet though; we used the smaller split because it was faster and cleaner, but LADS should still work on the full dataset.

lisadunlap commented 1 year ago

Alrighty, I added the sentry splits and modified the DomainNet dataset so you won't get any path errors (hopefully).

lisadunlap commented 1 year ago

I'm going to close this but let me know if you run into any other issues!