Closed hvgazula closed 1 year ago
I think we can just remove this for now. These lines are hacks that were used to generate embeddings for gpt2-xl at the full-utterance level. I don't think we need them at the moment.
@VeritasJoker Closing this issue for the time being.
https://github.com/hassonlab/247-pickling/blob/b86d60a5441bf7581b420057b3f1ded6f4eaa051/scripts/tfsemb_download.py#L27-L28
These lists are used exclusively to download models for their particular class. So it is expected that gpt* models will fail with MLM. However, I know this was done to do some checks in `tfsemb_main.py`. I could also try to catch this in the download script itself, but I prefer keeping the two separate for clarity: looking at the download script, we immediately know which models from each class are being analyzed. The cross-talk (a causal model appearing in the masked-model list) is an analysis decision and thus should live separately.
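To illustrate the separation described above, here is a minimal sketch of class-specific model lists with a lookup that rejects unlisted models. The list contents and the helper name `model_class` are illustrative assumptions, not the repo's actual code:

```python
# Hypothetical sketch mirroring the idea in tfsemb_download.py:
# each list drives downloads for exactly one model class, so a causal
# model can never silently end up in the MLM pipeline.

CAUSAL_MODELS = ["gpt2", "gpt2-xl"]          # illustrative entries
MLM_MODELS = ["bert-base-uncased", "roberta-base"]

def model_class(model_name):
    """Return the download class for a model, or raise if it is unlisted."""
    if model_name in CAUSAL_MODELS:
        return "causal"
    if model_name in MLM_MODELS:
        return "mlm"
    raise ValueError(f"{model_name} is not registered for any model class")

print(model_class("gpt2-xl"))  # causal
```

With this layout, putting a causal model into `MLM_MODELS` is a visible, deliberate choice in the download script, which matches the point about keeping analysis decisions separate.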