dd1497 / llm-unmasking

Code for the paper "Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling", accepted at ACL 2024 Findings
MIT License

What do tapt_clm and config0 signify? #3

Open anhnh2002 opened 2 weeks ago

dd1497 commented 2 weeks ago

Hello, these will soon be removed from the repository; it still needs a proper cleanup of leftover parts. Essentially, these were side experiments that did not make it into the main version of the paper but remained in the code. "tapt_clm" refers to "task-adaptive pretraining" (TAPT) with causal language modeling, which in essence means running causal language modeling on the training set before fine-tuning the model on that same training set for a specific task. These experiments had too high a variance in the results, so we decided not to release them. We tried doing TAPT starting from different unmasking (unlocking) configurations, and the most common case was config0, where all the layers are masked (configuration 0000 from the paper).
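
To make the TAPT-CLM stage a bit more concrete, here is a minimal sketch using Hugging Face Transformers. The model name, texts, output path, and hyperparameters are placeholders for illustration, not the repository's actual settings:

```python
# Minimal TAPT-CLM sketch: causal language modeling on the task's training
# texts before fine-tuning on the same data. All names/paths are placeholders.
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"  # assumption: any decoder-only LM would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assumption: train_texts holds the raw sentences of the labeling task's
# training set, stripped of their labels.
train_texts = ["Barack Obama visited Zagreb .", "The meeting starts at noon ."]
dataset = Dataset.from_dict({"text": train_texts})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# mlm=False selects the causal (next-token prediction) objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tapt_clm_out", num_train_epochs=3),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
trainer.save_model("tapt_clm_out")  # starting point for task fine-tuning
```

The saved checkpoint would then replace the off-the-shelf weights as the starting point for the usual sequence-labeling fine-tuning on the same training set.

As for the configuration strings, here is a purely illustrative reading of them; the bit-to-layer-group mapping below is an assumption for exposition, and the actual encoding in the code may differ:

```python
# Illustrative assumption: one bit per group of decoder layers (here, quarters
# of the stack); "1" unmasks that group's attention (bidirectional), "0" keeps
# it causal, so "0000" (config0) leaves the whole model causally masked.
def parse_unmasking_config(config: str, num_layers: int) -> list[bool]:
    group_size = num_layers // len(config)
    flags = []
    for bit in config:
        flags.extend([bit == "1"] * group_size)
    return flags

print(parse_unmasking_config("0000", 12))  # all False: fully causal (config0)
print(parse_unmasking_config("0011", 12))  # top half of the layers unmasked
```

I hope this helps! :))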

anhnh2002 commented 2 weeks ago

Thanks for responding, @dd1497. Could you provide the exact steps to run the code to reproduce the results you reported in your paper?