Hi, I read the paper, but I see the repo contains no code or models yet.
In particular, I'd like to check whether your textual pretraining data filtered out sequences that appear in (or are similar to, e.g. by BLAST) any of the TAPE evaluation sets, as we did in ProteinBERT (https://github.com/nadavbra/protein_bert).
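For concreteness, here is a minimal sketch of the kind of filtering I have in mind, assuming NCBI BLAST+ is installed and the FASTA file names are only illustrative:

```python
# Sketch: drop pretraining sequences with a significant BLAST hit against the eval set.
# Assumes NCBI BLAST+ (makeblastdb, blastp) is on PATH; file names are hypothetical.
import subprocess

# Build a protein BLAST database from the evaluation sequences
subprocess.run(["makeblastdb", "-in", "tape_eval.fasta", "-dbtype", "prot"], check=True)

# Search every pretraining sequence against the eval database (tabular output, format 6)
subprocess.run([
    "blastp", "-query", "pretrain.fasta", "-db", "tape_eval.fasta",
    "-evalue", "1e-3", "-outfmt", "6", "-out", "hits.tsv",
], check=True)

# Collect IDs of pretraining sequences with any significant hit (qseqid is column 1)
contaminated = set()
with open("hits.tsv") as f:
    for line in f:
        contaminated.add(line.split("\t")[0])

# Write a filtered FASTA keeping only sequences with no eval-set hit
keep = True
with open("pretrain.fasta") as src, open("pretrain_filtered.fasta", "w") as dst:
    for line in src:
        if line.startswith(">"):
            seq_id = line[1:].split()[0]
            keep = seq_id not in contaminated
        if keep:
            dst.write(line)
```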
Hi @Amelie-Schreiber, our manuscript is currently under submission; we will release the code once it is accepted. In the meantime, you can check the latest version here.