This is interesting work. I saw that you use the pretrained AntiBERTy model to represent the protein sequence, and that the package is installed via pip, but I didn't see where it is used in the code. Could you give me any idea?
I ended up including AntiBERTy within the pre-trained weights of IgFold to ensure the correct AntiBERTy embeddings are fed into IgFold, so it doesn't show up explicitly in the code. You can see the model being called here: https://github.com/Graylab/IgFold/blob/c31ee2ece3d567c2efea07185850a263b628372b/igfold/model/IgFold.py#L141
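If you just want those internal AntiBERTy embeddings, you can reach them through IgFold's public interface. A rough sketch, using the `IgFoldRunner` API as described in the IgFold README (argument and attribute names are worth verifying against the installed release):

```python
# Sketch: extracting the bundled AntiBERTy embeddings via IgFold's
# public interface (names per the IgFold README; verify against your version).
from igfold import IgFoldRunner

igfold = IgFoldRunner()

# Heavy and light chain sequences (example antibody sequences).
sequences = {
    "H": "EVQLVQSGPEVKKPGTSVKVSCKASGFTFMSSAVQWVRQARGQRLEWIGWIVIGSGNTNYAQKFQERVTITRDMSTSTAYMELSSLRSEDTAVYYCAAPYCSSISCNDGFDIWGQGTMVTVS",
    "L": "DVVMTQTPFSLPVSLGDQASISCRSSQSLVHSNGNTYLHWYLQKPGQSPKLLIYKVSNRFSGVPDRFSGSGSGTDFTLKISRVEAEDLGVYFCSQSTHVPYTFGGGTKLEIK",
}

emb = igfold.embed(sequences=sequences)
print(emb.bert_embs.shape)  # AntiBERTy hidden states for the input chains
```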
In case you want to use AntiBERTy as a standalone, it is available on PyPI as you mentioned. The documentation needs more detail, but after loading the model it should behave like a standard HuggingFace BERT model. https://pypi.org/project/antiberty/
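For reference, a minimal sketch of standalone usage, based on the `AntiBERTyRunner` wrapper described in the PyPI README (treat the exact names as an assumption and check the current docs):

```python
# Minimal sketch of standalone AntiBERTy use via the AntiBERTyRunner
# wrapper from the PyPI package (check the package README for details).
from antiberty import AntiBERTyRunner

antiberty = AntiBERTyRunner()

sequences = [
    "EVQLVQSGPEVKKPGTSVKVSCKASGFTFMSSAVQWVRQARGQRLEWIG",
]
# Returns one embedding tensor per sequence, including special tokens.
embeddings = antiberty.embed(sequences)
print(embeddings[0].shape)
```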
Okay, I see. Thanks so much, very helpful answer!
Hello @jeffreyruffolo, I am trying to fine-tune AntiBERTy using the standalone package you mentioned here, but it lacks a tokenizer. Am I correct in assuming that I can create my own using resources/vocab.txt from the IgFold repo? The encoding IDs didn't change between IgFold and standalone AntiBERTy, did they?
Hello, yes, that is correct. AntiBERTy uses the BertTokenizer from HuggingFace with that vocab file.
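Roughly like this (a sketch; the vocab path follows your question above, and the per-residue spacing is an assumption, so verify the resulting IDs against AntiBERTy's own tokenization):

```python
# Sketch: building a BertTokenizer from the vocab file (path as in the
# question above; confirm token IDs match AntiBERTy's own encoding).
from transformers import BertTokenizer

tokenizer = BertTokenizer(vocab_file="resources/vocab.txt", do_lower_case=False)

# Amino-acid vocabularies are per-residue, so space-separate the sequence.
seq = "EVQLVESGGGLVQPGG"
encoded = tokenizer(" ".join(seq), return_tensors="pt")
print(encoded["input_ids"])
```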
Thank you for the quick response!