facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins

Two questions about ESM_MSA #73

Closed · zedzad closed 3 years ago

zedzad commented 3 years ago

Hi, thanks so much for sharing your great work! I have two questions and was wondering if you could answer both.

1. After downloading esm1b and esm-MSA, I had no problem loading esm1b, but I hit the error below when trying to load esm-MSA:

```
  in esm_msa1_t12_100M_UR50S
    return load_model_and_alphabet_hub("esm_msa1_t12_100M_UR50S")
  in load_model_and_alphabet_hub
    model_data = load_hub_workaround(url)
  in load_hub_workaround
    f"{torch.hub.get_dir()}/checkpoints/{fn}",
AttributeError: module 'torch.hub' has no attribute 'get_dir'
```
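(For context: `torch.hub.get_dir` was only added in PyTorch 1.6, so this error usually indicates an older torch install. Besides upgrading PyTorch, a minimal workaround sketch is to download the checkpoint yourself and load it from a local path; the URL below follows the repo's release naming and should be verified against the README.)

```python
import esm

# Sketch of a workaround, assuming PyTorch < 1.6 (no torch.hub.get_dir).
# Download the checkpoint manually first, e.g.:
#   wget https://dl.fbaipublicfiles.com/fair-esm/models/esm_msa1_t12_100M_UR50S.pt
# then load it from the local file instead of going through torch.hub:
model, alphabet = esm.pretrained.load_model_and_alphabet_local(
    "esm_msa1_t12_100M_UR50S.pt"
)
model = model.eval()  # inference mode
```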

2. Also, in the MSA Transformer paper, did you use 1024 sequences as input for the section below? If so, how did you manage GPU memory? In the contact prediction example you only use 64 sequences, which seems fine, but when increasing to 1024, is there any trick we should use? (See the subsampling sketch after the snippet below.)

```python
from typing import List, Tuple
import itertools

from Bio import SeqIO


def read_msa(filename: str, nseq: int) -> List[Tuple[str, str]]:
    """Reads the first nseq sequences from an MSA file, automatically removes insertions."""
    # remove_insertions is defined earlier in the contact prediction example.
    return [
        (record.description, remove_insertions(str(record.seq)))
        for record in itertools.islice(SeqIO.parse(filename, "fasta"), nseq)
    ]
```
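(On the memory question: the MSA Transformer's attention cost grows with both the number of MSA rows and the sequence length, so feeding all 1024 rows can exhaust GPU memory. One common mitigation is to subsample the MSA down to a row budget while always keeping the query sequence first. The helper below is an illustrative sketch, not part of the esm package.)

```python
import random
from typing import List, Tuple


def subsample_msa(
    msa: List[Tuple[str, str]], max_seqs: int, seed: int = 0
) -> List[Tuple[str, str]]:
    """Keep the query (first row) plus a random subset of the remaining rows.

    Illustrative helper, not from the repo: capping the number of rows fed to
    the model is one simple way to fit a large MSA into GPU memory.
    """
    if len(msa) <= max_seqs:
        return msa
    rng = random.Random(seed)
    rest = rng.sample(msa[1:], max_seqs - 1)  # random rows, query excluded
    return [msa[0]] + rest


# Usage sketch: read everything, then subsample to a size that fits in memory.
# msa = read_msa("example.a3m", nseq=1024)
# msa = subsample_msa(msa, max_seqs=256)
```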

Thanks so much