microsoft / evodiff

Generation of protein sequences and evolutionary alignments via discrete diffusion models
MIT License

Indexing bugs in TRRMSADataset #23

Closed by JonathanDZiegler 10 months ago

JonathanDZiegler commented 10 months ago

Great work putting together this project! I found two minor indexing errors in the `TRRMSADataset` class:

Line 256 in data.py: `sliced_msa = msa[:, slice_start: slice_start + self.max_seq_len]` --> `sliced_msa = msa[:, slice_start: slice_start + seq_len]`

Line 274 in data.py: `output = sliced_msa[:64]` --> `output = sliced_msa[:self.n_sequences]`
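
To make the two proposed changes concrete, here is a minimal standalone sketch using NumPy. It is not the code from data.py: `crop_msa` is a hypothetical helper, the toy array stands in for a real MSA, and the way `seq_len` is computed here (the number of columns actually available, capped at `max_seq_len`) is an assumption about what the surrounding code intends.

```python
import numpy as np


def crop_msa(msa, max_seq_len, n_sequences, slice_start=0):
    """Hypothetical helper: crop an MSA array of shape (depth, length)."""
    # Assumption: seq_len is the crop width actually available,
    # never larger than max_seq_len.
    seq_len = min(msa.shape[1] - slice_start, max_seq_len)
    # Proposed fix 1: slice by seq_len rather than self.max_seq_len.
    sliced_msa = msa[:, slice_start: slice_start + seq_len]
    # Proposed fix 2: keep n_sequences rows rather than a hard-coded 64.
    return sliced_msa[:n_sequences]


# Toy MSA: 100 sequences of length 30 (integer tokens).
msa = np.arange(100 * 30).reshape(100, 30)

fixed = crop_msa(msa, max_seq_len=20, n_sequences=32, slice_start=5)
hardcoded = msa[:, 5:25][:64]  # what a hard-coded depth of 64 would return

print(fixed.shape)      # (32, 20)
print(hardcoded.shape)  # (64, 20)
```

With `n_sequences=32`, the hard-coded slice returns twice as many rows as requested, which is the mismatch the second change addresses.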