single site mutation data

Hi @xrz14, sorry for the slow reply. I don't check this repo much anymore and for some reason I didn't get an email from GitHub telling me you'd raised an issue.

This may not be relevant for you anymore, but I would include the single mutation data. Note that if you primarily have single mutation data, we would expect the embeddings from the MSA Transformer to be the most useful encoding method. Encodings like Georgiev and onehot describe each position independently, so they won't have any chance of transferring information from one site to another. That being said, in my experience, the ML methods used in the MLDE package are no more effective than a naive recombinatorial approach (i.e., just make a variant with all your positive single-site mutations, and that one is probably going to work well).
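To make the naive recombination idea concrete, here is a rough sketch. It is not part of the MLDE package: the function name, the `"V3L"`-style mutation strings (wild-type residue, 1-indexed position, new residue), and the fitness numbers are all made up for illustration. It also assumes fitness is normalized so the parent sits at 1.0, and that when several substitutions at one site are beneficial you just keep the best one.

```python
def recombine_positive_singles(parent, single_fitness, parent_fitness=1.0):
    """Combine the best beneficial single-site mutation at each position.

    parent:         parent sequence as a string
    single_fitness: dict mapping mutation strings such as "V3L"
                    (wild-type residue, 1-indexed position, new residue)
                    to measured fitness (hypothetical format)
    parent_fitness: fitness of the parent; anything above it counts as positive
    """
    best = {}  # position -> (fitness, new residue)
    for mutation, fitness in single_fitness.items():
        wt, pos, new = mutation[0], int(mutation[1:-1]), mutation[-1]
        if parent[pos - 1] != wt:
            raise ValueError(f"{mutation} does not match parent residue at {pos}")
        # Keep only improvements over the parent, and only the best one per site.
        if fitness > parent_fitness and (pos not in best or fitness > best[pos][0]):
            best[pos] = (fitness, new)

    # Apply the chosen substitutions to the parent sequence.
    seq = list(parent)
    for pos, (_, new) in best.items():
        seq[pos - 1] = new

    chosen = [f"{parent[pos - 1]}{pos}{best[pos][1]}" for pos in sorted(best)]
    return "".join(seq), chosen


if __name__ == "__main__":
    # Toy data, not real measurements.
    singles = {"K2R": 1.4, "K2E": 0.6, "V3I": 1.2, "V3L": 1.8, "A5G": 0.9}
    variant, chosen = recombine_positive_singles("MKVLA", singles)
    print(variant, chosen)
```

This treats the mutations as if their effects were independent, which is exactly the assumption behind "probably going to work well", so the combined variant still has to be measured.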

fhalab / MLDE

single site mutation data #8