daenuprobst / molzip

The gzip classification method implemented for molecule classification.
MIT License
53 stars 10 forks

Extend this to proteins and/or small mol docking? #2

Open tanmoy7989 opened 1 year ago

tanmoy7989 commented 1 year ago

I have two proposals, though I'm not sure which of them are within the scope of this repo: 1) Compare with protein language models as well, in addition to small-molecule SMILES?

(or, if you want to keep it to small molecules) 2) Use the gzip representations in the input embeddings for ligand docking ML models, to see if they beat SOTA.

Tanmoy Sanyal, PhD Protein design scientist at Novo Nordisk Research Seattle
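
One possible reading of proposal 2, as a rough sketch only: a "gzip representation" could be a vector of normalized compression distances (NCD) from a molecule to a fixed set of reference SMILES, which a downstream docking model might then take as extra input features. The function names (`clen`, `ncd`, `gzip_embedding`), the space separator, and the choice of anchor molecules are all assumptions made for illustration; none of them come from molzip itself.

```python
import gzip


def clen(text: str) -> int:
    """Length in bytes of the gzip-compressed UTF-8 encoding of text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two strings."""
    ca, cb = clen(a), clen(b)
    # The space separator is an arbitrary choice for this sketch.
    cab = clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def gzip_embedding(smiles: str, anchors: list[str]) -> list[float]:
    """Hypothetical feature vector: one NCD value per anchor molecule."""
    return [ncd(smiles, ref) for ref in anchors]


# Anchor SMILES picked arbitrarily (ethanol, benzene, acetic acid).
anchors = ["CCO", "c1ccccc1", "CC(=O)O"]
vec = gzip_embedding("CC(=O)Oc1ccccc1C(=O)O", anchors)  # aspirin
print(vec)
```

The vector length equals the number of anchors, so how the anchors are chosen (and how many) would be the main design decision to test.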

daenuprobst commented 1 year ago

Hey Tanmoy,

I think both would be quite interesting. Maybe concatenating ligands (SMILES) and proteins (AA sequences) would be worth trying?

Cheers, Daniel
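
For anyone who wants to try the concatenation idea, here is a minimal sketch, assuming each ligand/protein pair is joined into a single string and classified by gzip-based normalized compression distance (NCD) plus k-nearest neighbours. The helper names (`pair_string`, `knn_predict`), the space separator, and the toy sequences and labels are made up for illustration; this is not the molzip API and the data are not real binding results.

```python
import gzip
from collections import Counter


def clen(text: str) -> int:
    """Length in bytes of the gzip-compressed UTF-8 encoding of text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two strings."""
    ca, cb = clen(a), clen(b)
    cab = clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def pair_string(smiles: str, aa_seq: str) -> str:
    """Join ligand SMILES and protein sequence (separator is an assumption)."""
    return smiles + " " + aa_seq


def knn_predict(query: str, train: list[tuple[str, str]], k: int = 1) -> str:
    """Majority label among the k training pairs closest to query by NCD."""
    ranked = sorted(train, key=lambda item: ncd(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up training pairs with hypothetical labels.
train = [
    (pair_string("CC(=O)Oc1ccccc1C(=O)O", "MKTAYIAKQRQISFVKSHFSRQ"), "binder"),
    (pair_string("CC(=O)Nc1ccc(O)cc1", "MKTAYIAKQRQISFVKSHFSRQ"), "binder"),
    (pair_string("CCCCCCCC", "GSHMLEDPVDAFKLGGSGS"), "non-binder"),
]
query = pair_string("OC(=O)c1ccccc1O", "MKTAYIAKQRQISFVKSHFSRQ")
print(knn_predict(query, train, k=1))
```

Because protein sequences are usually much longer than SMILES strings, the protein part would likely dominate the compressed length; weighting or compressing the two parts separately might be worth comparing.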