RosettaCommons / protein_generator

Joint sequence and structure generation with RoseTTAFold sequence space diffusion
https://huggingface.co/spaces/merle/PROTEIN_GENERATOR
MIT License

classifier guided diffusion #15

Open tonytu16 opened 1 year ago

tonytu16 commented 1 year ago

The paper states: "An advantage of our approach is that diffusion can be directly guided by function classifiers that operate in sequence space. We first sought to guide the network with the DeepGOPlus Gene Ontology (GO) classifier (ref. 36) to generate proteins with specific characteristics and functions. Although GO classification scores increased with guidance for nitrogen compound metabolic process (GO:0006807) and membrane (GO:0016020), we found the classifier had a high false positive rate often assigning high scores to native sequences outside the GO domain (Figure S10)."

Where is this classifier guidance implemented? I can't seem to find it in the code. If I want to try other classifiers with your code, where should I add the gradients? Thanks

0merle0 commented 1 year ago

The utils folder contains a potentials.py file where all of the potentials are located; you can incorporate your own classifier there. In the sample generation loop, the potential function is queried after the sequence is passed through the model. If you need more help with the implementation, let me know and I am happy to help!
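For anyone trying this, here is a minimal sketch of what a classifier-backed potential could look like. It is not taken from potentials.py: the class name `ClassifierPotential`, the method names `score` and `get_gradients`, and the assumption that a potential receives per-position sequence logits and returns a same-shaped gradient are all hypothetical, so check the actual interface in the repo before adapting it. A toy linear classifier with an analytic gradient stands in for a real model; with a PyTorch classifier you would normally obtain the gradient from autograd instead.

```python
import math


def softmax(row):
    """Numerically stable softmax over one position's logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


class ClassifierPotential:
    """Hypothetical potential wrapping a toy linear classifier.

    Assumption: the sampler hands the potential an L x A table of sequence
    logits and adds the returned L x A gradient back onto the sequence.
    """

    def __init__(self, weights, bias=0.0, scale=1.0):
        self.weights = weights  # L x A table; stands in for a real classifier
        self.bias = bias
        self.scale = scale      # guidance strength

    def _logit(self, seq_logits):
        probs = [softmax(row) for row in seq_logits]
        z = self.bias + sum(
            w * p
            for w_row, p_row in zip(self.weights, probs)
            for w, p in zip(w_row, p_row)
        )
        return z, probs

    def score(self, seq_logits):
        """Classifier probability p(y = 1 | sequence)."""
        z, _ = self._logit(seq_logits)
        return 1.0 / (1.0 + math.exp(-z))

    def get_gradients(self, seq_logits):
        """scale * d log p(y = 1 | sequence) / d seq_logits."""
        z, probs = self._logit(seq_logits)
        p_y = 1.0 / (1.0 + math.exp(-z))
        grads = []
        for w_row, p_row in zip(self.weights, probs):
            mean_w = sum(w * p for w, p in zip(w_row, p_row))
            # Chain rule through the sigmoid and the per-position softmax.
            grads.append(
                [self.scale * (1.0 - p_y) * p * (w - mean_w)
                 for w, p in zip(w_row, p_row)]
            )
        return grads


# Usage: one guidance step on a 2-position, 3-token toy sequence.
potential = ClassifierPotential(
    weights=[[1.0, -1.0, 0.0], [0.5, 0.0, -0.5]], scale=2.0
)
logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
grads = potential.get_gradients(logits)
guided = [[l + g for l, g in zip(l_row, g_row)]
          for l_row, g_row in zip(logits, grads)]
```

Adding the gradient to the logits moves the sequence toward what the classifier scores highly; `scale` controls how hard it pushes.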

pgmikhael commented 12 months ago

Hi,

Pseudo-code for sequence-level guidance (something akin to what's in the paper) would be helpful as well. Thank you!
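Until the authors post something official, a rough sketch of where guidance might sit in a sampling loop, based only on the description earlier in this thread (the model predicts the sequence, then the potential is queried). Every name here (`guided_sampling`, `toy_denoiser`, `favor_token_zero`) is hypothetical, the linear re-noising rule is a simplified stand-in rather than the schedule the repo uses, and the toy denoiser and potential exist only so the loop runs end to end.

```python
import random


def guided_sampling(denoise_fn, potential_grad_fn, seq_len, n_tokens,
                    n_steps=10, guide_scale=1.0, seed=0):
    """Toy reverse-diffusion loop over sequence logits with guidance."""
    rng = random.Random(seed)
    # Start from pure Gaussian noise in sequence space.
    x_t = [[rng.gauss(0.0, 1.0) for _ in range(n_tokens)]
           for _ in range(seq_len)]
    for t in range(n_steps, 0, -1):
        # 1. The model predicts the clean sequence from the noisy one.
        x0_pred = denoise_fn(x_t, t)
        # 2. The potential (e.g. a classifier) is queried for gradients.
        grads = potential_grad_fn(x0_pred)
        # 3. The prediction is nudged along the gradient.
        x0_guided = [[x + guide_scale * g for x, g in zip(x_row, g_row)]
                     for x_row, g_row in zip(x0_pred, grads)]
        # 4. Re-noise to step t - 1 (assumed linear schedule, zero at the end).
        noise_level = (t - 1) / n_steps
        x_t = [[(1.0 - noise_level) * x + noise_level * rng.gauss(0.0, 1.0)
                for x in row]
               for row in x0_guided]
    return x_t


def toy_denoiser(x_t, t):
    """Stand-in for the network: returns the input unchanged."""
    return [list(row) for row in x_t]


def favor_token_zero(x0_pred):
    """Stand-in potential: gradient of the summed logit of token 0."""
    return [[1.0] + [0.0] * (len(row) - 1) for row in x0_pred]


unguided = guided_sampling(toy_denoiser, favor_token_zero, 8, 20,
                           guide_scale=0.0)
guided = guided_sampling(toy_denoiser, favor_token_zero, 8, 20,
                         guide_scale=1.0)
```

Because both runs share a seed, they differ only through the guidance term, which makes it easy to see how `guide_scale` shifts the final logits toward the token the potential favors.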