generatebio / chroma

A generative model for programmable protein design
Apache License 2.0

Substructure Conditioner - Supplementary Appendix O #39

Closed jlotthammer closed 8 months ago

jlotthammer commented 8 months ago

Hi,

Fantastic work - I've had a lot of fun playing with different design principles enabled by Chroma.

While reviewing the Chroma paper, I saw that Supplementary Appendix O details a particular substructure conditioner that suggests a specific protocol/potential for inter-residue distance restraints. Is this conditioner available [I looked around, but didn't easily find it], or would it need to be derived from the higher-level substructure conditioner? If so, any code resources or advice in the right direction would be appreciated!

wujiewang commented 8 months ago

Hi! Thanks for the question.

That conditioner is a research concept included for completeness; we didn't include an implementation or benchmark it. It essentially just solves for the distribution of the distance between two residues given a noise level.

If you want to impose inter-residue distances, you may have some luck including a harmonic restraint in a custom Conditioner.
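As a rough illustration of what such a restraint computes, here is a minimal, framework-free sketch of a harmonic distance energy and its analytic gradient. The function name `harmonic_restraint`, the `(i, j, d0)` restraint format, and the spring constant `k` are hypothetical choices made for this example; none of it is Chroma code. Wiring the energy into a custom Conditioner (for instance, adding it to the conditioner's energy term and letting autograd supply the gradient) is an assumption and is not shown here.

```python
import math

def harmonic_restraint(coords, restraints, k=1.0):
    """Return (energy, gradient) for U = sum over pairs of k/2 * (|x_i - x_j| - d0)^2.

    coords:     list of [x, y, z] positions (e.g. one atom per residue)
    restraints: list of (i, j, d0) tuples; d0 is the target distance
    k:          spring constant (hypothetical default, tune per use case)
    """
    energy = 0.0
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for i, j, d0 in restraints:
        delta = [a - b for a, b in zip(coords[i], coords[j])]
        dist = math.sqrt(sum(c * c for c in delta))
        diff = dist - d0
        energy += 0.5 * k * diff * diff
        if dist > 1e-12:  # skip the gradient when the two points coincide
            scale = k * diff / dist
            for axis in range(3):
                grad[i][axis] += scale * delta[axis]
                grad[j][axis] -= scale * delta[axis]
    return energy, grad

# Two points 5 units apart, restrained to 4 units with k = 2.
energy, grad = harmonic_restraint([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]], [(0, 1, 4.0)], k=2.0)
print(energy)  # 1.0
```

Several restraints simply add up, so passing a longer `restraints` list corresponds to imposing multiple harmonics at once. Stepping along the negative gradient pulls each restrained pair toward its target distance.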

jlotthammer commented 8 months ago

Thanks so much for your quick response. Yeah, my naive approach has been to impose multiple harmonic restraints with a custom conditioner, but I stumbled upon that SI appendix O and figured I'd ask if this approach [multiple harmonics] was the right direction or if there was something more involved [SI?] that may be advantageous over harmonic restraints. It sounds like that is not known to be the case?

wujiewang commented 8 months ago

You are right. For the task of restraining pair distances, we don't know what works better in practice.

Appendix O attempted to address the research question of getting the sampler to approximately sample $p(x_0 \mid d^{ij} = d_{\text{target}})$ using the classifier guidance formulation, and it provides a non-parametric solution for the conditional scorer (with some assumptions). As you can see, the implementation might be a little involved.

As for the [multiple harmonics] approach, it will certainly bias the sampling, but we are not sure what distribution it samples from, and it certainly does not sample $p(x_0 \mid d^{ij} = d_{\text{target}})$.
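To spell out the contrast, here is a sketch in standard classifier-guidance notation. The symbols $x_t$ (the structure noised to time $t$), $k$ (a spring constant), and $d^{ij}(x)$ (the distance between residues $i$ and $j$ computed from $x$) are assumptions introduced for illustration and are not taken from Appendix O. Classifier guidance splits the conditional score as

$$\nabla_{x_t} \log p_t(x_t \mid d^{ij} = d_{\text{target}}) = \nabla_{x_t} \log p_t(x_t) + \nabla_{x_t} \log p_t(d^{ij} = d_{\text{target}} \mid x_t),$$

where the second term is the conditional scorer. A single harmonic restraint instead substitutes

$$-\nabla_{x_t} U(x_t), \qquad U(x_t) = \frac{k}{2}\left(d^{ij}(x_t) - d_{\text{target}}\right)^2$$

for that term, so at each noise level the sampler follows the score of a density proportional to $p_t(x_t)\, e^{-U(x_t)}$, which is in general not the noised marginal of $p(x_0 \mid d^{ij} = d_{\text{target}})$.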

wujiewang commented 8 months ago

Closing this due to inactivity; feel free to reopen.