generatebio / chroma

A generative model for programmable protein design
Apache License 2.0

Caption data #1

Closed (lhallee closed this issue 1 year ago)

lhallee commented 1 year ago

Hello,

Amazing work! I'm curious whether the captions used to train ProCap will be made available.

Best,
Logan

aismail3-gnr8 commented 1 year ago

Thanks! The captions used to train ProCap are taken from the descriptive text for PDB structures, as well as UniProt functional comments for individual chains. For more details, see Section T.2 of the Supplemental Information of our paper.