gnina / libmolgrid

Comprehensive library for fast, GPU accelerated molecular gridding for deep learning workflows
https://gnina.github.io/libmolgrid/
Apache License 2.0

Operating on pairs of inputs #75

Closed JoshuaMeyers closed 2 years ago

JoshuaMeyers commented 2 years ago

Hey guys, nice library! I've enjoyed using libmolgrid to train on protein-ligand binding affinities, but I'd now like to operate on pairs of input PDBs (where each pair is assigned a single label). Do you have any pointers on how one might achieve this with libmolgrid while still making use of the structure cache? I am working with PyTorch. Thanks in advance.

dkoes commented 2 years ago

You would have two labels and two structures in your input types file, like this:

0 1 in1.pdb in2.pdb
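
If the pairs are already held in Python, a types file in the format shown above could be written with something like the following. This is only a minimal sketch: `write_pair_types` is a hypothetical helper, not part of libmolgrid, and `pairs.types`, `in3.pdb` and `in4.pdb` are made-up names for illustration. It assumes each line is whitespace-separated, with the labels first and the two structure paths after them.

```python
import os
import tempfile
from pathlib import Path


def write_pair_types(path, pairs):
    """Write one line per pair: labels first, then the two structure paths.

    `pairs` is an iterable of (labels, first_structure, second_structure),
    where `labels` is a sequence of numbers. Hypothetical helper, not a
    libmolgrid function.
    """
    lines = []
    for labels, first, second in pairs:
        label_text = " ".join(str(value) for value in labels)
        lines.append(f"{label_text} {first} {second}")
    # One example per line, newline-terminated.
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        types_path = os.path.join(tmp, "pairs.types")  # assumed file name
        write_pair_types(
            types_path,
            [
                ((0, 1), "in1.pdb", "in2.pdb"),
                ((1, 0), "in3.pdb", "in4.pdb"),  # made-up second pair
            ],
        )
        print(Path(types_path).read_text())
```

Every line should carry the same number of labels. For a single label per pair, passing a one-element tuple such as `(1,)` would presumably give a line like `1 in1.pdb in2.pdb`, but that is an assumption worth checking against the paper's description of the format.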

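import molgrid

provider = molgrid.ExampleProvider()
provider.populate("in.types")  # each line: labels first, then structure files
ex = provider.next()           # next example read from the types file
list(ex.labels)                # [0.0, 1.0]
len(ex.coord_sets)             # 2, one coordinate set per structure

JoshuaMeyers commented 2 years ago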

Thanks! That's much easier than I thought. Is there a link to some documentation on the 'types' file?

dkoes commented 2 years ago

It's explained in the paper (https://arxiv.org/pdf/1912.04822.pdf), but you're right that the explanation should be in the online documentation.