choderalab / pinot

Probabilistic Inference for NOvel Therapeutics
MIT License

dataset representation #101

Open yuanqing-wang opened 4 years ago

yuanqing-wang commented 4 years ago

What features do we want for the Dataset object?

now we have the following functionalities

we should consider adding the following:

miretchin commented 4 years ago

Based on our discussion earlier, elaborating on the "different node representation input":

  1. Have the dataset object annotate its inputs with the style of graph representation used, i.e., different preprocessing steps yield different representations, exposed as something like dataset.smiles_representation or some other representation.
  2. Have datasets be typed according to the input representation they assume, with a flag that can be passed forward to the model (and, by extension, the models should be typed as well).
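
To make the two points above concrete, here is a minimal sketch of a typed dataset whose representation flag is checked by a typed model. Everything in it is hypothetical and not part of the pinot API: `Representation`, `TypedDataset`, `TypedModel`, and the toy preprocessing functions are names invented for illustration, and the "graph" preprocessing is a stand-in rather than real featurization.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, List, Tuple


class Representation(Enum):
    """Hypothetical tags for the input representation a dataset assumes."""
    SMILES = "smiles"
    GRAPH = "graph"


def _keep_smiles(smiles: str) -> str:
    # Identity preprocessing: the model consumes raw SMILES strings.
    return smiles


def _toy_graph(smiles: str) -> Dict[str, Any]:
    # Stand-in only: every alphabetic character becomes a "node".
    # This is NOT real chemistry; a real pipeline would build a molecular graph.
    nodes = [ch for ch in smiles if ch.isalpha()]
    return {"nodes": nodes, "n_nodes": len(nodes)}


# Each representation is tied to the preprocessing step that yields it.
PREPROCESSORS: Dict[Representation, Callable[[str], Any]] = {
    Representation.SMILES: _keep_smiles,
    Representation.GRAPH: _toy_graph,
}


@dataclass
class TypedDataset:
    representation: Representation  # the flag passed forward to the model
    inputs: List[Any]
    targets: List[float]

    @classmethod
    def from_smiles(
        cls, records: List[Tuple[str, float]], representation: Representation
    ) -> "TypedDataset":
        preprocess = PREPROCESSORS[representation]
        inputs = [preprocess(smiles) for smiles, _ in records]
        targets = [y for _, y in records]
        return cls(representation, inputs, targets)


class TypedModel:
    """Hypothetical model that declares which representation it expects."""

    def __init__(self, expects: Representation) -> None:
        self.expects = expects

    def check(self, dataset: TypedDataset) -> None:
        # Fail early if the dataset was preprocessed for a different model type.
        if dataset.representation is not self.expects:
            raise TypeError(
                f"model expects {self.expects.value!r} inputs, "
                f"got {dataset.representation.value!r}"
            )


records = [("CCO", 1.0), ("CN", 0.5)]
graph_ds = TypedDataset.from_smiles(records, Representation.GRAPH)
TypedModel(Representation.GRAPH).check(graph_ds)  # passes silently
```

Storing the flag on the dataset, rather than inferring it from the inputs, keeps the compatibility check cheap and explicit. Whether the flag should be an enum, a string, or a class hierarchy is left open here.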

dnguyen1196 commented 4 years ago

Potentially, we might want to do something like this paper, in which the input has both the graph representation and a junction tree representation. (I dug into the code, and it's possible to process molecules into junction trees either on the fly or as part of preprocessing.)

https://arxiv.org/pdf/2006.12179.pdf
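
The choice between on-the-fly and up-front processing mentioned above could be exposed as a single switch on the dataset. The sketch below is only an illustration under assumptions: `TransformedDataset`, its `precompute` flag, and `both_views` are hypothetical names, and `both_views` is a placeholder that does not actually build a graph or a junction tree.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence


class TransformedDataset:
    """Hypothetical wrapper that applies a transform eagerly or lazily."""

    def __init__(
        self,
        items: Sequence[Any],
        transform: Callable[[Any], Any],
        precompute: bool = False,
    ) -> None:
        self._items = list(items)
        self._transform = transform
        # precompute=True: run the transform once per item as a preprocessing step.
        # precompute=False: defer the transform until an item is accessed.
        self._cache: Optional[List[Any]] = (
            [transform(x) for x in self._items] if precompute else None
        )

    def __len__(self) -> int:
        return len(self._items)

    def __getitem__(self, idx: int) -> Any:
        if self._cache is not None:
            return self._cache[idx]
        return self._transform(self._items[idx])


def both_views(smiles: str) -> Dict[str, Any]:
    # Placeholder: a real version would return an actual molecular graph
    # and an actual junction tree for the molecule.
    return {
        "graph": ("graph", smiles),
        "junction_tree": ("junction_tree", smiles),
    }


lazy = TransformedDataset(["CCO", "CN"], both_views)                    # on the fly
eager = TransformedDataset(["CCO", "CN"], both_views, precompute=True)  # preprocessing
```

Precomputing trades memory and start-up time for faster access during training, while the lazy path redoes the work on every access. Which is preferable depends on how expensive the junction tree step turns out to be.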

yuanqing-wang commented 3 years ago

https://docs.google.com/document/d/1Yp4qZ-9U1kPQI3upwDzJrys8Gjr0F92O_8PVrK-UfqU/edit

design doc @miretchin @karalets @dnguyen1196