lamalab-org / chem-caption

MIT License

translation between representation #27

Open kjappelbaum opened 1 year ago

kjappelbaum commented 1 year ago

needs a bit of thinking for the best design

but, in principle, we can make many pre-training tasks by translating between representations

could also be the question: are X and Y the same molecule? Where X and Y are in different representations (or randomized SMILES)
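A rough sketch of how translation tasks could be enumerated for one molecule, assuming its representations are already computed. The record layout, the `translation_tasks` name, and the prompt wording are illustrative assumptions, not part of chem-caption:

```python
from itertools import permutations

# Hypothetical record: one molecule (ethanol) with precomputed representations.
ETHANOL = {
    "smiles": "CCO",
    "selfies": "[C][C][O]",
    "inchi": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
}


def translation_tasks(representations):
    """Return one (prompt, target) pair per ordered pair of representations."""
    tasks = []
    # permutations yields every ordered (source, target) pair of distinct keys.
    for source, target in permutations(representations, 2):
        prompt = (
            f"Translate this {source} string to {target}: "
            f"{representations[source]}"
        )
        tasks.append((prompt, representations[target]))
    return tasks


tasks = translation_tasks(ETHANOL)
```

With three representations this gives six ordered tasks per molecule, so the number of tasks grows quadratically with the number of representations supported.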

Arkhymadhe commented 1 year ago

I need more clarity here. Are you talking about a featurizer that converts from one representation to another (e.g., SELFIES to SMILES), or a featurizer that takes in two molecular representations and tells whether they represent the same molecule?

Arkhymadhe commented 1 year ago

> could also be the question: are X and Y the same molecule? Where X and Y are in different representations (or randomized SMILES)

I think the Comparator API may be of use here.

kjappelbaum commented 1 year ago

yeah, so the basic thing might be solved with a comparator. I think the question here is rather where the sampling will be implemented.
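To make the sampling question concrete, here is a minimal sketch of drawing balanced "same molecule?" pairs from a set of molecules. Everything in it is an assumption for illustration: the toy `MOLECULES` mapping, the `sample_pairs` name, and the alternating positive/negative scheme. It does not use or describe the existing Comparator API.

```python
import random

# Hypothetical toy dataset: each molecule maps to several equivalent strings
# (two SMILES spellings and one SELFIES string per molecule).
MOLECULES = {
    "ethanol": ["CCO", "OCC", "[C][C][O]"],
    "methanol": ["CO", "OC", "[C][O]"],
    "propane": ["CCC", "C(C)C", "[C][C][C]"],
    "dimethyl ether": ["COC", "O(C)C", "[C][O][C]"],
}


def sample_pairs(molecules, n_pairs, seed=0):
    """Return (x, y, is_same) triples, alternating positives and negatives."""
    rng = random.Random(seed)  # local RNG so sampling is reproducible
    names = sorted(molecules)
    pairs = []
    for i in range(n_pairs):
        if i % 2 == 0:
            # Positive: two different strings of the same molecule.
            name = rng.choice(names)
            x, y = rng.sample(molecules[name], 2)
            pairs.append((x, y, True))
        else:
            # Negative: one string each from two different molecules.
            name_x, name_y = rng.sample(names, 2)
            x = rng.choice(molecules[name_x])
            y = rng.choice(molecules[name_y])
            pairs.append((x, y, False))
    return pairs


pairs = sample_pairs(MOLECULES, n_pairs=10)
```

Negatives can only be drawn once more than one molecule is available, which is one reason the sampling has to live at the dataset level and not inside a per-molecule featurizer.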

Arkhymadhe commented 1 year ago

> yeah, so the basic thing might be solved with a comparator. I think the question here is rather where the sampling will be implemented.

Could you elaborate on this?

kjappelbaum commented 12 months ago

> Could you elaborate on this?

you will only get a meaningful signal if you run this on a set of molecules. And then you have the questions:

kjappelbaum commented 8 months ago

could we build something based on the Comparator API?

Arkhymadhe commented 8 months ago

> could we build something based on the Comparator API?

Yes, I'm sure we can. The Comparator exists already, so the harder part is done. All I'd need to figure out is the basis of the comparison.

Arkhymadhe commented 8 months ago

I think I have a basic solution worked out for this. Expect a PR tomorrow.