sustainable-processes / ORDerly

Chemical reaction data & benchmarks. Extraction and cleaning of data from Open Reaction Database (ORD)
MIT License
66 stars 7 forks

Enable on-the-fly fingerprint generation #96

Closed marcosfelt closed 1 year ago

marcosfelt commented 1 year ago

When using fingerprints of size 16k for reaction condition prediction, we can easily saturate RAM if we load the whole dataset into memory at once. Therefore, we want to generate fingerprints on the fly, one batch at a time.

The idea is to adapt the data generator from this blog post to our case.
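
A rough sketch of what such a generator could look like, under a few assumptions: the function names (`toy_fingerprint`, `batch_generator`) and the SMILES/label inputs are hypothetical, and the hashed-trigram featurizer is only a stand-in so the example runs with the standard library alone. In practice the `featurize` argument would be swapped for a real fingerprint function. The point is just that only one batch of 16k-length vectors lives in memory at a time.

```python
import hashlib
from typing import Callable, Iterator, List, Sequence, Tuple

FP_SIZE = 16384  # the "16k" fingerprint length mentioned above


def toy_fingerprint(smiles: str, size: int = FP_SIZE) -> List[int]:
    """Hypothetical stand-in for a real fingerprint.

    Hashes each character trigram of the SMILES string to one bit position.
    This is NOT a chemically meaningful fingerprint; it only mimics the shape.
    """
    bits = [0] * size
    for i in range(max(len(smiles) - 2, 1)):
        chunk = smiles[i:i + 3]
        digest = hashlib.md5(chunk.encode("utf-8")).digest()
        bits[int.from_bytes(digest[:4], "big") % size] = 1
    return bits


def batch_generator(
    smiles: Sequence[str],
    labels: Sequence[int],
    batch_size: int,
    featurize: Callable[[str], List[int]] = toy_fingerprint,
) -> Iterator[Tuple[List[List[int]], List[int]]]:
    """Yield (fingerprints, labels) one batch at a time.

    Fingerprints are computed lazily inside the loop, so the full
    (n_samples x FP_SIZE) matrix is never materialised.
    """
    if len(smiles) != len(labels):
        raise ValueError("smiles and labels must have the same length")
    for start in range(0, len(smiles), batch_size):
        stop = start + batch_size
        batch_x = [featurize(s) for s in smiles[start:stop]]
        batch_y = list(labels[start:stop])
        yield batch_x, batch_y


if __name__ == "__main__":
    example_smiles = ["CCO", "c1ccccc1", "CC(=O)O", "CCN", "O"]
    example_labels = [0, 1, 0, 1, 0]
    for x, y in batch_generator(example_smiles, example_labels, batch_size=2):
        # Each batch holds at most batch_size vectors of length FP_SIZE.
        print(len(x), len(x[0]), y)
```

Only the compact inputs (here, SMILES strings and labels) stay in memory for the whole run; the trade-off is that fingerprints are recomputed every epoch, so training becomes more CPU-bound.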