bioinf-jku / FCD

Fréchet ChemNet Distance: A quality measure for generative models for molecules
GNU Lesser General Public License v3.0

Reduce memory footprint #16

Closed hogru closed 6 months ago

hogru commented 6 months ago

Context

I used the new version (1.2) of fcd to calculate the FCD of > 1M molecules.

Issue

In this scenario, my Python script crashed due to a large memory footprint (> 50 GB) on both macOS and Linux. This might be a local issue and may depend on the specific PyTorch version, among other things.

Suggested resolution

I amended the code in two ways:

(1) Used a different context manager for inference; this does not solve the issue, but was done in addition to ...

(2) Cast the inference result to numpy float32, which reduced the memory footprint. I am not sure why this works, since the data type is already float32 without the corresponding line, and the result should not carry any additional data, such as gradients.
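For readers who want to try the same idea, here is a minimal sketch of the two changes. It is not the actual fcd code: `embed_in_batches`, the toy `nn.Linear` model, and the batch size are hypothetical stand-ins, and `torch.inference_mode()` is only an assumption about which context manager was meant (it requires PyTorch 1.9 or later).

```python
import numpy as np
import torch
from torch import nn


def embed_in_batches(model, data, batch_size=256):
    """Run inference batch by batch and collect float32 numpy arrays.

    Hypothetical helper for illustration; not part of the fcd package.
    """
    model.eval()
    chunks = []
    # (1) Inference-only context manager: no autograd state is recorded.
    with torch.inference_mode():
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            out = model(batch)
            # (2) Explicit cast to numpy float32. astype() copies by default,
            # so the stored array does not share memory with the tensor.
            chunks.append(out.cpu().numpy().astype(np.float32))
    return np.concatenate(chunks, axis=0)


# Toy stand-ins for the real network and the encoded molecules (assumptions).
toy_model = nn.Linear(8, 4)
toy_input = torch.randn(1000, 8)

activations = embed_in_batches(toy_model, toy_input)
print(activations.shape, activations.dtype)
```

Whether the explicit cast helps in a given setup probably depends on the PyTorch version and platform, so it is worth measuring peak memory before and after.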

I calculated the FCD for smaller molecule sets of size 100,000 to check whether the FCD value remains the same, which it did in my experiments.

In summary, I consider this to be a minor change which helps to alleviate memory problems, at least in certain configurations.

renzph commented 6 months ago

Hey Stephan. Thank you so much for your input.

I changed this in the new version at https://github.com/bioinf-jku/FCD/blob/f806d583cbf3f2ff0f0843f23813a4f053535404/fcd/fcd.py#L79