Generator update #273 (Probe-Particle/ppafm)

NikoOinonen commented 5 months ago

Update to the machine learning batch generator.

The Hartree generator is modified to work with all of the different force field models in a flexible way.
Cleaned up the generator example scripts.
Improved the performance of force field generation by eliminating some unnecessary device-to-host copies.

codecov[bot] commented 5 months ago

Codecov Report

Attention: Patch coverage is 64.83051%, with 83 lines in your changes missing coverage. Please review.

Project coverage is 50.59%. Comparing base (9492233) to head (6a11956). Report is 2 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| ppafm/ml/Generator.py | 67.66% | 43 Missing :warning: |
| ppafm/ocl/field.py | 37.83% | 23 Missing :warning: |
| ppafm/ocl/AFMulator.py | 52.77% | 17 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #273      +/-   ##
==========================================
+ Coverage   46.44%   50.59%   +4.14%
==========================================
  Files          35       39       +4
  Lines        5180     5839     +659
==========================================
+ Hits         2406     2954     +548
- Misses       2774     2885     +111
```

| [Flag](https://app.codecov.io/gh/Probe-Particle/ppafm/pull/273/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Probe-Particle) | Coverage Δ | |
|---|---|---|
| [python-3.10](https://app.codecov.io/gh/Probe-Particle/ppafm/pull/273/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Probe-Particle) | `50.59% <64.68%> (+4.13%)` | :arrow_up: |
| [python-3.11](https://app.codecov.io/gh/Probe-Particle/ppafm/pull/273/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Probe-Particle) | `50.55% <64.68%> (+4.13%)` | :arrow_up: |
| [python-3.12](https://app.codecov.io/gh/Probe-Particle/ppafm/pull/273/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Probe-Particle) | `50.55% <64.68%> (+4.13%)` | :arrow_up: |
| [python-3.7](https://app.codecov.io/gh/Probe-Particle/ppafm/pull/273/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Probe-Particle) | `50.42% <64.68%> (+4.17%)` | :arrow_up: |
| [python-3.9](https://app.codecov.io/gh/Probe-Particle/ppafm/pull/273/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Probe-Particle) | `50.49% <64.68%> (+4.15%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Probe-Particle#carryforward-flags-in-the-pull-request-comment) to find out more.


NikoOinonen commented 5 months ago

@ProkopHapala

> When you talk about "performance upgrades", did you measure the improvement? Is it noticeable?

It is quite noticeable. I tested on a number of small molecules and saw a reduction of roughly 20-40% in force-field generation time.

I'll mention this here as well. While timing the generation process, I found that on the first simulation for each sample, copying the potential and density to the device takes almost as long as the whole force-field calculation (this is not an issue when doing multiple rotations of the same molecule). To counter this, I tried to build a system that loads the next sample into device memory asynchronously, so that it is already there when the next simulation starts.
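The one-sample-ahead idea can be sketched in plain Python as follows. This is a minimal illustration and not the ppafm implementation: `prefetch_one_ahead` and `load_to_device` are hypothetical names, and a background host thread stands in for the asynchronous device copy.

```python
from concurrent.futures import ThreadPoolExecutor


def prefetch_one_ahead(samples, load_to_device):
    """Yield loaded samples in order while the next one loads in the background.

    `load_to_device` is a hypothetical callable that takes a raw sample and
    returns whatever handle the simulation needs (for example device buffers).
    """
    samples = iter(samples)
    with ThreadPoolExecutor(max_workers=1) as pool:
        try:
            # Start loading the very first sample.
            pending = pool.submit(load_to_device, next(samples))
        except StopIteration:
            return  # no samples at all
        for upcoming in samples:
            ready = pending.result()  # wait for the current sample
            # Kick off the next load before handing the current one out,
            # so it overlaps with the caller's simulation.
            pending = pool.submit(load_to_device, upcoming)
            yield ready
        yield pending.result()  # last sample has nothing to overlap with


if __name__ == "__main__":
    # Stand-in loader: upper-casing a string plays the role of the copy.
    for loaded in prefetch_one_ahead(["mol_a", "mol_b", "mol_c"], str.upper):
        print(loaded)
```

Whether the overlap actually hides the copy time depends on how the underlying transfer behaves, which is the point of the next paragraph.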

However, this turned out to be problematic because an enqueue operation in OpenCL is not guaranteed to be fully asynchronous. Specifically, on Nvidia hardware the enqueue seems to take almost as long as the copy operation itself, even when explicitly set as non-blocking, whereas on AMD the enqueue seems to return immediately.
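One way to check this behaviour on a given platform is to time the enqueue call separately from the wait for completion. The sketch below is an assumption about how such a measurement could be set up, not the code used for the numbers above: `time_enqueue` and `time_opencl_copy` are hypothetical helpers, and only `pyopencl.enqueue_copy` (with `is_blocking=False`) and `Event.wait` are actual pyopencl calls.

```python
import time


def time_enqueue(enqueue, wait):
    """Return (seconds until enqueue() returned, seconds until completion)."""
    t0 = time.perf_counter()
    handle = enqueue()  # should return quickly if truly asynchronous
    t_enqueue = time.perf_counter() - t0
    wait(handle)  # block until the operation has finished
    t_total = time.perf_counter() - t0
    return t_enqueue, t_total


def time_opencl_copy(queue, device_buffer, host_array):
    """Time a non-blocking host-to-device copy (needs pyopencl and a device)."""
    import pyopencl as cl  # imported lazily so time_enqueue works without it

    return time_enqueue(
        lambda: cl.enqueue_copy(queue, device_buffer, host_array, is_blocking=False),
        lambda event: event.wait(),
    )
```

If `t_enqueue` is close to `t_total`, the enqueue is effectively blocking on that platform. If it is close to zero, the copy can overlap with other host work.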
