Critically slow loop in _dipole_correction_init

tyst3273 commented 1 year ago

I don't guess this is a bug, but the loop over 'max_shells' in ForceConstants._dipole_correction_init() is so slow as to be unusable ... is something wrong, or is this expected behavior? I have absolutely no idea whats going inside this part of the code, so I can't try to diagnose it myself! Sorry for my ignorance. (NOTE: you have to use the corrected version of the phonopy.py reader file, see the other issue: https://github.com/pace-neutrons/Euphonic/issues/276

Additional info: My calculation has 1296 atoms. It is from a DFPT calculation with Abinit that I converted into Phonopy force constants using Abipy. The data are read using the 'from_phonopy' method.

The dispersions from Phonopy take several seconds to calculate and are as expected.

See the attachment for the working example. I am using v1.2.0 installed with conda.

test.zip

ajjackson commented 1 year ago

Hi Ty,

I also found slow performance on this for a large system recently and discussed with @krefson . It looks like there is some room for optimisation!

tyst3273 commented 1 year ago

Thanks for the reply. An update: it looks like _dipole_correction_init() is accidentally(?) called twice. Once when the data are read from phonopy and then again when calculate_qpoint_phonon_modes() is called.

On line 1785 in force_constants.py, _dipole_correction_init is called but is assigned to a local variable dipole_data. Later, on line 577, it is checked if ForceConstants has the attribute _dipole_init_data. Since it doesn't (its not defined anywhere else), _dipole_correction_init is called again.

Should _dipole_init_data be assigned the first time _dipole_correction_init is called? Since this methods takes so much time, I am trying to avoid calling it twice by mistake!

Thanks!

ajjackson commented 1 year ago

I'd have to check more closely to see if something is going wrong, but there is a logical reason it would be called twice in this situation: Phonopy uses a different force constants convention (in common with some other codes including Quantum Espresso) from Euphonic (which shares its convention with CASTEP).

In the first convention the long-range contribution is left in the force constants; at the point of performing Fourier interpolation, the long-range components are calculated and subtracted to obtain the "short-ranged" force constants.

In the second convention, the subtraction takes place before writing/storing/exchanging force constants.

So when Euphonic imports force constants from Phonopy, it needs to compute the long-ranged part and subtract to obtain the short-ranged force constants that make a valid, serialisable ForceConstants object. You might write it to a JSON file at this point, so it can't be deferred to later. (Well, except by clever lazy-evaluation magic... but conceptually it is already short-range data.)

Then when we go to perform Fourier interpolation, the long-range dipole model is needed to get the correct dispersion.

That said, It does seem like a good idea to cache the _dipole_correction_init() data on import so that if one does import and then interpolate the data in one Python session (a common use-case) the second init can be avoided. Indeed, there is even an attribute to store it...

krefson commented 1 year ago

Alin suggested you take a look at ScaFaCos http://www.scafacos.de/ which may be what you need in the first instance to see if advanced methods will save time.

Jacob

From: Keith Refson @.> Sent: Tuesday, June 20, 2023 2:19:04 PM To: @. @.> Cc: Perring, Toby (STFC,RAL,ISIS) @.>; Wilkins, Jacob (STFC,RAL,SC) @.***> Subject: Re: [pace-neutrons/Euphonic] Critically slow loop in _dipole_correction_init (Issue #277)

Dear Adam,

Following our discussion this morning I can see at force_constants:py line 1011 where the lambda parameter is set from the 1/6th power of the number of atoms that Becky already coded the automatic determination of the real/reciprocal space sum split. However the effectiveness of this depends on the ratio of compute time of the real- and reciprocal space kernels, and this has not been tuned. This may be done using the 4th argument to dipole_correction_init(), "dipole_parameter" whose consequences for the overall timing might be usefully explored.

Keith

On 19/06/2023 16:16, Adam J. Jackson wrote:

I'd have to check more closely to see if something is going wrong, but there is a logical reason it would be called twice in this situation: Phonopy uses a different force constants convention (in common with some other codes including Quantum Espresso) from Euphonic (which shares its convention with CASTEP).

In the first convention the long-range contribution is left in the force constants; at the point of performing Fourier interpolation, the long-range components are calculated and subtracted to obtain the "short-ranged" force constants.

In the second convention, the subtraction takes place before writing/storing/exchanging force constants.

So when Euphonic imports force constants from Phonopy, it needs to compute the long-range terms to obtain the short-ranged force constants that make a valid, serialisable ForceConstants object. You might write it to a JSON file at this point, so it can't be deferred to later. (Well, except by clever lazy-evaluation magic... but conceptually it is already short-range data.)

Then when we go to perform Fourier interpolation, the long-range dipole model is needed to get the correct dispersion.

That said, It does seem like a good idea to cache the _dipole_correction_init() data on import so that if one does import and then interpolate the data in one Python session (a common use-case) the second init can be avoided. Indeed, there is even an attribute to store it...

— Reply to this email directly, view it on GitHubhttps://github.com/pace-neutrons/Euphonic/issues/277#issuecomment-1597367130, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAGI3ZH4ILT2VT4HJYXO6PTXMBUL5ANCNFSM6AAAAAAZJWA6KM. You are receiving this because you were mentioned.Message ID: @.***>

ajjackson commented 1 year ago

That does look very interesting. A few potential hazards:

the Python bindings don't seem to be documented
epsilon seems to be restricted to a scalar float
not immediately clear if/which methods can provide the second-derivatives we're after; docs indicated that we get energy and sometimes "virials"
only a few methods support non-cubic geometry

pace-neutrons / Euphonic

Critically slow loop in _dipole_correction_init #277