atcollab / at

Accelerator Toolbox
Apache License 2.0

Frequency Analysis #556

Closed oscarxblanco closed 1 year ago

oscarxblanco commented 1 year ago

Dear all, dear @lfarv , @lcarver , @carmignani , @simoneliuzzo ,

I have a first version of the frequency analysis function for your review. It performs the frequency analysis of a ring using the PyNAFF library and pyat parallel tracking (patpass). The only required argument is a ring; the other defaults can be adjusted as needed.

The function returns two arrays: first, an array containing [xoffset, yoffset, nux, nuy, log10(|dnu|)/turns] for every particle that survives a given number of turns with a defined frequency; second, the particle losses if the lossmap flag was set, or an empty array otherwise.

During the integration with pyat I came across a deprecated use of numpy.Complex inside the PyNAFF library, which limits compatibility to numpy versions up to 1.23.5.

This pull request is related to issues #522 , #552 and #553 .

Best regards, o

function usage example

print('FMAP in 4D')
ring.disable_6d()
xy_tune_nudiff_array, plosses_array = fmap_parallel_track(
    ring, coords=[-8, 8, -3, 3], steps=[100, 100],
    scale='linear', lossmap=False, verbose=False)
lcarver commented 1 year ago

Thanks a lot!

Before I dive into it and provide a review, my first comment is about PyNAFF.

We have included within PyAT a python implementation of the SUSSIX algorithm which is called harmonic analysis (written originally by J.Coello de Portugal of CERN around 2016).

You can find the functions in at/pyat/at/physics/harmonic_analysis.py and details are within.

We stripped down some of the functionalities of his original implementation to just provide basic tune analysis from a provided centroid. Are there any functionalities missing from harmonic analysis that prevent you from replacing PyNAFF with it? I think this would be preferable, as it means we don't have to carry an additional dependency on an external module.
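For orientation, the kind of result such a centroid tune analysis returns can be illustrated with a standalone toy stand-in (this is not the actual harmonic_analysis code, just a plain FFT peak estimate on a clean signal):

```python
import numpy as np

def fft_tune(signal):
    """Toy stand-in for a get_tunes_harmonic-style call: return the
    fractional tune as the location of the highest FFT peak."""
    centered = signal - np.mean(signal)      # remove the DC component
    spectrum = np.abs(np.fft.rfft(centered))
    peak = np.argmax(spectrum[1:]) + 1       # skip the DC bin
    return peak / len(signal)

# A test signal with a known fractional tune of 0.31
turns = np.arange(1024)
x = 1e-4 * np.cos(2 * np.pi * 0.31 * turns)
print(round(fft_tune(x), 2))  # → 0.31
```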

oscarxblanco commented 1 year ago

Good, I will have a look at the harmonic_analysis function by J.Coello de Portugal.

From PyNAFF I use only the returned fundamental frequency. It is a minimal use of the library, but I decided to include it for several reasons. First, when given an array without a well-defined frequency it returns an empty array, which lets me simply skip the result (with a 'continue' statement). Second, although I don't have the results at hand, I have the impression that it was more precise than other libraries. Third, it has a well-defined GPLv3 licence, so I could modify and reuse it if necessary. Finally, a short performance benchmark against the Julia implementation of NAFF ( https://github.com/nkarast/NAFF.jl ) gave similar speed.

I also tried another implementation, https://pypi.org/project/NAFFlib , but at the time it seemed less robust, as it sometimes returned errors, and less precise when compared to AT MATLAB results.

So I chose PyNAFF for its licence, precision and robustness in the simple tests I did while writing the code.

Best regards, o

lfarv commented 1 year ago

@oscarxblanco : the reasons you give for the choice of PyNAFF make sense, and having a good NAFF implementation available in PyAT may be useful. I have, however, a question concerning the compatibility with numpy: for now, I think that staying one version behind the latest is not a problem, but that must remain a temporary situation. Is the PyNAFF project still active? It has been dormant for about 2 years… Can we expect an update solving the "Complex" problem? I see you raised the question a few days ago… The correction looks simple.
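For context, the "Complex" problem is the removal of NumPy's long-deprecated `np.Complex` alias in numpy 1.24. I have not checked PyNAFF's exact source lines, so the array name below is illustrative, but the generic one-line fix looks like:

```python
import numpy as np

# Old-style allocation (breaks on numpy >= 1.24, where the alias was removed):
#     FR = np.zeros(n, dtype=np.Complex)   # AttributeError
# The alias was simply the builtin complex, so the fix is:
FR = np.zeros(4, dtype=complex)            # or, explicitly, dtype=np.complex128
print(FR.dtype)  # complex128
```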

Otherwise, we may need in the future to fork another branch, not a very elegant solution…

In the end, whatever the choice between the present "harmonic_analysis" implementation and PyNAFF, we should keep only one.

lfarv commented 1 year ago

@oscarxblanco : I get 88 syntax warnings (trailing backslashes or semicolons) and 317 non-conformities with the PEP8 style guide. Could you do some cleaning ?

oscarxblanco commented 1 year ago

@lfarv Those are a lot of warnings! Yes, I will do the cleaning. Is there any tool you would suggest for checking the code?

@lfarv With respect to PyNAFF, neither of the two developers is active on GitHub, so I don't expect updates to the project. The installation method, using setup.py, might also be deprecated very soon. It seems to me that either the PyNAFF sources are adopted into pyat, or I use another library.

If you are already convinced that harmonic_analysis does the task, it would be easier to just change the library. I'll have a look at the harmonic_analysis library and get back to you.

lfarv commented 1 year ago

@oscarxblanco:

If you are already convinced that harmonic_analysis does the task, it would be easier to just change the library. I'll have a look at the harmonic_analysis library and get back to you.

No, I'm not convinced! Looking at the code, I very much prefer PyNAFF. It's a single module, so if the license allows copying and modifying, I would indeed adopt it. Do you know how to do that in a clean way (PyAT itself is not licensed!)?

About warnings and code style: I'm using the PyCharm IDE, which gives all this information while writing code, but there are many IDEs doing the same (VSCode, ...). For a dedicated verification tool, there is flake8, which is used in the GitHub test sequence (you can look at the GitHub test results).

lfarv commented 1 year ago

Another comment: having a module name identical to a function/class name creates problems. Once the function name is imported into the containing package namespace (at.physics here), the module becomes inaccessible: at.physics.fmap_parallel_track now means the function and hides the at.physics.fmap_parallel_track.py module. This has no consequence anywhere in code, but it does for the Sphinx documentation generator: it cannot document the module unless we do it manually. So I would prefer you rename one of the two.
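The shadowing mechanism can be illustrated with throwaway names (`pkg` and `fmap` below are invented; the real case is at.physics and fmap_parallel_track):

```python
import types

mod = types.ModuleType('pkg.fmap')      # stands for the fmap_parallel_track.py module

def fmap():                             # the function of the same name
    return 'function result'

mod.fmap = fmap                         # the module exports the function
pkg = types.ModuleType('pkg')           # stands for the at.physics package

pkg.fmap = mod                          # submodule reachable as pkg.fmap
print(type(pkg.fmap))                   # <class 'module'>

# "from .fmap import fmap" in pkg/__init__.py rebinds the attribute:
pkg.fmap = mod.fmap
print(type(pkg.fmap))                   # <class 'function'> -- the module is hidden
```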

oscarxblanco commented 1 year ago

I did a simplified comparison of the speed and precision of PyNAFF and the J.Coello de Portugal (JCdP) library. Here is the difference between the returned fundamental frequency and the input frequency, for a variable sample length and two orthogonal signals (sin and cos).

[plot: PyNAFFvsJCdP]

With respect to speed, I get that JCdP is about 20 times faster:

  PyNAFF_time: 86.517541764013
  JCdP_time: 3.732474807999097

The behavior with random samples is also different. PyNAFF returns an empty array with a warning message asking to remove the DC component, while JCdP returns zero. If I pass a sin function and ask for higher-order components (they should be zero, but I wanted to see the result), PyNAFF returns a small value, while JCdP returns the fundamental. In JCdP's implementation, only the 'laskar' method gave a result; the 'fft' method gave me only the integer part of the tune and a residual of order 1e-1.

For the plot I used two orthogonal signals (sin and cos functions) for a fixed frequency and variable number of samples.

fsin = 54.321123456789987654
fcos = 12.345678998765432112
signalx = 10e-6*numpy.sin(2*numpy.pi*fsin*nss)
signaly = 20e-6*numpy.cos(2*numpy.pi*fcos*nss)
# and I pass the signals to both libraries, for example, the sin function:
FAsinJCdP[iter_index] = nts*get_tunes_harmonic(signalx, method='laskar',
                                               num_harmonics=1, hann=False)
FAsinPyNAFF[iter_index] = nts*PyNAFF.naff(signalx, nts, 1, 0, False)[0][1]
lfarv commented 1 year ago

Looking at the code of the 2 versions, I think that what is called the 'laskar' method in harmonic_analysis.py is not a NAFF method at all: it simply recursively subtracts the main frequency obtained with fft from the signal and starts again, up to the desired number of harmonics. It's simple and fast (your test shows it), but it's still basically an fft. It possibly does not give the accuracy and resistance to noisy data that NAFF is supposed to give…
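A minimal standalone sketch of the recursive scheme described above (peak detection plus subtraction; this is not the actual harmonic_analysis code):

```python
import numpy as np

def iterative_fft_harmonics(signal, n_harmonics):
    """Find the highest FFT peak, subtract that component from the
    signal, and repeat, up to the desired number of harmonics."""
    x = np.asarray(signal, dtype=float)
    x = x - np.mean(x)                                 # drop the DC component
    n = len(x)
    t = np.arange(n)
    tunes = []
    for _ in range(n_harmonics):
        spectrum = np.fft.rfft(x)
        k = int(np.argmax(np.abs(spectrum[1:]))) + 1   # dominant bin, skip DC
        tunes.append(k / n)
        # rebuild the found component and subtract it from the signal
        amp = spectrum[k] / n
        x = x - 2 * np.real(amp * np.exp(2j * np.pi * k * t / n))
    return tunes

turns = np.arange(1000)
sig = np.cos(2*np.pi*0.22*turns) + 0.3*np.cos(2*np.pi*0.355*turns)
print(iterative_fft_harmonics(sig, 2))  # → [0.22, 0.355]
```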

Your test shows that PyNAFF gives results 1 to 2 orders of magnitude closer to the true value than JCdP, already with few turns, but it does not include noise.

oscarxblanco commented 1 year ago

@lfarv I decided to redo the basic comparison without noise for a sin function of 100e-6 amplitude. This time I also include NAFFlib and normalize to the input frequency.

[plot: PyNAFFvsJCdPvsNAFFlib]

Adding -56 dB of uniform noise, equivalent to noise on the 9th bit (0.15e-6 amplitude), the response of all modules is similar.

[plot: PyNAFFvsJCdPvsNAFFlib_m28dB]

Adding a maximum of -3 dB of uniform noise, equivalent to noise on the 2nd bit (70e-6 amplitude), the response is as follows:

[plot: PyNAFFvsJCdPvsNAFFlib_m3dB]

With respect to speed, I get :

$ python timeit_PyNAFF_JCdP_NAFFlib.py 
  PyNAFF_time: 40.43559245800134
  JCdP_time: 2.5347508350387216
  NAFFlib_time: 4.345918281993363

Beyond that, all modules have difficulties recovering the frequency, and they have different default outputs.

I attach to this message the speed test script in python that I used. timeit_PyNAFF_JCdP_NAFFlib.py.txt
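The attached script is not reproduced here, but the timing pattern presumably follows the standard timeit recipe, sketched below with a hypothetical FFT-based analyser standing in for the three libraries:

```python
import timeit
import numpy as np

# Hypothetical stand-in for one analyser; in the real script this slot is
# filled in turn by PyNAFF.naff, get_tunes_harmonic and the NAFFlib call.
def fft_peak_tune(signal):
    s = np.abs(np.fft.rfft(signal - np.mean(signal)))
    return (np.argmax(s[1:]) + 1) / len(signal)

turns = np.arange(2048)
signal = 1e-4 * np.sin(2 * np.pi * 0.31 * turns)

# Time many repeated calls on the same signal, as in a timeit-based speed test.
elapsed = timeit.timeit(lambda: fft_peak_tune(signal), number=200)
print(f'fft_peak_tune: {elapsed:.3f} s for 200 calls')
```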

Best regards, o

swhite2401 commented 1 year ago

Dear all,

A comprehensive comparison between algorithms can be found in: https://journals.aps.org/prab/abstract/10.1103/PhysRevAccelBeams.19.054001

In this reference, SUSSIX (on which harmonic_analysis is based) and NAFF give similar results in the absence of noise, while with noise all methods are more or less equivalent. It would be good to understand what was stripped down from SUSSIX that explains the loss in precision in the absence of noise.

However, it seems that in the presence of noise (which is the case we are interested in) all the benefit is lost. I would therefore strongly advise keeping the fastest method; a factor of 20 is a big deal when you have to analyse large data sets and cannot afford a cluster.

lfarv commented 1 year ago

Very interesting article.

It would be good to understand what was stripped down from SUSSIX that explains the loss in precision in the absence of noise

Looking at the harmonic_analysis code, as long as only the main harmonic is considered, it's a simple fft with an interpolation over the 3 highest peaks (lines 42 - 42). All the rest deals with the following harmonics, which are not used in the two cases we are considering for the moment.
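This kind of peak refinement can be sketched with a generic parabolic (three-bin) interpolation; the exact formula in harmonic_analysis.py may differ:

```python
import numpy as np

def interpolated_tune(signal):
    """FFT peak refined by a parabolic fit over the peak bin and its two
    neighbours, in the spirit of the interpolation described above."""
    x = np.asarray(signal, dtype=float)
    x = x - np.mean(x)                              # drop the DC component
    n = len(x)
    s = np.abs(np.fft.rfft(x))
    k = int(np.argmax(s[1:-1])) + 1                 # peak bin, away from edges
    # parabola through (k-1, k, k+1) shifts the peak by delta bins
    delta = 0.5 * (s[k-1] - s[k+1]) / (s[k-1] - 2*s[k] + s[k+1])
    return (k + delta) / n

turns = np.arange(1000)
sig = np.cos(2 * np.pi * 0.2173 * turns)    # tune falls between two FFT bins
print(round(interpolated_tune(sig), 3))     # → 0.217
```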

So I agree with @swhite2401, the fastest method is the best choice.

In the future, if we have to extract additional frequencies with small amplitudes compared to the main one, one may raise the question again (and this is not checked in the article).

oscarxblanco commented 1 year ago

@swhite2401 Thank you for the article, it is very illustrative of the effect of noise on the tune calculation. I doubt, however, that it is a good reference to compare against, as the results refer to a particular application of the algorithms and not to the numerical precision of the implementations.

In any case, I'll make the changes to implement the frequency map with JCdP's algorithm and have a look at the result.

Best regards, o

swhite2401 commented 1 year ago

Fig. 5 and Fig. 6 are considering simple cosine functions (see Eq. 18 and 19), I think it is very similar to what you have done.

@lcarver is the Hanning window applied in harmonic_analysis? If not, this may explain some of the discrepancies, and we may consider introducing it; it is quite simple.

oscarxblanco commented 1 year ago

@swhite2401 Yes, it is similar. And it is the result when considering noise.

swhite2401 commented 1 year ago

@oscarxblanco the option hann is there in harmonic_analysis but set to False by default. It could be interesting to evaluate its impact on the resolution.

swhite2401 commented 1 year ago

Yes, it is similar. And it is the result when considering noise.

Ah ok, got it! In fact there is no data for the case without noise, sorry I scrolled too quickly -> this just confirms that in the presence of noise both algorithms are equivalent.

oscarxblanco commented 1 year ago

@swhite2401 I made a test with the following call, setting hann=True and hann=False, but the result is the same.

from harmonic_analysis import get_tunes_harmonic
get_tunes_harmonic(xReadings, method='laskar', num_harmonics=1, hann=True)
swhite2401 commented 1 year ago

I made a test with the following call, setting hann=True and hann=False, but the result is the same.

Ok thanks!

lfarv commented 1 year ago

@oscarxblanco , @swhite2401

I made a test with the following call, setting hann=True and hann=False, but the result is the same.

Of course, because whatever you put in hann, no window is applied with the so-called 'laskar' method. Which is, by the way, not related to Laskar at all; it's a simple fft!

But indeed, windowing should improve the results. This code is poorly written and should be rewritten…

swhite2401 commented 1 year ago

But indeed, windowing should improve the results. This code is poorly written and should be rewritten…

Maybe it should, but I am not sure anyone would be very motivated to do it… unless you would like to attack this? On the other hand, properly implementing the Hanning window in all modes seems like a relatively simple upgrade.

lfarv commented 1 year ago

properly implementing the hanning window in all modes seems like a relatively simple upgrade.

When restricted to the main harmonic, it's true. When more harmonics are requested (default is 20), it's not: this is done by subtracting the previously found harmonic from the signal. If a window is applied, it's complicated.

For now, we could take harmonic_analysis as it is (no window): it's fast and reasonably accurate. That means modifying fmap_parallel_track.py to use get_tunes_harmonic and removing the dependency on PyNAFF.

Later, one can still upgrade get_tunes_harmonic to implement windowing or other methods; it will be a single function to modify.

Splitting into 2 steps allows this PR to be merged so that frequency maps are operational, and leaves the possibility to improve the tune detection later.

swhite2401 commented 1 year ago

When more harmonics are requested (default is 20), it's not

Well, you may use the raw signal for the subtractions and apply the windowing only to get the harmonic frequency/phase/amplitude. Anyway, I agree that this does not belong in this PR.
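That bookkeeping can be sketched as follows, assuming a plain Hann window and a direct DFT fit are acceptable stand-ins for the real code:

```python
import numpy as np

def windowed_tunes(signal, n_harmonics):
    """Estimate each frequency on a Hann-windowed copy, but subtract the
    fitted component from the raw (unwindowed) signal, as suggested above."""
    x = np.asarray(signal, dtype=float)
    x = x - np.mean(x)                               # drop the DC component
    n = len(x)
    t = np.arange(n)
    window = np.hanning(n)
    tunes = []
    for _ in range(n_harmonics):
        spectrum = np.fft.rfft(x * window)           # window only for peak finding
        k = int(np.argmax(np.abs(spectrum[1:]))) + 1
        tunes.append(k / n)
        # fit amplitude/phase of the found line on the raw signal, then subtract
        amp = np.sum(x * np.exp(-2j * np.pi * k * t / n)) / n
        x = x - 2 * np.real(amp * np.exp(2j * np.pi * k * t / n))
    return tunes

turns = np.arange(1000)
sig = np.cos(2*np.pi*0.25*turns) + 0.4*np.cos(2*np.pi*0.1*turns)
print(windowed_tunes(sig, 2))  # → [0.25, 0.1]
```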

oscarxblanco commented 1 year ago

Dear all, as a last qualitative example I leave here an image of two diffusion maps over a lattice. The loss in numerical precision reduces the clarity of some traces.

[screenshot: 2023-02-15 16-44-19]

This time I won't be able to provide the lattice I used, but here is the call to the function:

xy_tune_nudiff_array, plosses_array = fmap_parallel_track(
    ring,
    coords=[-8, 8, -3, 3],
    steps=[200, 200],
    scale='linear',
    lossmap=True,
    verbose=False)

Best regards, o

oscarxblanco commented 1 year ago

Another comment: having a module name identical to a function/class name creates problems. Once the function name is imported into the containing package namespace (at.physics here), the module becomes inaccessible: at.physics.fmap_parallel_track now means the function and hides the at.physics.fmap_parallel_track.py module. This has no consequence anywhere in code, but it does for the Sphinx documentation generator: it cannot document the module unless we do it manually. So I would prefer you rename one of the two.

Dear @lfarv , are there any suggested names for the python file and function ?

Best regards, o

lfarv commented 1 year ago

@swhite2401:

Well you may use the raw signal for the subtractions and apply the windowing only to get the harmonic frequency/phase/amplitude. Anyway I agree that this does not belong to this PR.

I suspect that the amplitude you get is affected by the windowing, so that you cannot use it on raw data, but I may be wrong…

@oscarxblanco: Very nice plots! Why not use frequency_maps.py for the module and keep your function name? Unless someone has a better idea… Otherwise, this branch is ok for me.

oscarxblanco commented 1 year ago

Dear @lfarv ,

the filename has been changed to frequency_maps.py in the last commits.

Best regards, o

swhite2401 commented 1 year ago

Hello @oscarxblanco, following some remarks by @simoneliuzzo I am fixing a few things in the tune calculation from tracking. In particular, when the particle used for the fft was lost (i.e. nan coordinates), the function get_tunes_harmonic crashed with a ValueError. Instead, get_tunes_harmonic will now return nan when the tracking fails. Is this ok with your implementation? Any suggestions on your side?
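The guard pattern can be sketched as follows; `safe_tune` below is a hypothetical stand-in, not the actual get_tunes_harmonic implementation:

```python
import numpy as np

def safe_tune(signal):
    """Hypothetical stand-in showing the guard described above: return
    nan for lost particles (nan coordinates) instead of raising."""
    signal = np.asarray(signal, dtype=float)
    if not np.all(np.isfinite(signal)):
        return np.nan                      # tracking failed: particle lost
    s = np.abs(np.fft.rfft(signal - np.mean(signal)))
    return (np.argmax(s[1:]) + 1) / len(signal)

turns = np.arange(512)
good = np.sin(2 * np.pi * 0.25 * turns)
lost = good.copy()
lost[300:] = np.nan                        # particle lost after turn 300
print(safe_tune(good))                     # → 0.25
print(np.isnan(safe_tune(lost)))           # → True
```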

oscarxblanco commented 1 year ago

@swhite2401 I checked the modifications and they have no impact on this branch.

I took the opportunity to add two more values to the output: dnux, dnuy. So, to recap, the function returns two arrays: first, for every particle that survives the given number of turns with a defined frequency, an array containing the transverse offsets, the tunes nux and nuy, the tune differences dnux and dnuy, and the diffusion log10(|dnu|)/turns; second, the particle losses if the lossmap flag was set, or an empty array otherwise.

Best regards, o