LumiSpy / lumispy

Luminescence data analysis with HyperSpy.
https://lumispy.org
GNU General Public License v3.0
26 stars 18 forks source link

Add centroid functionality #168

Closed jordiferrero closed 1 year ago

jordiferrero commented 1 year ago

Description of the change

Added a centroid/center of mass functionality to analyze spectra.

Progress of the PR

Minimal example of the new feature

import lumispy as lum
s = lum.signals.LumiSpectrum([[[1,2,3,2,1,0]]*3]*4)
# s --> LumiSpectrum <3,4|6>

ax = s.axes_manager.signal_axes[0]
ax.offset = 200
ax.scale = 100

com = s.centroid()
# com --> LumiSpectrum <3,4|1>
# com.data[0,0] = 400.
codecov-commenter commented 1 year ago

Codecov Report

Base: 100.00% // Head: 100.00% // No change to project coverage :thumbsup:

Coverage data is based on head (cdd3b37) compared to base (8235aae). Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #168 +/- ## ========================================= Coverage 100.00% 100.00% ========================================= Files 11 12 +1 Lines 503 545 +42 ========================================= + Hits 503 545 +42 ``` | [Impacted Files](https://codecov.io/gh/LumiSpy/lumispy/pull/168?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=LumiSpy) | Coverage Δ | | |---|---|---| | [lumispy/signals/luminescence\_spectrum.py](https://codecov.io/gh/LumiSpy/lumispy/pull/168?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=LumiSpy#diff-bHVtaXNweS9zaWduYWxzL2x1bWluZXNjZW5jZV9zcGVjdHJ1bS5weQ==) | `100.00% <100.00%> (ø)` | | | [lumispy/utils/signals.py](https://codecov.io/gh/LumiSpy/lumispy/pull/168?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=LumiSpy#diff-bHVtaXNweS91dGlscy9zaWduYWxzLnB5) | `100.00% <100.00%> (ø)` | | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=LumiSpy). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=LumiSpy)

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

jlaehne commented 1 year ago

For a cleaner history, could you squash all the fix commits? See https://hyperspy--3064.org.readthedocs.build/en/3064/dev_guide/git.html#keeping-the-git-history-clean

jlaehne commented 1 year ago

Nice functionality! I added a few first comments after having a quick look.

Do you have an idea, why the azure pipelines are failing?

As it is implemented (with a sum), the function works for uniform data axes only! That should be documented/checked. Alternatively, implement a version for non-uniform axes?

jordiferrero commented 1 year ago

I have now added all your feedback. I have included support for any axis (uniform and non-uniform). I have squashed the commits too... had to learn and hopefully I did not destroy anything while doing so. I also tried to fix Azure tests failing but is seems to be a problem not linked to this PR. I will open a new issue on that matter. Let me know what you think now. Thanks

jordiferrero commented 1 year ago

In HyperSpy, there is a find_peak_ohaver and I think that it would be useful to merge it with the function of this PR into a generic find_peak(), where method could be center_of_mass or ohaver?

I am well aware of find_peaks_ohaver(). I don't think centroid and o_haver are similar enough to be merged together. Centroid does not find peaks, it finds the center of mass of a peak, once you already know it exists... In fact, an implementation I wanted to add in the future is to be able to get centroids for spectra that have more than one peak (currently, you need to crop the signal so only one peak exists to get something physical). For this future implementation, peak_o_haver will be used to find the main peaks, then centroids for each peak will be calculated. That's why I don't think these 2 functions should be merged. Also, centroid analysis is very relevant for luminescence studies, that's why I'd prefer it to be in the LumiSpy package.

jlaehne commented 1 year ago

I have included support for any axis (uniform and non-uniform).

I'll try to find some time to test it on some data the next days.

jlaehne commented 1 year ago

Thanks @jordiferrero for this great work. Just tried it on a sample dataset, which is as if it was made for this functionality. And it works both for wavelength and energy axis.

When the Jacobian is deactivated, the results in wavelength and energy range agree, as they should. When the Jacobian is activated, there is a notable shift of the value in energy (when back-converted to wavelength). As for fitting, these kind of routines should be performed in the energy scale! And we should add a note to the user guide (by the way, we still have to write the corresponding paragraph in the fitting chapter of the user guide, it is still a todo-note).

Apart from this note and the minor comments above (mainly about formatting in rest format and docstrings), this PR is ready to be merged in my view. I'm happy to do so once the last things are done.

However, being at it we should also create a corresponding maximum function that finds the maximum of a peak - e.g. by taking the first derivative and then finding zero-crossings.

ericpre commented 1 year ago

In HyperSpy, there is a find_peak_ohaver and I think that it would be useful to merge it with the function of this PR into a generic find_peak(), where method could be center_of_mass or ohaver?

I am well aware of find_peaks_ohaver(). I don't think centroid and o_haver are similar enough to be merged together. Centroid does not find peaks, it finds the center of mass of a peak, once you already know it exists... In fact, an implementation I wanted to add in the future is to be able to get centroids for spectra that have more than one peak (currently, you need to crop the signal so only one peak exists to get something physical). For this future implementation, peak_o_haver will be used to find the main peaks, then centroids for each peak will be calculated. That's why I don't think these 2 functions should be merged. Also, centroid analysis is very relevant for luminescence studies, that's why I'd prefer it to be in the LumiSpy package.

I think both approaches are valid and subjective and in this kind of situation, I would favour the approach which provide the simplest and more consistent API. Once thing that I find inconvenient is to have functions or methods which does similar things because it makes the namespace more busy and quite often it is not obvious to figure out what are the difference between them.

jlaehne commented 1 year ago

I think both approaches are valid and subjective and in this kind of situation, I would favour the approach which provide the simplest and more consistent API. Once thing that I find inconvenient is to have functions or methods which does similar things because it makes the namespace more busy and quite often it is not obvious to figure out what are the difference between them.

I am not so sure about this.

  1. find_peaks1D_ohaver would have to be renamed as the centroid functionality definitely is something different than an OHaver routine.

  2. Both functionalities would take quite different parameters, which makes the docstring much more complicated.

So I tend to opt for the separate implementation.

Edit: One could call it find_peaks1D_centroid though to stress a certain similarity.

ericpre commented 1 year ago

I am fine with both approaches - I am just being annoying to keep namespace and API tidy! :)

jordiferrero commented 1 year ago

I have now implemented the small formatting you requested. I have also moved the documentation to the signals page. I think if you prefer to call this function find_peaks1D_centroid instead of centroid, I am happy to change it to that. I find that a bit misleading as to me this method is not finding peaks... it finds mass density, which does not necessarily relate to peaks but shoulders in emission and asymmetry detection... Let me know what you think, else I am happy for you to do the last review.

jlaehne commented 1 year ago

Thanks @jordiferrero for this nice feature.

I corrected a little more formatting, mainly in the documentation before merging.