usnistgov / pyMCR

pyMCR: Multivariate Curve Resolution for Python
https://pages.nist.gov/pyMCR
Other
80 stars 27 forks source link

Sklearn-like functionality and HyperSpy compatibility #33

Closed CCampJr closed 2 years ago

CCampJr commented 4 years ago

This makes the API more similar to sklearn's as discussed in hyperspy/hyperspy#2172 (comment).

I have not yet address the comment below from @francisco-dlp in PR #32

Finally in this PR there are a couple of changes that have nothing to do with sklearn, but makes integration with hyperspy a lot easier. In particular, the automatic unfolding of the C and ST matrices and the internal call to np.asanyarray.

tjof2 commented 3 years ago

What's the status of this?

CCampJr commented 3 years ago

What's the status of this?

Just responded in the HyperSpy PR thread. Basically, I can get that pushed ASAP once you all are confident it will solve the problem with integration.

Thanks!

francisco-dlp commented 3 years ago

It looks good except for one small detail that currently raises an error:

>>> s.decomposition(algorithm=McrAR(fit_kwargs={'ST':s.get_bss_factors()}))                
INFO:hyperspy.learn.mva:Performing decomposition analysis
INFO:hyperspy.learn.mva:Undoing data pre-treatments
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-b37c44b9db40> in <module>()
----> 1 s.decomposition(algorithm=McrAR(fit_kwargs={'ST':s.get_bss_factors()}))

/home/francisco/Git/hyperspy/hyperspy/learn/mva.py in decomposition(self, normalize_poissonian_noise, algorithm, output_dimension, centre, auto_transpose, navigation_mask, signal_mask, var_array, var_func, reproject, return_info, print_info, svd_solver, copy, **kwargs)
    499             elif is_sklearn_like:
    500                 if hasattr(estim, "fit_transform"):
--> 501                     loadings = estim.fit_transform(data_)
    502                 elif hasattr(estim, "fit") and hasattr(estim, "transform"):
    503                     estim.fit(data_)

/home/francisco/Git/pyMCR/pymcr/mcr.py in fit_transform(self, D, **kwargs)
    572         """
    573 
--> 574         self.fit(D, **kwargs)
    575 
    576         return self.C_

/home/francisco/Git/pyMCR/pymcr/mcr.py in fit(self, D, C, ST, st_fix, c_fix, c_first, verbose, post_iter_fcn, post_half_fcn)
    392                 if self._ismin_err(err_temp):
    393                     self.C_opt_ = 1 * C_temp
--> 394                     self.ST_opt_ = 1 * self.ST_
    395                     self.n_iter_opt = num + 1
    396                     self.n_above_min = 0

TypeError: unsupported operand type(s) for *: 'int' and 'Signal1D'

There is an easy fix for this: hyperspy signals can be multiplied by numbers on the right but not on the left so, to fix it it would suffice to replace things like:

self.C_opt_ = 1 * C_temp

with

self.C_opt_ = C_temp * 1

Is that feasible @CCampJr ?

ericpre commented 2 years ago

@CCampJr, any change to address @francisco-dlp's comment above? Would it help to send a pull request to https://github.com/CCampJr/pyMCR/tree/sklearn_compat?

ericpre commented 2 years ago

Thanks @CCampJr for merging this PR. I have open https://github.com/usnistgov/pyMCR/pull/39 to fix the issue mentioned by @francisco-dlp above.