scikit-learn-contrib / scikit-matter

A collection of scikit-learn compatible utilities that implement methods born out of the materials science and chemistry communities
https://scikit-matter.readthedocs.io/en/v0.2.0/
BSD 3-Clause "New" or "Revised" License
70 stars 18 forks source link

Redesign of DCH to separate interpolator and hull algorithm #147

Closed agoscinski closed 1 year ago

agoscinski commented 1 year ago

discussed with @victorprincipe a possible redesign of GCH in some future to have additional functionalities and more flexibility. At the moment it is just used scorer. It has hard to make a estimator out of the GCH, because to be consistent with the score function we would need to predict X_HD, but in the current fit function X_HD is part of the X input, so it would be an inconsistent in fit and predict function if we predict X_HD.

So we thought about separating the current DCH directional convex hull and the interpolator part into two separated classes and bringing them then together

class DCH(TransformerMixin, BaseEstimator) # maybe ClusterMixin
  def __init__(self, low_dim_idx)
  def fit(self, X_LD, y)
  def predict(self, X_LD) # -> y (linear interpolation on directional convex hull)
  def transform(self, X_LD) # -> X[self.selected_idx_] (directional convex hull vertices)

class Interpolator(BaseEstimator):
  def __init__(self, interpolator_type)
  def fit(self, X_LD, X_HD)
  def predict(self, X_LD) # -> X_HD

# notebook example
dch = DCH().fit(X_LD, y)
interp = Interpolator().fit(dch.transform(X_LD), X_HD)
interp.score(X_LD, X_HD)

I feel like the last part cannot be abstracted and has to be just a recipe in an example notebook because it is such a specific use case.

agoscinski commented 1 year ago

I think this is an abstraction that is not important for people who are interested the DCH. If we need such an abstraction we can revive this issues, but I don't see the necessity. This issue was just to keep track of some ideas in the conception phase of the DCH.