AlexanderFabisch / gmr

Gaussian Mixture Regression
https://alexanderfabisch.github.io/gmr/
BSD 3-Clause "New" or "Revised" License
168 stars 49 forks source link

Docstring confusion in mvn.py #44

Closed erigler-usgs closed 2 years ago

erigler-usgs commented 2 years ago

It's possible my thinking is backward, but I've long considered regression from an "imputation of missing values" point of view (e.g., Schneider, 2001; see Sec. 2). Given that, the docstrings for module functions mvn.condition():

https://github.com/AlexanderFabisch/gmr/blob/53185ff92426db14d4e640ee188c2cda679aec03/gmr/mvn.py#L524-L528

...and mvn.regression_coefficients():

https://github.com/AlexanderFabisch/gmr/blob/53185ff92426db14d4e640ee188c2cda679aec03/gmr/mvn.py#L489-L493

...seem confusing to me because you refer to the "input feature" as the feature I would associate with the missing data (dependent variables), and the "output feature" as the feature I would associate with the available data (independent variables).

What's more, it appears you might, at least implicitly, share my view since, when you call mvn.condition() from the MVN.predict() and MVN.condition() methods, you "invert" the indices.

AlexanderFabisch commented 2 years ago

Hi @erigler-usgs ,

I agree, the docstring does not match the implementation:

https://github.com/AlexanderFabisch/gmr/blob/53185ff92426db14d4e640ee188c2cda679aec03/gmr/mvn.py#L546-L548

The shapes of the output will be (n_features1,) and (n_features1, n_features1). Everything else seems to be consistent with your view.

If you are interested you can open a pull request to correct this. Otherwise I will do it.

erigler-usgs commented 2 years ago

Thanks for responding so quickly. I have too many forked git repos floating around my laptop these days, so if you don't mind making the change, I would be grateful. Thanks for maintaining such a useful package; maybe one day I'll be able to contribute something a little more substantial to it.