ExaScience / smurff

Bayesian Factorization with Side Information in C++ with Python wrapper
MIT License

centering_io python package #102

Closed: tvandera closed this issue 6 years ago

tvandera commented 6 years ago
ipasechnikov commented 6 years ago

Created the centering_io Python package from the center.py script. It is available in commit 2ce280efc7e146cb283a68b43926c226f7274d2b. I also added a few tests, and all of them currently pass.

Should we remove the code that writes the mean values from the mean function? https://github.com/ExaScience/smurff/blob/2ce280efc7e146cb283a68b43926c226f7274d2b/python/centering_io/centering_io/__init__.py#L50-L52

tvandera commented 6 years ago

Please add a function that can be called either as

( centered_and_scaled_m, mean_m, std_m ) = center_and_scale(m, 1)

or

mean_m = mean(m, 1)
std_m = std(m, 1)
centered_and_scaled_m = center_and_scale(m, 1, mean_m, std_m)
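To make the two call styles concrete, here is a minimal NumPy sketch of how they could be supported. It is not the code from the repository: it assumes dense input, that the second argument is an axis (as in NumPy), and that the population standard deviation is wanted. The optional `mean_m` / `std_m` parameters and the choice to always return a tuple are assumptions of this sketch.

```python
import numpy as np


def mean(m, axis):
    # Mean along `axis`; keepdims=True so the result broadcasts against m.
    return np.mean(m, axis=axis, keepdims=True)


def std(m, axis):
    # Population standard deviation (ddof=0) along `axis`.
    return np.std(m, axis=axis, keepdims=True)


def center_and_scale(m, axis, mean_m=None, std_m=None):
    # Hypothetical signature: precomputed statistics are optional.
    m = np.asarray(m, dtype=float)
    if mean_m is None:
        mean_m = mean(m, axis)
    if std_m is None:
        std_m = std(m, axis)
    # No guard against std_m == 0 here; constant rows/columns would divide by zero.
    return (m - mean_m) / std_m, mean_m, std_m


m = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

# First style: let the function compute the statistics.
centered_and_scaled_m, mean_m, std_m = center_and_scale(m, 1)

# Second style: compute the statistics up front and pass them in.
mean_m2 = mean(m, 1)
std_m2 = std(m, 1)
centered_and_scaled_m2, _, _ = center_and_scale(m, 1, mean_m2, std_m2)
```

In this sketch the second style still returns the tuple, so the caller unpacks and ignores the statistics it already has. Returning only the matrix in that case would match the second snippet more literally, at the cost of a return type that depends on the arguments.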

ipasechnikov commented 6 years ago

Implemented the center_and_scale function with nearly the same signature as sklearn.preprocessing.scale. The only difference is the copy parameter, which our version does not have.

Our version of the function returns a tuple (centered_and_scaled_m, mean_m, std_m).

I'm not sure whether such an interface is a good idea, but it seems fine to me. Feel free to propose alternatives.
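For reference, the interface described above could look roughly like the sketch below. It is an illustration and not the committed code: the parameter names `axis`, `with_mean` and `with_std` are borrowed from sklearn.preprocessing.scale, the input is assumed to be dense, and treating a zero standard deviation as 1 when dividing is an assumption made here to avoid division by zero.

```python
import numpy as np


def center_and_scale(m, axis=0, with_mean=True, with_std=True):
    # np.array always copies, so the caller's matrix is never modified.
    out = np.array(m, dtype=float)

    mean_m = None
    std_m = None

    if with_mean:
        mean_m = out.mean(axis=axis, keepdims=True)
        out = out - mean_m

    if with_std:
        # Subtracting the mean does not change the standard deviation,
        # so computing it after centering gives the same value.
        std_m = out.std(axis=axis, keepdims=True)
        # Assumption of this sketch: constant rows/columns are left unscaled.
        safe_std = np.where(std_m == 0, 1.0, std_m)
        out = out / safe_std

    return out, mean_m, std_m


m = np.array([[1.0, 2.0],
              [3.0, 2.0],
              [5.0, 2.0]])

# Standardize each column (axis=0); the second column is constant.
centered_and_scaled_m, mean_m, std_m = center_and_scale(m)
```

Returning the statistics alongside the matrix lets the caller reuse them later, for example to apply the same transformation to a test matrix or to undo it on predictions.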