benchopt / benchmark_lasso_path

Benchopt benchmark for Lasso path

feat!: add preprocessing step in get_data #18

Closed jolars closed 2 years ago

jolars commented 2 years ago

This PR adds a data preprocessing step for all (current) data sets. Sparse datasets are scaled by the maximum absolute value, and dense datasets are centered by the mean and scaled by the standard deviation.
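
For illustration, here is a minimal sketch of the preprocessing described above. It is not the code from this PR: the helper name `preprocess_data` is hypothetical, and per-feature (column-wise) scaling is an assumption, since the description does not say whether the statistics are computed per column or globally.

```python
import numpy as np
from scipy import sparse


def preprocess_data(X):
    """Hypothetical helper: scale sparse X, standardize dense X.

    Assumes per-column statistics; this is an illustration only.
    """
    if sparse.issparse(X):
        X = sparse.csc_matrix(X, dtype=float)
        # Column-wise maximum absolute value. Dividing by it keeps zeros
        # at zero, so the sparsity pattern is preserved.
        max_abs = abs(X).max(axis=0).toarray().ravel()
        max_abs[max_abs == 0] = 1.0  # leave all-zero columns untouched
        return (X @ sparse.diags(1.0 / max_abs)).tocsc()

    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    # Centering is only done for dense data: it would destroy sparsity.
    return (X - mean) / std
```

Centering is skipped in the sparse branch because subtracting the mean would turn most zeros into nonzeros, which is presumably why the two kinds of data are treated differently.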

jolars commented 2 years ago

Nice! I think we should document this use case in the benchopt doc. Maybe we could start a sort of "gallery" where we put pointers to simple code that uses a certain feature, such as this one for the utils folder?

That's not a bad idea. Maybe this doesn't quite fit the bill, but I guess we could consider having a contrib folder in the repo too?

Why do you put this in the object? If it is not necessary, I would advise against it, so it does not mess with hashing and serialization (it should not, but we are never too careful :) )

I don't know actually... I copied it from somewhere else :) I'll fix it!

Thanks for the review(s)!

tomMoral commented 2 years ago

I'm not sure what you mean by a contrib folder?

jolars commented 2 years ago

Well, in my experience the contrib folder is a standard way of including useful code that's been contributed by someone but isn't deemed important enough to include as main functionality in the package (and is not maintained by the project). That seems to apply quite well here, no?