Closed roesgaard closed 1 year ago
Hello @roesgaard , Saw your comment on my PR and wanted to clarify why the simple approach of it. Also would like to provide some context I just noticed.
First of all, it seems that even tho the PR hasn't been accepted/merged, it's content has been included in the latest rmrm_selection==0.2.8 release on Pypi from last week: https://github.com/smazzanti/mrmr/commit/3ea01aac88628b678aa88f734524594b0edd63d7
Now regarding my PR, I only did include the polars requirement at global level, since it seems that the polars implementation is the one used by default, so if you test if out with the pandas code sample for example, you will see you'll obtain an error saying polars missing https://github.com/smazzanti/mrmr/commit/3ea01aac88628b678aa88f734524594b0edd63d7 (test with 0.2.7 or with 0.2.8 uninstalling polars) which is unexpected.
I do agree that providing extras, as you do in your PR is the right way to do it, but since the code is currently using polars as the default way of computing things, even with extras it should fail too if polars is not provided at global scope. To fix that the way the polars import is handled in the code (since it is included directly in the init.py) should be modified and the code should probably be rearranged managing the polars import inside the methods rather than in the top of the file.
Kind regards.
As a response to the PR from "jose-turintech:patch-2" I have written a PR that adds the different flavors of MRMR as extras (polars, pyspark, bigquery). This should allow for "pip install mrmr_selection[polars] if you need the polars flavor.
Furthermore, in this way you do not have to install the extra packages unless needed, avoiding bloating your environment.
(This is my first official PR so please verify everything. I could not run all tests since I haven't set up pyspark and bigquery.