nubank / fklearn

fklearn: Functional Machine Learning
Apache License 2.0

Improve fklearn build process for local testing/development #196

Open erickisos opened 2 years ago

erickisos commented 2 years ago

My Computer

Context

Hey folks, I recently tried to install fklearn on my personal computer and found that the main installation is not very straightforward. I will add my detailed steps, but it would be great to know what you think about migrating some libraries to improve this process. The main error that I'm still getting is this one:

error: legacy-install-failure

Instructions

At the last step I started getting the legacy-install-failure error with numpy, so I first tried installing numpy with conda:

conda install numpy

Fortunately, conda installed all the missing dependencies (like the BLAS libraries). However, when I re-ran pip install -e ".[all]" I got the exact same error with numpy. After reviewing the trace, I realized that the versions differed: conda had installed numpy==1.22 (which seems to be valid according to the requirements file), but the pip process was trying to install numpy==1.18. It seems that version either didn't include the bdist/wheels needed, or pip was unable to build them because some of the other libraries (I'm not sure which ones) now use a pyproject.toml definition.

So the next step was to retry with that specific version using conda: conda install 'numpy=1.18', which got me past the problem with numpy. In the next iteration I hit the same error with scikit-learn, so the process was the same: check whether the expected version was in the conda-forge repo, then run the install command.
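
The mismatch described above (conda providing one version while pip tries to build another) can be checked before re-running pip. The following is a minimal sketch, not part of fklearn: the helper names `installed_version`, `matches_pin` and `report` are hypothetical, and the only pin shown is the numpy 1.18 that appeared in the trace. It uses only the Python standard library.

```python
import platform
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pin):
    """True when `installed` equals `pin` or extends it (1.18.5 matches 1.18)."""
    if installed is None:
        return False
    return installed == pin or installed.startswith(pin + ".")


def report(pins):
    """Map each package to (installed version, whether it satisfies the pin)."""
    result = {}
    for package, pin in pins.items():
        found = installed_version(package)
        result[package] = (found, matches_pin(found, pin))
    return result


if __name__ == "__main__":
    # Assumption: 1.18 is the numpy version pip tried to build in the trace.
    # Add other packages here as they show up in the error output.
    pins = {"numpy": "1.18"}
    # The architecture helps tell a native Apple Silicon interpreter
    # apart from one running under Rosetta.
    print("machine:", platform.machine())
    for package, (found, ok) in report(pins).items():
        status = "ok" if ok else "mismatch or missing"
        print(f"{package}: installed={found} pinned={pins[package]} -> {status}")
```

Any package reported as a mismatch is a candidate for installing with conda at the pinned version before running pip again.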

Expected behavior

It would be great if at some point we could just run `pip install -e ".[all]"` to start testing the library locally. Maybe this is a problem specific to my computer, but I want to be sure that it is not a problem for someone else. Maybe the M1 chip is not supported yet, and maybe we are in the process of starting to support that chip (without using Rosetta?).

Possible solutions

From what I saw, some libraries can be installed directly with conda without any problem, so in my understanding we could find a way to do that directly, without having to deal with extra iterations for the compiled packages that we use.

Thanks in advance!

erickisos commented 2 years ago

After checking the full process, the main libraries with issues were:

Given that information, it seems like this is related to #136 and maybe #135