ocean-data-factory-sweden / kso

Notebooks to upload/download marine footage, connect to a citizen science project, train machine learning models and publish marine biological observations.
GNU General Public License v3.0

Issues installing the right requirements #408

Closed victor-wildlife closed 2 weeks ago

victor-wildlife commented 1 month ago

Installing from requirements.txt doesn't work as expected. Somehow the installer checks against the requirements of "kso-utils 0.2.7" instead.

I came across this error when trying to download the CSV files for the template project in Google Colab. The download fails because the gdown version installed after running the first cell is 4.6.4, whereas "requirements_colab.txt" specifies gdown 5.1.0.
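For anyone trying to confirm the same mismatch, a minimal sketch along these lines can report which version is actually installed in the running kernel. The helper names `installed_version` and `check_pin` are made up for illustration and are not part of kso_utils; the only value taken from this thread is the 5.1.0 pin reported for requirements_colab.txt.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pin(package, pinned):
    """Describe whether the installed version matches the pinned one (hypothetical helper)."""
    found = installed_version(package)
    if found is None:
        return f"{package}: not installed (pinned {pinned})"
    if found != pinned:
        return f"{package}: installed {found}, pinned {pinned}"
    return f"{package}: OK ({found})"


# 5.1.0 is the gdown pin reported for requirements_colab.txt in this issue.
print(check_pin("gdown", "5.1.0"))
```

Running this right after the first notebook cell should show whether that cell left an older gdown in place.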

jannesgg commented 1 month ago

@victor-wildlife This is part of another issue: the notebook is using the packaged version of kso-utils instead of the local kso_utils folder. The best workaround for now is to run pip uninstall kso_utils, restart the kernel and re-run the first cell, making sure that the local version of kso_utils is used rather than the packaged one. The requirements for the package and for Colab do not currently match, which is likely the cause of these errors.
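One way to check which copy of kso_utils the kernel would pick up after the restart is to look at where Python resolves the module from. This is only a sketch: `module_location` and `is_packaged_install` are hypothetical helpers, and it assumes a pip-installed copy lives under a `site-packages` or `dist-packages` directory, which is typical but not guaranteed.

```python
import importlib.util
from pathlib import Path


def module_location(name):
    """Return the path a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.submodule_search_locations:  # the module is a package (a folder)
        return Path(next(iter(spec.submodule_search_locations)))
    return Path(spec.origin) if spec.origin else None


def is_packaged_install(name):
    """Guess whether `name` resolves to a pip-installed copy (assumption: site/dist-packages)."""
    location = module_location(name)
    if location is None:
        return False
    return any(part in ("site-packages", "dist-packages") for part in location.parts)


print("kso_utils resolves to:", module_location("kso_utils"))
print("looks like the packaged version:", is_packaged_install("kso_utils"))
```

If the second line prints True after uninstalling and restarting, the packaged version is still being found ahead of the local folder.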

victor-wildlife commented 4 weeks ago

I know we will always have issues with packages, versions and different environments, but I think we should try to reduce the number of requirement lists. At the moment we have requirements.txt, requirements_cdn.txt, requirements_colab.txt and the requirements in PyPI. I will reopen this issue as I have had a go at tidying up the first two cells of the notebooks to (potentially) use only the requirements in PyPI (branch "updated_reqs"). This way we could at least get rid of requirements_colab.txt?

jannesgg commented 4 weeks ago

@victor-wildlife Sounds good to me. The fewer requirements we have to actively maintain, the better. It could also be useful to bring the PyPI requirements and requirements.txt closer together since they should run in similar environments.

victor-wildlife commented 4 weeks ago

I agree that requirements.txt and the PyPI requirements should be the same. What would be the best way to combine them into a single list? I've found there are a few packages missing from the PyPI requirements. Would changes to requirements.txt or the PyPI requirements affect Cloudina or any other users that we are aware of? If not, I am happy to update requirements.txt and pyproject.toml on the "updated_reqs" branch and test that the right package/version combinations work on my local computer, Colab, NeSI and Cloudina (ensuring the first cell of the tutorials runs as expected).

jannesgg commented 3 weeks ago

@victor-wildlife

The PyPI requirements currently only affect Colab users, since that is where the kso_utils package is installed via pip. On Cloudina, locally, etc., the local "kso_utils" folder is used instead.

The requirements.txt file currently affects local users and the Dockerfile. This means that it would ultimately affect binder users as they would use this image, as well as our tests which run off of this image.

We should be careful that the PyPI requirements do not grow too long. Ideally they should stay lightweight so that the package installs quickly via pip, which can otherwise be very time-consuming.

I would say we should first try to match the versions of the packages which are found in both, and then look at the remaining packages and see if they are lightweight enough to just add to the PyPI requirements?
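As a rough sketch of that first step, the two lists could be compared for shared packages whose pins differ. The helpers `parse_pins` and `compare_pins` are hypothetical, the parser only understands plain `name==version` lines (no extras, ranges or environment markers), and the sample pins below are illustrative rather than the project's actual requirements. It also assumes the PyPI dependencies have been copied out of pyproject.toml as one entry per line.

```python
import re


def parse_pins(text):
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        match = re.fullmatch(r"([A-Za-z0-9._-]+)\s*==\s*([^\s;]+)", line)
        if match:
            name = match.group(1).lower().replace("_", "-")
            pins[name] = match.group(2)
    return pins


def compare_pins(first, second):
    """Return (mismatched shared pins, names only in first, names only in second)."""
    shared = sorted(set(first) & set(second))
    mismatched = {n: (first[n], second[n]) for n in shared if first[n] != second[n]}
    only_first = sorted(set(first) - set(second))
    only_second = sorted(set(second) - set(first))
    return mismatched, only_first, only_second


# Illustrative contents only, not the real files.
requirements_txt = """
gdown==5.1.0
pandas==1.5.3
example-heavy-package==1.0.0  # hypothetical package
"""
pypi_requirements = """
gdown==4.6.4
pandas==1.5.3
"""

mismatched, only_local, only_pypi = compare_pins(
    parse_pins(requirements_txt), parse_pins(pypi_requirements)
)
print("version mismatches:", mismatched)
print("only in requirements.txt:", only_local)
print("only in PyPI requirements:", only_pypi)
```

The "only in requirements.txt" list is then the set of remaining packages to review for whether they are lightweight enough to add to the PyPI requirements.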

Then testing would just involve making sure the "tests" run as normal when the image is built, that binder can start up without any issues and that it works locally. Does that sound reasonable?

victor-wildlife commented 3 weeks ago

@jannesgg in the updated_reqs branch I am: updating the requirements.txt for local installation, matching requirements.txt and PyPI requirements and updating the first couple of cells of each notebook.

Before I open a PR with all these changes to dev, I wanted to try changing only "pyproject.toml". I created #412 so that, if there are any issues, we can easily revert the changes to a single file. Let me know if there are any issues.