PatBall1 / detectree2

Python package for automatic tree crown delineation based on the Detectron2 implementation of Mask R-CNN
https://patball1.github.io/detectree2/
MIT License
158 stars 39 forks

Environment, dependency management and packaging repo. #1

Closed by ma595 10 months ago

ma595 commented 2 years ago

We need to choose an environment management tool (virtualenv, conda, pipenv), a package dependency resolver (conda, pipenv, poetry), and a package repository (PyPI, Anaconda, etc.).

Currently, pip install git+https://github.com/PatBall1/detectree2.git does not work on clusters or other systems without the GDAL headers installed, because the pip install GDAL step fails. We can install the GDAL headers using a package manager (apt, yum, etc.) or load a module on clusters (module load GDAL), but there is no guarantee that the headers end up in the same location on every system. A typical Ubuntu install looks like:

sudo add-apt-repository ppa:ubuntugis/ppa
sudo apt update
sudo apt install libgdal-dev
sudo apt install gdal-bin
export CPLUS_INCLUDE_PATH=/usr/include/gdal
export C_INCLUDE_PATH=/usr/include/gdal
pip install GDAL==$(gdal-config --version)

This requires users to specify the GDAL header location manually. It might, however, be possible to build the development headers from source and include them as part of the pip build process.
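One way to avoid hard-coding /usr/include/gdal is to ask gdal-config (installed alongside the GDAL development files) where the headers are. The following is only a sketch under that assumption and is not part of detectree2: the helper names include_dirs_from_cflags and gdal_build_env are hypothetical, and it assumes gdal-config is on PATH, for example after module load GDAL.

```python
import os
import shlex
import shutil
import subprocess


def include_dirs_from_cflags(cflags):
    """Extract include directories from a flag string such as '-I/usr/include/gdal'."""
    tokens = shlex.split(cflags)
    dirs = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "-I" and i + 1 < len(tokens):
            # Form with a space: "-I /path"
            dirs.append(tokens[i + 1])
            i += 2
            continue
        if tok.startswith("-I") and len(tok) > 2:
            # Form without a space: "-I/path"
            dirs.append(tok[2:])
        i += 1
    return dirs


def gdal_build_env(base_env=None):
    """Return (env, gdal_version) for building the GDAL bindings, or None if gdal-config is missing."""
    gdal_config = shutil.which("gdal-config")
    if gdal_config is None:
        return None

    def query(flag):
        result = subprocess.run(
            [gdal_config, flag], capture_output=True, text=True, check=True
        )
        return result.stdout.strip()

    env = dict(os.environ if base_env is None else base_env)
    include_path = os.pathsep.join(include_dirs_from_cflags(query("--cflags")))
    if include_path:
        # Same variables as the manual exports above, but discovered at run time.
        env["CPLUS_INCLUDE_PATH"] = include_path
        env["C_INCLUDE_PATH"] = include_path
    return env, query("--version")
```

The returned environment and version could then be passed to a pip subprocess that installs the GDAL bindings pinned to the system library version. If gdal_build_env() returns None, the headers are probably not installed and the user still has to install or load GDAL first.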

Alternatively, we can obtain all detectree2 dependencies with Conda (detectron2, GDAL, OpenCV, etc.) and then publish the package to conda-forge. The environment is specified in an environment.yml file. A conda-lock file makes the environment easily reproducible by pinning all dependencies, including transitive ones. Packaging is done using conda-build (meta.yaml file), e.g. via https://github.com/conda-forge/staged-recipes, with a conda-build tutorial at https://docs.conda.io/projects/conda-build/en/latest/user-guide/tutorials/building-conda-packages.html. The distribution (archived package, .tar.bz2 or .conda) will sit on conda-forge. More on the merits of using Conda here: https://pythonspeed.com/articles/conda-dependency-management/
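Whichever tool ends up building the environment, a quick post-install check can confirm that the compiled dependencies actually resolved. A minimal sketch using only the standard library is below. The helper missing_modules is hypothetical, and the names detectron2, osgeo (GDAL's Python bindings) and cv2 (OpenCV) are assumed to be the import names of the dependencies listed above.

```python
import importlib.util

# Import names assumed for the dependencies mentioned above.
REQUIRED_MODULES = ("detectron2", "osgeo", "cv2")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level modules that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling missing_modules() in a fresh environment returns an empty list when all three packages are importable, which makes it a cheap smoke test for CI or for users on a cluster.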

As another alternative, it is possible to combine Conda with Poetry as shown here: https://stackoverflow.com/questions/70851048/does-it-make-sense-to-use-conda-poetry

Poetry makes packaging easy, and dependency management (using pyproject.toml) is faster than with Conda: https://www.youtube.com/watch?v=QX_Nhu1zhlg&t=676s

Neaten packaging by either adopting Conda or investigating Poetry as an alternative.

ma595 commented 2 years ago

Poetry + detectron2 is not currently possible because of issues with detectron2: https://github.com/python-poetry/poetry/issues/3712#issuecomment-1125407343 https://github.com/python-poetry/poetry/issues/2113 We could install detectron2 with Conda and still use Poetry. Alternatively, it is mentioned at the bottom of this issue that most detectron2 features can be used without installing it: https://github.com/facebookresearch/detectron2/pull/4234

ma595 commented 2 years ago

Development dependencies are not yet addressed in environment.yml in matt/conda.

ma595 commented 2 years ago

We could install GDAL from source, but that relies on a C++11 compiler being available on the machine: https://gdal.org/download.html
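Before attempting a source build, an installer could check whether a suitable compiler exists at all. The sketch below is an illustration rather than an agreed approach: find_cxx_compiler and supports_cxx11 are hypothetical helpers, the candidate compiler names are assumptions, and the -std=c++11 and -fsyntax-only flags are the GCC/Clang spellings.

```python
import os
import shutil
import subprocess
import tempfile

# Tiny program that uses C++11-only features (auto, nullptr).
CXX11_SNIPPET = "int main() { auto p = nullptr; (void)p; return 0; }\n"


def find_cxx_compiler(candidates=("c++", "g++", "clang++")):
    """Return the path of the first candidate compiler found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def supports_cxx11(compiler):
    """Return True if the compiler accepts a small C++11 program."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "check.cpp")
        with open(src, "w") as handle:
            handle.write(CXX11_SNIPPET)
        # Syntax check only, so nothing is linked or written besides the source file.
        result = subprocess.run(
            [compiler, "-std=c++11", "-fsyntax-only", src],
            capture_output=True,
        )
        return result.returncode == 0
```

If find_cxx_compiler() returns None or supports_cxx11() returns False, falling back to a prebuilt GDAL (for example from Conda) would avoid a build failure partway through installation.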