NCATSTranslator / reasoner-validator

Validation of Translator OpenAPI (TRAPI) messages both to TRAPI and Biolink Model standards. See https://ncatstranslator.github.io/reasoner-validator/

Migrate from poetry to hatch #54

Closed vemonet closed 11 months ago

vemonet commented 1 year ago

Hi @cmungall @sierra-moxon @kennethmorton, I discussed with @RichardBruskiewich about the current migration of the package to poetry and he told me to propose my changes here

I must admit that I am not a big fan of poetry when it comes to building pip packages... I only use it for applications and services (e.g. our Translator APIs)

A few months ago I researched and experimented with several "modern" solutions for Python packaging. There are so many different tools that I was getting a bit lost. I wanted to migrate from setup.py to pyproject.toml and have a modern workflow for development, scripts and environment handling.

⚠️ The issue with poetry

Like you, my first thought was to try poetry. The tool is popular and nice, but there are also a few caveats:

In the end, I currently only use poetry for applications and services (e.g. our Translator APIs), because the dependency management can be helpful, and poetry.lock is nice to make sure there is no surprise with dependencies changing silently.

🐣 The solution with hatch

Then I discovered hatch: https://hatch.pypa.io, which was everything I was looking for in a Python project management tool.

Hatch is a tool for managing Python projects. Even though it is relatively new, it is the official tool published under pypa, and it has already been adopted by a lot of popular Python packages (FastAPI, starlette, Pydantic, uvicorn to name a few).

A major advantage is that it fully complies with the packaging PEPs, so the project metadata is defined using the PEP 621 standard. This means that the metadata you define in the pyproject.toml is standard metadata that can be understood by most other tools in the Python ecosystem (pip, build...), instead of the "proprietary" [tool.poetry] metadata in the toml. As a result, Hatch does not require any extra tooling to install a project: only pip is required most of the time, which is really convenient for GitHub Actions and Dockerfiles, where you can simply do pip install .

For local development you can install and use the hatch CLI which will handle installation, virtual envs and scripts automatically for you (similarly to npm or poetry, but faster than poetry).

I have been converting most of my pip packages to use hatch now, and it is working really well!

The code is really well written, and it "does the smart thing" most of the time without bad surprises. For example, we don't need to add a weird include in the pyproject.toml.

I just made the changes required to use hatch instead of poetry in the reasoner-validator repository if you are interested: https://github.com/vemonet/reasoner-validator/tree/migrate-to-hatch

I updated the README.md with the instructions to run in development with hatch (and added instructions for running the tests).

Everything worked well for me (tests, docs, API and docker):

hatch run test
hatch run api
hatch run docs
docker-compose up

These script definitions can be found in the pyproject.toml, similar to scripts in npm's package.json.

sierra-moxon commented 1 year ago

Hi @vemonet - thank you for this! Hatch looks promising and worth considering (and you've identified some of the same things we've had to work out with poetry, particularly around environments in docker/actions). Conversely, I've had luck with pip and pyproject.toml in that pip typically works, but it could be that I'm not using this method enough to claim success.

In an effort to help debug env issues with a shared understanding of the pitfalls, we encouraged Richard to align efforts here with our team's use of poetry in general. As we all know, this can be very time-consuming (mac/linux vs. windows, poetry vs. pipenv vs. virtualenv vs. venv vs. hatch, not to mention package version management 🥴 -- it could be that hatch is the solution). But we should do what's best for the Translator dev team. It might be good to bring up Python best practices more widely in Translator to see if we can establish a consistent approach?

vemonet commented 1 year ago

Hi @sierra-moxon, I did not know how far along the overall move to poetry was, so I thought it would be interesting to share my experience.

But I fully agree that we need common tools and best practices for developing the various Translator-related tools, and once it is well configured, poetry is a good candidate.

RichardBruskiewich commented 1 year ago

@kennethmorton, @cbizon, @edeutsch, @putmantime, @cmungall - just adding you all to this particular PR simply because of its wider question about common Python dependency management tooling for Translator software. Just briefly skim it over and if you feel so inclined, share your thoughts about this.

Meanwhile, I'm kicking the can just slightly further down the road on this PR pending such comments, given the thoughtful work by @vemonet on this PR (Vincent, sorry that it has taken me so long to get back to this...)

kennethmorton commented 1 year ago

I have no experience with poetry or hatch so I will let others decide the best use case here.

vemonet commented 1 year ago

Hi @RichardBruskiewich, no problem. Personally, the main issues I have with poetry are the lack of compliance with the PEP standard metadata and the dependency resolution time, which can be quite long on projects with many dependencies (although it has a clearer error output than pip). That is probably not much of a problem in the Translator ecosystem.

On the other hand, poetry is convenient for managing multiple Python versions locally, whereas hatch requires you to install the different versions yourself.

If you have already migrated multiple projects to poetry and the poetry workflow works well, then there is no reason to change. If you are having issues with poetry (e.g. slow install time), then you might want to have a look at hatch.

RichardBruskiewich commented 1 year ago

Hi @vemonet, I don't have much jurisdiction over any other projects at the moment, and am just the most active maintainer/user (in SRI Testing) of the reasoner-validator. Given the recent popularity of poetry in various projects - Translator or otherwise - plus your (and Eric Deutsch's) recent feedback (before your hatch PR) about challenges with the original dependency management, I simply took the leap to poetry.

Once again, I wouldn't dismiss hatch totally out of hand. I could, perhaps, set up an experimental branch of this project which uses hatch, keeping it somewhat aligned with the master branch for a while, then have folks - especially within Translator and other related projects (e.g. Monarch) - kick the tires and try it out.

Then, at some point, perhaps in a Translator Relay session discussing broad software development standards (cc: @cbizon @putmantime), the various options could be formally reviewed.