BritishGeologicalSurvey / etlhelper

ETL Helper is a Python ETL library to simplify data transfer into and out of databases.
https://britishgeologicalsurvey.github.io/etlhelper/
GNU Lesser General Public License v3.0

Provide pyproject.toml for pip install #158

Closed ximenesuk closed 1 year ago

ximenesuk commented 1 year ago

When pip installing from a github branch this deprecation warning is given:

DEPRECATION: etlhelper is being installed using the legacy 'setup.py install' method, because it does not have a 'pyproject.toml' and the 'wheel' package is not installed. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at https://github.com/pypa/pip/issues/8559

To Do

volcan01010 commented 1 year ago

We should also include a minimal setup.py so that we can make an editable installation.

https://stackoverflow.com/a/62983901/3508733

leorudczenko commented 1 year ago

The pyproject.toml file should reflect the newest changes to setup.py from pull request #151

leorudczenko commented 1 year ago

I've been testing pyproject.toml on another project and have now added a toml file to this repository. Below is a description I've created of the new build process we can follow:

Package Distribution

Required Libraries

An additional optional dependency could be added to the toml file as pkg, which would include the distribution requirements. This currently has NOT been done, but can be added.

pip install .[pkg]

Build and Upload

First, you will need to create a new git release. You should only do this after all commits for the release have been made.

Once you have created a new release, ensure you pull the newest tag matching that release locally before building local distributions. _This is because setuptools_scm uses the latest git tag versioning to update the dynamic version in the pyproject.toml config file during the build._
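As a rough pre-build check, something like the following could confirm that the newest local tag matches the release you intend to build. This is only a sketch: the helper names are hypothetical, and allowing for a leading "v" on tags is my assumption rather than something this repository is known to use.

```python
import subprocess


def latest_git_tag(repo_dir="."):
    """Return the most recent tag reachable from HEAD in repo_dir."""
    result = subprocess.run(
        ["git", "describe", "--tags", "--abbrev=0"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. no tags have been fetched
    )
    return result.stdout.strip()


def normalise_tag(tag):
    """Drop a single leading 'v' (assumed tag style), leaving the version."""
    return tag[1:] if tag.startswith("v") else tag


def tag_matches_release(tag, release_version):
    """True if the tag corresponds to the release version being built."""
    return normalise_tag(tag) == release_version
```

Calling tag_matches_release(latest_git_tag(), "0.14.2") before python -m build would flag a stale checkout early.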

Then, you need to build the distribution files:

python -m build

Then you can upload the distribution of the library to PyPI:

twine upload dist/*

If you encounter the error File already exists, you most likely have files from an older version in your local dist/ directory. To work around this, run:

twine upload --skip-existing dist/*

You can also just specify the exact build files you wish to upload:

twine upload dist/etlhelper-0.14.2*
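If the release is scripted, the same file selection could be done in Python before calling twine. A minimal sketch, with hypothetical helper names; it only builds the command and does not run it.

```python
from pathlib import Path


def select_dist_files(dist_dir, name, version):
    """Return files in dist_dir whose names start with '<name>-<version>'.

    Like the shell glob dist/etlhelper-0.14.2*, this is a plain prefix
    match, so '0.14.2' would also match a hypothetical '0.14.20'.
    """
    prefix = f"{name}-{version}"
    return sorted(
        path
        for path in Path(dist_dir).iterdir()
        if path.is_file() and path.name.startswith(prefix)
    )


def twine_upload_command(files):
    """Build the argument list for 'twine upload' with explicit file paths."""
    return ["twine", "upload", *[str(path) for path in files]]
```

The resulting list can be passed to subprocess.run, which avoids relying on shell glob expansion.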

You will then be prompted to log in with your username and password. Alternatively, you can authenticate using an API token:

Enter your username: __token__
Enter your password: ******

Note: When using an API token, your username should be __token__ and your password should be your API token.
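For non-interactive uploads, twine also reads the TWINE_USERNAME and TWINE_PASSWORD environment variables, so the prompt can be skipped. A small sketch of preparing that environment; the helper name is hypothetical and the token value is a placeholder you would supply yourself.

```python
import os


def twine_token_env(api_token, base_env=None):
    """Return a copy of the environment set up for token-based twine auth."""
    env = dict(os.environ if base_env is None else base_env)
    env["TWINE_USERNAME"] = "__token__"  # literal username for API tokens
    env["TWINE_PASSWORD"] = api_token    # the API token itself
    return env
```

The returned mapping can be passed as env= to subprocess.run when invoking twine, keeping the token out of the command line.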

Extras

Authors

We could add multiple authors in the toml file so that it matches the README.md file. This currently has NOT been done, but can be added.

Requirements

requirements.txt may not be required once the toml file is included. See this discussion for detailed information: https://github.com/pypa/pip/issues/8049

Versioneer

versioneer is now not needed for the build stage. However, it is still currently used to set the __version__ attribute of the package in etlhelper/__init__.py.

We may be able to remove versioneer if desired; see below.

volcan01010 commented 1 year ago

We will need to update the Dockerfile to build using the new .toml file. While we are here, we should update to a Python 3.9 base container (slim bullseye as the bookworm container doesn't work with CentOS7) and update the internal CI files to use a more recent Docker-in-Docker.

leorudczenko commented 1 year ago

Regarding the current debate around using versioneer, I've found a post on Stack Overflow with a much simpler method for assigning __version__ in package distribution: https://stackoverflow.com/a/56984285

from pkg_resources import (
    get_distribution,
    DistributionNotFound,
)

try:
    __version__ = get_distribution(__name__).version
except DistributionNotFound:
    __version__ = "0.0.0"

I have tested this on my test package and it works just fine using the current git tags setup. It is also worth noting that pkg_resources comes from setuptools.

UPDATE:

From setuptools documentation: https://setuptools.pypa.io/en/latest/pkg_resources.html

"Use of pkg_resources is deprecated in favor of importlib.resources, importlib.metadata and their backports (importlib_resources, importlib_metadata). Some useful APIs are also provided by packaging (e.g. requirements and version parsing). Users should refrain from new usage of pkg_resources and should work to port to importlib-based solutions."

Following this, we can achieve a near identical result:

from importlib.metadata import (
    PackageNotFoundError,
    version,
)

try:
    __version__ = version(__name__)
except PackageNotFoundError:
    __version__ = "0.0.0"

Again, I have tested this method of setting a __version__ on my test package and it works just fine using the current git tags setup. It is also worth noting that importlib is part of the standard library.
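One thing to keep in mind is that importlib.metadata only joined the standard library in Python 3.8, so older interpreters would need the importlib_metadata backport named in the quote above. A hedged sketch of a variant that covers both cases; get_version is a hypothetical helper, not something in etlhelper, and it takes the distribution name explicitly because version() looks up the installed distribution name, which need not equal the module's __name__.

```python
try:
    from importlib.metadata import PackageNotFoundError, version
except ImportError:
    # Python < 3.8: use the backport, assuming it has been installed
    from importlib_metadata import PackageNotFoundError, version


def get_version(dist_name, default="0.0.0"):
    """Return the installed version of dist_name, or default if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Not installed, e.g. running from a plain source checkout
        return default
```

In etlhelper/__init__.py this would be used as __version__ = get_version("etlhelper").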

importlib does have some version differences:

volcan01010 commented 1 year ago

This has been done and merged into for_v1