simphony / simphony-framework

A meta-repository to simplify setup of the SimPhoNy toolset.
BSD 2-Clause "Simplified" License

Dependency definition issues #15

Open mehdisadeghi opened 9 years ago

mehdisadeghi commented 9 years ago

I have observed that dependencies are defined implicitly throughout the SimPhoNy project. Using >= instead of ==, and not putting the full output of pip freeze in the requirements.txt file, will be problematic in the future.

I suggest defining direct and indirect dependencies explicitly in requirements.txt. This will let pip install the exact versions that we know the application works with. Moreover, for reusable components we can use a dependency range inside the setup.py files. This will allow a component to support minor dependency upgrades, while major updates would need dedicated attention from developers and an update on our side.
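As a rough illustration of what "explicit" means here, the following sketch lists the entries of a requirements file that are not pinned with ==. The function name `unpinned_requirements` and the sample file contents (including the mayavi version) are hypothetical and only serve the example; this is not part of pip or of the SimPhoNy tooling.

```python
def unpinned_requirements(requirements_text):
    """Return the requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Hypothetical requirements.txt contents, for illustration only.
example = """\
numpy==1.9.1
mayavi>=4.4.0
simphony-common
# development tools
flake8
"""

print(unpinned_requirements(example))
```

A check like this could run in continuous integration for application-style repositories, where full pinning is the goal. It would be too strict for library setup.py ranges.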

Since this is effectively the top-level project, I am registering this issue here.

itziakos commented 9 years ago

I have observed that dependencies are defined implicitly throughout the SimPhoNy project. Using >= instead of ==, and not putting the full output of pip freeze in the requirements.txt file, will be problematic in the future.

Using == might be a safe solution (and it is usually the case for application development), but it is not always the best for libraries. It does not make sense, for example, to set a version for flake8 or mock. In general it is at the developer's discretion to decide when it makes sense to use version selection logic. For example, I think that having a specific version for simphony-common is a good choice, but for mayavi the API does not change that much. I personally prefer to have >= to make it easier to test with new versions.

pip freeze in general lists more than the actual (first-level) dependencies (including my personal development tools on Windows), so I do not advise people to use pip freeze for more than a starting point.

Please also note that a requirements file is not the official way to install Python packages. The -r option is a quick and nice way to set up an environment for a specific use (e.g. documentation build, testing, development on Windows, Python 2.6 testing). Strictly speaking, one needs a separate requirements.txt for each platform (especially if one uses freeze to create the requirements).

Having a properly defined setup.py and up-to-date README documentation is the right way to go. All first-level dependencies should be mentioned there, and optional dependencies can also be included (see setup.py for simphony-common). Again, my preference there is to use >= unless I know of a case where a new version does not work.

These are my rules of thumb. But in general, version selection is a big subject, and without specific items to discuss one cannot make informed choices.

Since this is effectively the top-level project, I am registering this issue here.

I think that for version issues it is better to point to specific problems in the related package(s).

itziakos commented 9 years ago

Finally, as long as we use continuous integration in a sensible way (see for example https://github.com/enthought/traits-enaml/blob/master/.travis.yml, where we run tests using old, released and future dependency versions), we can quickly get a sneak peek at possible future problems and act accordingly.

mehdisadeghi commented 9 years ago

Using >= makes it easier for developers to install and work with packages, especially during the development phase. But later, when we move the releases to PyPI (I assume we are going to do this), having explicit versions will ensure that a user gets the correct dependencies and that pip is able to download and install them correctly. Packages such as flake8 and similar ones are required for development but not for end users, so they are not important here. For other simphony packages and third-party packages whose API we rely on, it is much better to define the exact version. That way we can make sure that in the future, when there may be multiple newer releases of the dependencies (either simphony or third-party), the package manager will be able to install a working combination of dependencies. There are some rare occasions where it is important to define second-level dependencies as well, which I don't find necessary for now. Last but not least, this project, like many others, will not be in active development forever, with dependency problems fixed as they happen. Software is subject to change, therefore I think it is much safer to define explicit dependencies, at least for major releases.

itziakos commented 9 years ago

Using >= makes it easier for developers to install and work with packages, especially during the development phase. But later, when we move the releases to PyPI (I assume we are going to do this), having explicit versions will ensure that a user gets the correct dependencies and that pip is able to download and install them correctly.

This is not exactly true. As I said, == is advised for applications deployed in a separate environment. However, in a library environment it does not help much. For example, pandas is a very common scientific package. Both simphony and pandas depend on numpy, but pandas tends to use the latest numpy for features and speed. If we set numpy==1.9.1 for simphony-common and release it on PyPI, it is likely that after a few months people will not be able to properly install simphony-common in a virtual environment that uses the latest pandas, which will probably require numpy>=1.9.2.

We agree, however, that we should always use an exact version for simphony-common in simphony-framework (see https://github.com/simphony/simphony-framework/blob/master/simphony_packages.txt).
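The conflict can be made concrete with a small sketch. It assumes a hypothetical list of available numpy releases and treats versions as plain dotted integers, which is a simplification of the real version rules; the pandas requirement is the speculative one from the paragraph above, not an actual pandas constraint.

```python
def parse(version):
    """Turn '1.9.2' into (1, 9, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical set of numpy releases available on the package index.
candidates = ["1.9.0", "1.9.1", "1.9.2", "1.10.0"]


def pinned_by_simphony(version):
    # simphony-common pins numpy==1.9.1
    return parse(version) == parse("1.9.1")


def required_by_pandas(version):
    # a later pandas asks for numpy>=1.9.2
    return parse(version) >= parse("1.9.2")


# With the exact pin, no release satisfies both packages.
with_pin = [v for v in candidates
            if pinned_by_simphony(v) and required_by_pandas(v)]

# With numpy>=1.9.1 instead of the pin, a shared choice exists.
with_range = [v for v in candidates
              if parse(v) >= parse("1.9.1") and required_by_pandas(v)]

print(with_pin)
print(with_range)
```

The first list is empty, so the two packages cannot share an environment; relaxing the pin to a lower bound leaves the newer releases available.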

Packages such as flake8 and similar ones are required for development but not for end users, so they are not important here.

We agree on that.

There are some rare occasions where it is important to define second-level dependencies as well, which I don't find necessary for now.

We also agree on that (that is why pip freeze is not advised).

Last but not least, this project, like many others, will not be in active development forever, with dependency problems fixed as they happen.

Having == for all dependencies in a lot of cases actually hinders use of the library (see examples above). One needs to exercise common sense and make the best of testing tools. If one needs to be on the conservative side, one could use package>=1.4.4,<2, because one might expect that the API will not be backwards compatible when the major version changes.
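To show which versions a range such as >=1.4.4,<2 admits, here is a minimal sketch of a specifier check. It handles only plain dotted-integer versions (no pre-releases, wildcards or ~= operator), and the helper names `as_tuple` and `satisfies` are made up for this example; pip's real matching follows PEP 440 and is more involved.

```python
import operator

# Comparison operators supported by this simplified checker.
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def as_tuple(version, width=3):
    """Turn '2' into (2, 0, 0) so '2' and '2.0.0' compare as equal."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version, specifier):
    """Check a version against a comma-separated specifier like '>=1.4.4,<2'."""
    for clause in specifier.split(","):
        clause = clause.strip()
        # Try two-character operators before one-character ones.
        for symbol in (">=", "<=", "==", "!=", ">", "<"):
            if clause.startswith(symbol):
                bound = clause[len(symbol):].strip()
                if not OPERATORS[symbol](as_tuple(version), as_tuple(bound)):
                    return False
                break
        else:
            raise ValueError("unsupported clause: %r" % clause)
    return True


for candidate in ["1.4.3", "1.4.4", "1.9.0", "2.0.0"]:
    print(candidate, satisfies(candidate, ">=1.4.4,<2"))
```

Only 1.4.4 and 1.9.0 pass: bug-fix and minor releases within the 1.x series are accepted, while the next major version is excluded until someone has tested against it.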

Software is subject to change, therefore I think it is much safer to define explicit dependencies, at least for major releases.

This is problematic. It is confusing for a package at version 1.2.3.dev23 to require requirement>=0.8 and for the release version 1.2.3 to require requirement==0.8.

mehdisadeghi commented 8 years ago

Today I came across a good read on the dependency definition problem in Python projects. In that article, Kenneth Reitz suggests a simple workflow: keep two requirements files for each project, one with explicit version numbers and one without them.

As a result, requirements-to-freeze.txt will contain only package names, without version information, and requirements.txt will contain the output of pip freeze, taken after a fresh install in a virtual environment of course.
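The two-file workflow could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, it expects to be run with the interpreter of a freshly created virtual environment, and it uses only the standard `pip install -r` and `pip freeze` commands.

```python
import subprocess
import sys


def freeze_workflow_commands(python=sys.executable,
                             loose_file="requirements-to-freeze.txt"):
    """Build the two pip invocations used by the workflow."""
    # Step 1: install the loosely specified, first-level dependencies.
    install = [python, "-m", "pip", "install", "-r", loose_file]
    # Step 2: list everything that ended up installed, with exact versions.
    freeze = [python, "-m", "pip", "freeze"]
    return install, freeze


def refresh_pins(loose_file="requirements-to-freeze.txt",
                 pinned_file="requirements.txt"):
    """Install from the loose file, then write the frozen result."""
    install, freeze = freeze_workflow_commands(loose_file=loose_file)
    subprocess.run(install, check=True)
    frozen = subprocess.run(freeze, check=True,
                            capture_output=True, text=True).stdout
    with open(pinned_file, "w") as handle:
        handle.write(frozen)


# Usage (inside a fresh virtual environment): refresh_pins()
```

Because the frozen file records whatever is installed, running this outside a clean environment would also pin unrelated tools, which is the concern raised earlier in the thread about pip freeze.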

stefanoborini commented 8 years ago

Final decision on this topic?