log2timeline / plaso

Super timeline all the things
https://plaso.readthedocs.io
Apache License 2.0
1.73k stars 351 forks source link

Support install_requires in setup.py #1115

Closed dc3-plaso closed 4 years ago

dc3-plaso commented 7 years ago

Can we uncoment the install_requires in setup.py? (https://github.com/log2timeline/plaso/blob/master/setup.py#L116)

All of the neccessary dependencies for plaso are available on pypi. I don't see why this needs to be disabled.

Currently, an external python project which requires plaso has to either manually clone and run install on the requirements.txt file or specify plaso's dependencies as its own depedencies. It will be useful to just be able to set "plaso" as a dependency in the external programs install_requires and requirements.txt.

joachimmetz commented 7 years ago

Can we uncoment the install_requires in setup.py?

I rather not; pypi is not the recommended installation process and basically it will create more grief on our end than advantage.

joachimmetz commented 7 years ago

Context: https://caremad.io/posts/2013/07/setup-vs-requirement/

dc3-plaso commented 7 years ago

Understandable, but I hope you reconsider in the future.

I have found installing plaso and its dependencies through pip to be the most pain free way of creating a development release. Especially when installing in an offline environment where I can take advantage of a PyPi mirror or use a wheelhouse.

In our environment for example where we have programs that have Plaso and DFVFS as dependencies having these buildable by a single call to pip makes deployment faster and safer. The only other options are to manually clone both repositories and then call pip against the requirements.txt for both repositories or to manually specify Plaso’s and DFVFS’s dependencies as its own. Both make deployment and maintenance more difficult.

It may be a good idea to follow what was suggested in the last section of the blog post you provided. Instead of having code that populates the requirements.txt file, you can enable install_requires and then simply have "-e ." in the requirements.txt file.

joachimmetz commented 7 years ago

pip makes deployment faster and safer.

How does it make it safer? It might reduce room for error on the user side, but will introduce a whole new set of errors e.g. pinning problems, bleeding edge pypi packages that are broken, module hijacks, build environment problems.

you can enable install_requires and then simply have "-e ." in the requirements.txt file.

Yes and no, adding the flag to requirements.txt, SG.

I think a solution that could work for plaso is to have an option that enables install_requires, for example --install-dependencies, when the option is specified. Largely to prevent inexperienced people from shooting themselves in both foots, to prevent having pypi versions conflict with system packages.

Installation and dependency management are 2 separate concerns.

joachimmetz commented 7 years ago

https://pip.pypa.io/en/stable/reference/pip_install/#requirements-file-format

joachimmetz commented 7 years ago

Before any changes I'll do some reading of python documentation. It seems that the pip documentation even has concerns with using setup.py and automatic dependency resolution.

From: https://pip.pypa.io/en/stable/reference/pip_install/#hashes-from-pypi (just above the heading)

To be safe, install your project using pip and --no-deps.
dc3-plaso commented 7 years ago

How does it make it safer? It might reduce room for error on the user side, but will introduce a whole new set of errors e.g. pinning problems, bleeding edge pypi packages that are broken, module hijacks, build environment problems.

Sorry, "safer" is definitely the wrong word. I meant easier and more maintainable. To go with the philosophy described throughout the internet, the install_requires is seen as "abstract" dependencies where future versions could very well break the installation or not install (I'm looking at you, yara-python!). Including version restrictions should only be done on what you know will break the project.

I think a solution that could work for plaso is to have an option that enables install_requires, for example --install-dependencies, when the option is specified. Largely to prevent inexperienced people from shooting themselves in both foots, to prevent having pypi versions conflict with system packages.

There is already a --no-deps flag that can be used during installation. Having the opposite default from all other python packages would only cause more headaches.

The way it is now, if someone runs pip install plaso they are already guaranteed to get a non-working version of plaso, because none of the dependencies have been installed. I would prefer them to get a version of plaso that has a high probability of working. The people using pip to install plaso would be other developers who know how to deal with broken dependencies and can even report these as issues.

The pip installation method should only be recommended to developers who are using plaso as a library for their own project. Regular users should stick with a compiled release.

Before any changes I'll do some reading of python documentation. It seems that the pip documentation even has concerns with using setup.py and automatic dependency resolution. From: https://pip.pypa.io/en/stable/reference/pip_install/#hashes-from-pypi (just above the heading) To be safe, install your project using pip and --no-deps.

This warning only applies when using the hash feature for additional security. When hash checking all the dependencies in the requirements.txt file, you wouldn't want any extra dependencies to slip through because they were included in install_requires, but not in requirements.txt. Since Plaso does not supply these hashes it isn’t an issue.


This isn’t a critical issue for us. We thought this would be an easy fix. If you still have reservations about this issue, then we can continue with our current process.

joachimmetz commented 7 years ago

The people using pip to install plaso would be other developers who know how to deal with broken dependencies and can even report these as issues. Regular users should stick with a compiled release.

If this were true we would have no issues reported about installation. In practice based on the issues we see people that want to install plaso will use all methods available until they have a version that "runs", which for Plaso does not mean a version that "works".

Documentation of any kind is read when it is too late.

This also applies to some of the distributors of forensic tools e.g. https://raw.githubusercontent.com/sans-dfir/sift-bootstrap/master/bootstrap.sh.

joachimmetz commented 7 years ago

For context: https://github.com/sans-dfir/sift/issues/128

joachimmetz commented 4 years ago

Closing issue, we'll not add install_requires for now.