pip-requirements-parser - a mostly correct pip requirements parsing library
================================================================================

Copyright (c) nexB Inc. and others.
Copyright (c) The pip developers (see AUTHORS.rst file)
SPDX-License-Identifier: MIT
Homepage: https://github.com/aboutcode-org/pip-requirements and https://www.aboutcode.org/

pip-requirements-parser is a mostly correct pip requirements parsing library ... because it uses pip's own code!

pip is the package installer for Python. It uses "requirements" text files that list the packages to install.

Per https://pip.pypa.io/en/stable/reference/requirements-file-format/ :

"The requirements file format is closely tied to a number of internal
details of pip (e.g., pip’s command line options). The basic format is
relatively stable and portable but the full syntax, as described here,
is only intended for consumption by pip, and other tools should take
that into account before using it for their own purposes."

And per https://pip.pypa.io/en/stable/user_guide/#using-pip-from-your-program :

"[..] pip is a command line program. While it is implemented in Python, and
so is available from your Python code via import pip, you must not use pip’s
internal APIs in this way."

"What this means in practice is that everything inside of pip is considered
an implementation detail. Even the fact that the import name is pip is
subject to change without notice. While we do try not to break things as
much as possible, all the internal APIs can change at any time, for any
reason. It also means that we generally won’t fix issues that are a result
of using pip in an unsupported way."

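Because of all this, pip requirements are notoriously difficult to parse correctly in all their diversity: the file format is closely tied to pip's command line options, and pip offers no supported API that other tools can rely on to parse it.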

This pip-requirements-parser Python library is yet another pip requirements file parser, but this time hopefully doing it correctly and as well as pip does, because it uses pip's own code.

The pip-requirements-parser library offers these key advantages:

As a result, it likely has the most comprehensive requirements parsing test suite around.

Usage
------------------

The entry point is the ``RequirementsFile`` object::

    >>> from pip_requirements_parser import RequirementsFile
    >>> rf = RequirementsFile.from_file("requirements.txt")

From there, you can dump to a dict::

    >>> rf.to_dict()

Or access the requirements (either InstallRequirement or EditableRequirement
objects)::

    >>> for req in rf.requirements:
    ...    print(req.to_dict())
    ...    print(req.dumps())

And the various other parsed elements such as options, comments, and invalid
lines that have a parsing error::

    >>> rf.options
    >>> rf.comment_lines
    >>> rf.invalid_lines

Each of these, as well as each of the ``requirements``, has a
"requirement_line" attribute with the original text.

Finally you can get a requirements file back as a string::

    >>> rf.dumps()
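
Putting the steps above together, a minimal end-to-end sketch might look like
the following. It assumes that pip-requirements-parser is installed and it uses
only the calls shown above. The ``summarize`` and ``demo`` helpers, the
temporary file, and its contents are hypothetical and exist only for
illustration.

```python
import tempfile
from pathlib import Path


def summarize(path):
    """Parse a requirements file and return (dumped requirements, dict form).

    Hypothetical helper for illustration; it only uses calls shown above.
    """
    # Imported lazily so the helper can be defined without the library installed.
    from pip_requirements_parser import RequirementsFile

    rf = RequirementsFile.from_file(str(path))
    # Each requirement can be dumped back to its text form.
    dumped = [req.dumps() for req in rf.requirements]
    return dumped, rf.to_dict()


def demo():
    """Write a small, made-up requirements file and summarize it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "requirements.txt"
        path.write_text("requests==2.31.0\nattrs>=21.0  # a comment\n")
        return summarize(path)
```

Calling ``demo()`` returns the dumped requirement lines and the dict form of
the whole file, without any network access.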

Alternative
------------------

There are several other parsers that either:

- Implement their own parsing and can therefore miss some subtle differences
- Or wrap and import pip as a library, working around the lack of pip API

None of these use the approach of reusing and forking the subset of pip that is
needed to parse requirements.  The ones that wrap pip require network access
like pip does. They potentially need updating each time there is a new pip
release. The ones that reimplement pip parsing may not support all pip
specifics.
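
To illustrate how a reimplemented parser can miss pip specifics, here is a
deliberately naive sketch. It is not the code of any parser listed here, and it
is not how pip or pip-requirements-parser works. The hypothetical
``naive_parse`` helper treats each physical line as one entry. It copes with
comments and option lines, but it mishandles a backslash line continuation,
which pip joins into a single logical line. The sample content, including the
placeholder hash value, is made up.

```python
def naive_parse(text):
    """Hypothetical naive parser: one physical line is one entry."""
    requirements, options = [], []
    for raw in text.splitlines():
        # Drop trailing comments (whitespace followed by "#").
        line = raw.split(" #", 1)[0].strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("-"):
            options.append(line)
        else:
            requirements.append(line)
    return requirements, options


# Made-up sample; the hash value is a placeholder, not a real digest.
SAMPLE = """\
# pinned dependencies
--index-url https://pypi.org/simple
requests==2.31.0  # HTTP client
-r base.txt
django>=4.2 \\
    --hash=sha256:0000
"""

requirements, options = naive_parse(SAMPLE)

# The continuation is not joined: the requirement keeps a stray backslash,
# and its per-requirement --hash option is detached into the options list.
print(requirements)
print(options)
```

In this sketch the ``--hash`` option ends up detached from the ``django``
requirement it belongs to, which is the kind of subtle difference a
reimplementation has to get right.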

Implement a new pip parser
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reuse and wrap pip's own parser
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


- requirementslib https://github.com/sarugaku/requirementslib uses pip-shims
  https://github.com/sarugaku/pip-shims which is a set of "shims" around each
  pip version in an attempt to offer an API to pip. Comes with 20+ dependencies.

- micropipenv https://github.com/thoth-station/micropipenv/blob/d0c37c1bf0aadf5149aebe2df0bf1cb12ded4c40/micropipenv.py#L53

- pip-tools https://github.com/jazzband/pip-tools/blob/9e1be05375104c56e07cdb0904e1b50b86f8b550/piptools/_compat/pip_compat.py