aboutcode-org / pip-requirements-parser

a mostly correct pip requirements parsing library
MIT License

Leading whitespace is lost during parsing #4

Open tetsuo-cpp opened 2 years ago

tetsuo-cpp commented 2 years ago

Thanks for building pip-requirements-parser! I'm really happy with it.

I noticed that if I parse a requirements file and then serialize it back with dumps(), any leading whitespace before a requirement or comment gets removed. Example below:

(env) tetsuo@Alexs-MacBook-Pro pip-audit % cat requirements.txt
    flask==0.5
(env) tetsuo@Alexs-MacBook-Pro pip-audit % python
Python 3.9.13 (main, May 24 2022, 21:13:51)
[Clang 13.1.6 (clang-1316.0.21.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from pip_requirements_parser import RequirementsFile
>>> rf = RequirementsFile.from_file("requirements.txt")
>>> rf.dumps()
'flask==0.5'

Whitespace within a line seems to be preserved, including the indentation of continuation lines. It is only whitespace at the very beginning of a requirement or comment that gets stripped.

(env) tetsuo@Alexs-MacBook-Pro pip-audit % cat base.txt
django-celery-results==2.3.1 \
    --hash=sha256:b8c9416619dbcc38f13398e31bcb1f14a228cd1e8f65fb22d3b7fc68aaa5331a \
    --hash=sha256:bf24ecc29c42e49cc7eb30b9b3739471331e2a0ca517cc88ca53a0cf3a2031d1

(env) tetsuo@Alexs-MacBook-Pro pip-audit % python
Python 3.9.13 (main, May 24 2022, 21:13:51)
[Clang 13.1.6 (clang-1316.0.21.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from pip_requirements_parser import RequirementsFile
>>> rf = RequirementsFile.from_file("base.txt")
>>> rf.dumps()
'django-celery-results==2.3.1 \\\n    --hash=sha256:b8c9416619dbcc38f13398e31bcb1f14a228cd1e8f65fb22d3b7fc68aaa5331a \\\n    --hash=sha256:bf24ecc29c42e49cc7eb30b9b3739471331e2a0ca517cc88ca53a0cf3a2031d1'

This matters to pip-audit because our use case involves parsing a requirements file and then rewriting it with some of the requirements modified. Any requirements file that has leading whitespace will therefore lose it when we rewrite the file.

The first example that comes to mind is indented automatic comments, like those inserted by pip-compile.
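To make the desired round-trip behaviour concrete, here is a rough sketch of the idea in plain Python. It does not use pip-requirements-parser at all: `split_indent`, `rewrite_preserving_indent` and `bump_flask` are hypothetical helpers invented for illustration, and the replacement version is arbitrary. The assumption is simply that each physical line's leading spaces or tabs can be recorded before the content is modified and then put back afterwards.

```python
import re
from typing import Callable, Tuple

# Matches the (possibly empty) run of spaces/tabs at the start of a line.
_LEADING_WS = re.compile(r"^[ \t]*")


def split_indent(line: str) -> Tuple[str, str]:
    """Return (leading whitespace, remainder) for a single line."""
    indent = _LEADING_WS.match(line).group(0)
    return indent, line[len(indent):]


def rewrite_preserving_indent(text: str, transform: Callable[[str], str]) -> str:
    """Apply `transform` to each line's content, keeping indent and line ending."""
    out = []
    for line in text.splitlines(keepends=True):
        body = line.rstrip("\r\n")      # line without its newline
        ending = line[len(body):]       # the newline itself, if any
        indent, content = split_indent(body)
        out.append(indent + transform(content) + ending)
    return "".join(out)


def bump_flask(content: str) -> str:
    """Hypothetical modification: replace one pinned requirement."""
    return "flask==1.0" if content == "flask==0.5" else content


original = "    flask==0.5\n    # via -r requirements.in\n"
print(rewrite_preserving_indent(original, bump_flask), end="")
```

This treats every physical line independently, so it knows nothing about continuation lines or requirement syntax. A real fix would presumably need the parser itself to carry the leading whitespace through to dumps().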

Are you interested in having pip-requirements-parser support this use case? If so, I'm happy to take a stab at a PR.

pombredanne commented 2 years ago

@tetsuo-cpp Thanks! Glad it can be of some help... yes, this is definitely worth having, and a PR would be mucho welcomed and swiftly merged!

pombredanne commented 2 years ago

Note that if tracking whitespace requires some surgery in the underlying data model, that's perfectly OK.