datakurre / pip2nix

Freeze pip-installable packages into Nix expressions [maintainer=@datakurre]
3 stars 1 forks source link

Issue with license parsing on 19.09 packages #9

Closed marcinkuzminski closed 4 years ago

marcinkuzminski commented 4 years ago

Using python27 packages and nix 19.09 channel i'm getting this error while running:

pip2nix generate --licenses

path is '/nix/store/gwpv96zzwhfkw6h32nc60iadh7b2dv88-atomicwrites-1.3.0-py2.py3-none-any.whl'
ERROR: Exception:
Traceback (most recent call last):
  File "/nix/store/w9dgl3rgq6zfdla660gba48c1gw97hj8-python2.7-pip-19.2.3/lib/python2.7/site-packages/pip/_internal/cli/base_command.py", line 188, in main
    status = self.run(options, args)
  File "/nix/store/76li24nidjl91xl6xmfm9wznfcscd0dn-python2.7-pip2nix-0.8.0.dev1/lib/python2.7/site-packages/pip2nix/generate.py", line 221, in run
    return requirement_set
  File "/nix/store/fpmw3j065ww8r4dwgsg852l5rw0laj45-python2.7-contexter-0.1.4/lib/python2.7/site-packages/contexter.py", line 96, in __exit__
    reraise(exc)
  File "/nix/store/76li24nidjl91xl6xmfm9wznfcscd0dn-python2.7-pip2nix-0.8.0.dev1/lib/python2.7/site-packages/pip2nix/generate.py", line 220, in run
    requirement_set = self.super_run(options, args)
  File "/nix/store/76li24nidjl91xl6xmfm9wznfcscd0dn-python2.7-pip2nix-0.8.0.dev1/lib/python2.7/site-packages/pip2nix/generate.py", line 207, in super_run
    resolver
  File "/nix/store/76li24nidjl91xl6xmfm9wznfcscd0dn-python2.7-pip2nix-0.8.0.dev1/lib/python2.7/site-packages/pip2nix/generate.py", line 143, in process_requirements
    key=attrgetter('name'))
  File "/nix/store/76li24nidjl91xl6xmfm9wznfcscd0dn-python2.7-pip2nix-0.8.0.dev1/lib/python2.7/site-packages/pip2nix/generate.py", line 142, in <genexpr>
    for pkg in sorted(packages.values(),
  File "/nix/store/76li24nidjl91xl6xmfm9wznfcscd0dn-python2.7-pip2nix-0.8.0.dev1/lib/python2.7/site-packages/pip2nix/models/package.py", line 240, in to_nix
    license_nix = self.get_license_nix()
  File "/nix/store/76li24nidjl91xl6xmfm9wznfcscd0dn-python2.7-pip2nix-0.8.0.dev1/lib/python2.7/site-packages/pip2nix/models/package.py", line 264, in get_license_nix
    licenses = self.get_licenses_from_pkginfo()
  File "/nix/store/76li24nidjl91xl6xmfm9wznfcscd0dn-python2.7-pip2nix-0.8.0.dev1/lib/python2.7/site-packages/pip2nix/models/package.py", line 279, in get_licenses_from_pkginfo
    data = self.pip_req.egg_info_data('PKG-INFO')
AttributeError: 'InstallRequirement' object has no attribute 'egg_info_data'
datakurre commented 4 years ago

@marcinkuzminski Thanks for this issue! I have not used al the features of pip2nix myself, so I was not really even aware of this feature. Will investigate and fix!

marcinkuzminski commented 4 years ago

Thanks for the quick reply! We're now updating our stack at RhodeCode (We use heavily nix+pip2nix)

We'll play around using latest version of nix, py38 and pip2nix.

I'll submit some more things if we find anything.

datakurre commented 4 years ago

So good to know! I couldn’t get into this today, but will try tomorrow again. I do have two guesses: 1) pip2nix relies heavily on pip private API that may have changed 2) I added support for wheels and made that the default to avoid issues with fragmented Python build and packaging tools when possible. Wheels can be disabled with a passthru argument —no-binary for pip. I believe that —no-binary :all: forces source dists for all packages.

marcinkuzminski commented 4 years ago

Yeah we played a bit on this. This diff made it work, however lots of licenses now are missing i guess this have to do with the wheel builds. We'll play with this more as your recomendation.

diff --git a/pip2nix/models/package.py b/pip2nix/models/package.py
index 301c231..011d140 100644
--- a/pip2nix/models/package.py
+++ b/pip2nix/models/package.py
@@ -7,6 +7,7 @@ from subprocess import check_output, STDOUT
 from operator import itemgetter
 from pip._vendor.packaging.requirements import Requirement
 from pip._internal.req.req_install import InstallRequirement
+from pip._internal.utils.packaging import get_metadata

 try:
     FileNotFoundError
@@ -273,12 +274,21 @@ class PythonPackage(object):

     def get_licenses_from_pkginfo(self):
         """
-        Parses the license string from PKG-INFO file.
+        Parses the license string from metadata of package
         """
         licenses = set()
-        data = self.pip_req.egg_info_data('PKG-INFO')
+        data_lines = []

-        for line in data.split('\n'):
+        try:
+            dist = self.pip_req.get_dist()
+        except Exception:
+            # This throws all sort of different exceptions
+            dist = None
+
+        if dist:
+            _meta = get_metadata(dist)
+            data_lines = _meta.as_string().splitlines()
+
+        for line in data_lines:

             # License string from setup() function.
             if line.startswith('License: '):
datakurre commented 4 years ago

@marcinkuzminski Here's a new pip2nix branch for you to test https://github.com/nix-community/pip2nix/tree/fix-licenses

I took a bit different route by relying on good old pkg_resources.

marcinkuzminski commented 4 years ago

Thanks! Let us run this against our code tests build, i'll let you know how it goes :)

marcinkuzminski commented 4 years ago

Getting back at this, seems the branch + pip2nix generate --licenses --no-binary :all: does the trick.

The only left issue is that with this it generates the checkInputs with a lot more dependencies, is this configurable somehow ?

datakurre commented 4 years ago

@marcinkuzminski Unfortunately, it seems that there is only flag to disable all indirect dependencies, but that would also disable setup_requires, which would cause issues with source distributions :(

https://github.com/datakurre/pip2nix/blob/master/pip2nix/generate.py#L102

I don't recall changing anything in that, except maybe fixing test_requires-support if it was broken and resulting in no test dependencies.

marcinkuzminski commented 4 years ago

Right, but doCheck flag is set to false by default, does it make sense to have test dependencies with that flag set to False ?

datakurre commented 4 years ago

@marcinkuzminski I cannot comment the original design, nor I have the right answer for this. We do know that in too packages one does not simply run tests and expect them to pass. Therefore False is sane default. On the other hand, if there's reason to enable tests for some critical package, it is far easier just to override the flag than figure out test dependencies from nothing :thinking: Unless there's sane way for per package configuration.

marcinkuzminski commented 4 years ago

Ok, this is a valid point... Maybe a cli flag to control that would be nice. I know this is surely a problem on our packages...

We really don't want the test packages in this case as it makes our build really big. We might just end up with a fork to handle that case then.

datakurre commented 4 years ago

@marcinkuzminski I would sure welcome a flag to completely disable test_requires/checkInputs. I'm just unable to implement that myself right now.