Open johnmhoran opened 8 months ago
@TG1999 @keshav-space Yesterday I installed requests in my local repo fork of packageurl-python so I could explore getting download_url data from the pypi API (and I am able to do that now). If I run pip list in the command line for that local repo, I get:
$ pip list
Package Version
------------------ --------
certifi 2024.2.2
charset-normalizer 3.3.2
idna 3.6
pip 24.0
requests 2.31.0
setuptools 69.1.0
urllib3 2.2.1
wheel 0.42.0
(venv) Mon Mar 11, 2024 10:30 AM /home/jmh/dev/nexb/packageurl-python jmh (143-add-purl2url-package-support)
$
However, when I run bin/py.test tests/contrib/test_purl2url.py -vvs I get the error ERROR tests/contrib/test_purl2url.py - ModuleNotFoundError: No module named 'requests'.
I am exploring the purl2url work from my local "sandbox", which is simply another repo from inside of which I've run pip install -e /home/jmh/dev/nexb/packageurl-python so I can access my changes in purl2url.py from that sandbox. However, inside my forked packageurl-python repo there is no requirements.txt, and its setup.cfg contains:
[options]
python_requires = >=3.7
packages = find:
package_dir = =src
include_package_data = true
zip_safe = false
install_requires =
but nothing is listed under install_requires.
I think I need to rerun make dev somehow in this local fork, perhaps after adding the requests library to setup.cfg or creating a requirements.txt containing requests. I'm a bit reluctant to do so without confirming with you, though, because I'm concerned that I might mess up my local packageurl-python fork. Do you have any suggestions?
In the packageurl-python fork's setup.cfg I added:
install_requires =
    requests == 2.31.0
and in /home/jmh/dev/nexb/packageurl-python I ran pip install -e . but when I reran bin/py.test tests/contrib/test_purl2url.py -vvs I again got ERROR tests/contrib/test_purl2url.py - ModuleNotFoundError: No module named 'requests'.
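One way to confirm what setup.cfg actually declares after an edit like this is to parse it with the standard-library configparser. The snippet below is a minimal sketch, not part of packageurl-python: read_install_requires and SAMPLE_SETUP_CFG are hypothetical names, and the sample text is a trimmed copy of the [options] section quoted above.

```python
import configparser

# Trimmed, illustrative copy of the [options] section discussed above.
SAMPLE_SETUP_CFG = """\
[options]
python_requires = >=3.7
install_requires =
    requests == 2.31.0
"""


def read_install_requires(cfg_text):
    """Return the requirement strings declared under [options] install_requires."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw = parser.get("options", "install_requires", fallback="")
    # The value is a multi-line string; keep only the non-empty lines.
    return [line.strip() for line in raw.splitlines() if line.strip()]


if __name__ == "__main__":
    print(read_install_requires(SAMPLE_SETUP_CFG))
```

This only shows what is declared. It says nothing about which environment the requirement was installed into, which is a separate question.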
The following session suggests to me that requests has been installed (and, BTW, so does my testing yesterday of this same packageurl-python repo/code from my sandbox repo):
(venv) Mon Mar 11, 2024 12:15 PM /home/jmh/dev/nexb/packageurl-python jmh (143-add-purl2url-package-support)
$ python
Python 3.8.10 (default, Nov 22 2023, 10:22:35)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> requests.get("https://pypi.org/pypi/fetchcode")
<Response [200]>
>>> requests.get("https://pypi.org/pypi/fetchcode/json")
<Response [200]>
>>> exit()
(venv) Mon Mar 11, 2024 12:15 PM /home/jmh/dev/nexb/packageurl-python jmh (143-add-purl2url-package-support)
$
Running pip install -e . after installing requests and updating setup.cfg did not fix the pytest no-module-found errors for requests in my forked packageurl-python repo, but make clean followed by make dev did. Now there are a few failing tests, but that's OK. I do wonder why pip install -e . was not sufficient to fix the pytest no-module-found error....
For the record, this was the full error from pytest:
(venv) Mon Mar 11, 2024 01:19 PM /home/jmh/dev/nexb/packageurl-python jmh (143-add-purl2url-package-support)
$ bin/py.test tests/contrib/test_purl2url.py -vvs
============================================================================================== test session starts ==============================================================================================
platform linux -- Python 3.8.10, pytest-7.4.4, pluggy-1.4.0 -- /home/jmh/dev/nexb/packageurl-python/bin/python
cachedir: .pytest_cache
rootdir: /home/jmh/dev/nexb/packageurl-python
configfile: setup.cfg
collected 0 items / 2 errors
==================================================================================================== ERRORS =====================================================================================================
________________________________________________________________________________ ERROR collecting tests/contrib/test_purl2url.py ________________________________________________________________________________
tests/contrib/test_purl2url.py:29: in <module>
from packageurl.contrib import purl2url
lib/python3.8/site-packages/_pytest/assertion/rewrite.py:186: in exec_module
exec(co, module.__dict__)
src/packageurl/contrib/purl2url.py:27: in <module>
import requests
E ModuleNotFoundError: No module named 'requests'
________________________________________________________________________________ ERROR collecting tests/contrib/test_purl2url.py ________________________________________________________________________________
ImportError while importing test module '/home/jmh/dev/nexb/packageurl-python/tests/contrib/test_purl2url.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
lib/python3.8/site-packages/_pytest/python.py:617: in _importtestmodule
mod = import_path(self.path, mode=importmode, root=self.config.rootpath)
lib/python3.8/site-packages/_pytest/pathlib.py:567: in import_path
importlib.import_module(module_name)
/usr/lib/python3.8/importlib/__init__.py:127: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1014: in _gcd_import
???
<frozen importlib._bootstrap>:991: in _find_and_load
???
<frozen importlib._bootstrap>:975: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:671: in _load_unlocked
???
lib/python3.8/site-packages/_pytest/assertion/rewrite.py:186: in exec_module
exec(co, module.__dict__)
tests/contrib/test_purl2url.py:29: in <module>
from packageurl.contrib import purl2url
lib/python3.8/site-packages/_pytest/assertion/rewrite.py:186: in exec_module
exec(co, module.__dict__)
src/packageurl/contrib/purl2url.py:27: in <module>
import requests
E ModuleNotFoundError: No module named 'requests'
============================================================================================ short test summary info ============================================================================================
ERROR tests/contrib/test_purl2url.py - ModuleNotFoundError: No module named 'requests'
ERROR tests/contrib/test_purl2url.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 2 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
=============================================================================================== 2 errors in 0.22s ===============================================================================================
(venv) Mon Mar 11, 2024 01:19 PM /home/jmh/dev/nexb/packageurl-python jmh (143-add-purl2url-package-support)
$
@johnmhoran Looking at your terminal prompt, this is what I think may be happening:
(venv) Mon Mar 11, 2024 01:19 PM /home/jmh/dev/nexb/packageurl-python jmh (143-add-purl2url-package-support)
$ bin/py.test tests/contrib/test_purl2url.py -vvs
I think you have installed packageurl-python and requests to the venv virtual environment, but you are running bin/py.test from the packageurl-python directory, where bin/py.test is handled by a different virtual environment than the one you're in. Try running py.test tests/contrib/test_purl2url.py -vvs to use the py.test handled by venv.
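A quick way to check the suggestion above is to ask each interpreter where it lives and whether it can see requests. The script below is a minimal sketch using only the standard library; describe_environment and the file name check_env.py are hypothetical, chosen for illustration.

```python
import importlib.util
import sys


def describe_environment(module_name="requests"):
    """Report which interpreter is running and whether it can find module_name."""
    # find_spec returns None when a top-level module is not importable.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "module": module_name,
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Saved as check_env.py, this could be run once with python check_env.py and once with bin/python check_env.py. If the two runs print different executable paths, or only one of them reports a location for requests, the two commands are using different environments. The pytest header in the output above already reports its interpreter as /home/jmh/dev/nexb/packageurl-python/bin/python, which can be compared against the first run.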
@JonoYang Running py.test tests/contrib/test_purl2url.py -vvs threw an error:
ImportError: No module named packageurl.contrib
But I think my running make clean and then make dev, as I noted above, was the fix for the requests-related ModuleNotFoundError. Running pip install -e . was evidently not enough to get requests loaded.
The 2 failing tests I now get are OK: they fail because I added the ability to actually get the pypi download_url for tar.gz downloads. However, I have several questions about what our goals are in these tests. One failing test expects a pypi .whl as the download; it seems that's just there to cover the case where pypi is not yet supported for downloads (as has been the case until now).
Do you have time to discuss?
@TG1999 @keshav-space @tdruez I can now get a download_url for pypi PURLs (though the code is not quite ready for prime time). Looking at the pypi JSON structure/content I get from requests.get() and at our current tests, if the few JSON examples I've seen are representative, we can retrieve either a .whl (using "packagetype": "bdist_wheel") or a .tar.gz (using "packagetype": "sdist"). I have drafted the pypi download_url function for now to:
- retrieve the basic pypi.org URL if no version is included with the PURL (e.g., https://pypi.org/project/aboutcode-toolkit/)
- retrieve a .tar.gz if a version is supplied with the PURL (e.g., https://files.pythonhosted.org/packages/6a/16/9191e46344d6a5e98afa74730340bc5f82f2c9ac7922ac4a16e58885a652/aboutcode-toolkit-3.4.0rc1.tar.gz); it appears in download_url and in the list for inferred_urls
- retrieve both a .tar.gz (appears in download_url and in the list for inferred_urls) and a .whl (appears in repo_download_url) if the PURL includes a ?download_url= qualifier seeking a .whl
- retrieve just a .tar.gz if the PURL includes a ?download_url= qualifier seeking a .tar.gz (appears in download_url, in the list for inferred_urls and in repo_download_url)
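The first two cases in the list above can be sketched as a small pure function over the file entries that the pypi JSON returns. This is an illustration only, not the drafted purl2url.py code: pick_file_url, pypi_download_url and SAMPLE_FILES are hypothetical names, the sample entries use placeholder URLs rather than real pypi data, and the sketch assumes each entry carries "packagetype" and "url" keys as in the JSON described above.

```python
# Placeholder entries shaped like the per-file records in the pypi JSON.
# The URLs are made up for illustration and do not point at real files.
SAMPLE_FILES = [
    {
        "packagetype": "bdist_wheel",
        "url": "https://files.example.invalid/example_pkg-1.0-py3-none-any.whl",
    },
    {
        "packagetype": "sdist",
        "url": "https://files.example.invalid/example-pkg-1.0.tar.gz",
    },
]


def pick_file_url(files, packagetype="sdist"):
    """Return the URL of the first entry with the given packagetype, or None."""
    for entry in files:
        if entry.get("packagetype") == packagetype:
            return entry.get("url")
    return None


def pypi_download_url(name, version=None, files=()):
    """Hypothetical helper: project page without a version, sdist URL with one."""
    if not version:
        return f"https://pypi.org/project/{name}/"
    return pick_file_url(files, "sdist")


if __name__ == "__main__":
    print(pypi_download_url("example-pkg"))
    print(pypi_download_url("example-pkg", "1.0", SAMPLE_FILES))
    print(pick_file_url(SAMPLE_FILES, "bdist_wheel"))
```

Keeping the selection logic separate from the HTTP request means it can be tested against saved JSON without network access. The ?download_url= qualifier cases are left out of the sketch because their intended behavior is the open question in this comment.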
I see a variety of test PURL inputs and expected outputs in our tests but our actual goals for the purl2url.py output are not 100% clear. Is the approach I described above what we want? If not, please let me know what changes you want me to make in the data we retrieve. (At the risk of creating clutter, I'll paste sample output in the next comment below so you have the actual output data to examine.)
Rather than post the verbose output here, I pasted it into a .txt file and am uploading that instead:
packageurl-python-purl2url-pypi-sample-output-2024-03-11.txt
@TG1999 Further to your (and other) comments in the recently-closed prior PR 151, I've removed most of my prior code, and this issue -- and the new PR I'll open shortly -- now focus on adding repo URL support and testing for cocoapods (pypi support is already there and fine) and additional pypi testing.
I'll turn next to fetchcode/package.py to add download URL (and other) support for cocoapods and pypi.
@TG1999 Actually, I'd forgotten that fetchcode/package.py already handles pypi, including providing a single download URL entry (just one, as is the case for the other supported types as well, although there are often additional download files available).
I have a few questions for you and @pombredanne about the details (e.g., do we want to add the ability for additional download files as a list or otherwise) and will ask them in the related fetchcode issue I opened recently.
Re that question about multiple download files: I also raised it earlier in this issue (see this comment), and it is still a live question for you and @pombredanne. I understand that I cannot simply modify the current inferred URLs function because people rely on its current form. Do we want to add this capability and, if so, how? We might want the download URL value to be a list rather than a single URL, and we might want the inferred URLs list to include more than the current repo and download URL values, but all of that would most naturally involve modifying the existing functions, which we don't want to do.
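One option for discussion is to leave the existing functions untouched and add a separate function that returns a list. The sketch below is purely hypothetical: existing_download_url stands in for whatever single-URL function is already relied upon, and download_urls, wheel_source and the placeholder URLs are invented for illustration.

```python
def existing_download_url(purl):
    """Stand-in for an existing single-URL function; not the real implementation."""
    known = {
        "pkg:pypi/example-pkg@1.0": "https://files.example.invalid/example-pkg-1.0.tar.gz",
    }
    return known.get(purl)


def download_urls(purl, extra_sources=()):
    """Return a list of download URLs without changing existing_download_url.

    The single URL from the existing function comes first, followed by any
    additional URLs supplied by the callables in extra_sources.
    """
    urls = []
    primary = existing_download_url(purl)
    if primary:
        urls.append(primary)
    for source in extra_sources:
        for url in source(purl):
            # Skip empty values and anything already collected.
            if url and url not in urls:
                urls.append(url)
    return urls


def wheel_source(purl):
    """Hypothetical extra source that repeats the sdist and adds a wheel."""
    return [
        "https://files.example.invalid/example-pkg-1.0.tar.gz",
        "https://files.example.invalid/example_pkg-1.0-py3-none-any.whl",
    ]


if __name__ == "__main__":
    print(download_urls("pkg:pypi/example-pkg@1.0", [wheel_source]))
```

Callers that depend on the current single-value behavior would see no change under this approach, while new callers could opt in to the list form.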
Please let me know what you think.
This is related to the PURL CLI tool/library described in https://github.com/nexB/purldb/issues/247.