Open johnmhoran opened 6 months ago
@pombredanne This set of changes includes a portion of your detailed feedback from last week -- will address your remaining comments next.
@TG1999 @keshav-space I will also address your remaining comments on the cocoapods support I've added to packageurl-python/purl2url.py and fetchcode/package.py once I've finished updating the purlclci.py and related code and tests.
@pombredanne @AyanSinhaMahapatra I've just committed and pushed my latest PURLCLI-related changes, including adding details to the RTD re the use of the four current PURLCLI tools. I stuck with the existing RTD structure and thus made all my RTD changes in /purldb/purldb-toolkit/README.rst
.
This latest push also includes numerous detailed changes to purlcli.py
and related test code, including what I think are all of Philippe's recent detailed comments (thanks again for these). One point re the structure and content of the output for the urls
command (like the other three commands, the output is illustrated in some detail in my updated RTD entry):
Sometimes repository_download_url
and repo_download_url_by_package_type
are null
, and sometimes both are not null
. They are always the same as each other, but not always the same as download_url
. When they do have a URL value, in my exploration they always have the same URL value as download_url
. get_repo_download_url_by_package_type()
is only used once -- as a fallback value in get_repo_download_url()
. See https://github.com/package-url/packageurl-python/blob/18672be8c5a528ae69143206f66a875ed4043de7/src/packageurl/contrib/purl2url.py#L388
Thus my revised approach:
download_url
as a separate key. repository_download_url
and repo_download_url_by_package_type
into repository_download_url
-- no need to keep repo_download_url_by_package_type
as a separate key.I look forward to your comments.
For easy reference the RTD update is here (although the new TOC doesn't show up in this simple .rst file).
@pombredanne Now that I'm removing all logging/messaging, how do you want to handle a command run on a PURL whose type is not supported? Supported types are different for each command (some overlap some not) and changes over time as we add more support. I used to provide a warning like "'pkg:nginx/nginx@0.8.9' not supported with 'metadata' command",
, but this now shows up as
TypeError: 'NoneType' object is not iterable
with all of the details included in the console. I can avoid this by first testing whether the particular query returns a value, e.g., for metadata
that could be if info(purl):
, and that simply returns 0 packages for that PURL.
@pombredanne Given our recent discussions as well as the feedback you just provided re the fetchcode cocoapods work (thank you!), I think the answer to my question is: let the exception be raised and displayed in the console -- do not use if info(purl):
to avoid raising such an exception. Put another way, if the queried PURL type is not supported, let the existing underlying code raise an exception if that's how it's been written. Correct? ;-)
@pombredanne Given our recent discussions as well as the feedback you just provided re the fetchcode cocoapods work (thank you!), I think the answer to my question is: let the exception be raised and displayed in the console -- do not use
if info(purl):
to avoid raising such an exception. Put another way, if the queried PURL type is not supported, let the existing underlying code raise an exception if that's how it's been written. Correct? ;-)
Not exactly:
fetchcode is a low level library.
purlcli is both a library with callable functions and a CLI.
Library functions should let the exception propagate, but also could catch and reraise with a better structured exception subclass and message. "TypeError: 'NoneType' object is not iterable" is not a good exception and message. Instead, if you know that a type is not supported, just raise Exception(f"Unsupported package type: {purl_type}")
Or do this:
class UnsupportedPackageType(Exception):
pass
raise UnsupportedPackageType(purl_type
- In the CLI, which is an applicationa nd not a library, we could catch these exceptions (possibly design a better set of Exception subclasses), and raise proper error messages and other constructs for proper display and feedback to the user in the CLI "UI"
Thanks @pombredanne . Adding a new exception is an interesting option. For testing, in purlcli.py, I've added
class UnsupportedPackageType(Exception):
pass
and for the metadata
command, inside collect_metadata(), I've revised to read:
collected_metadata = []
try:
for release in list(info(purl)):
if release is None:
continue
release_detail = release.to_dict()
release_detail.move_to_end("purl", last=False)
collected_metadata.append(release_detail)
except:
raise UnsupportedPackageType(purl)
return collected_metadata
metadata
calls the info()
function in fetchcode/package.py, which returns None when the NoRouteAvailable
exception is triggered -- I interpret this as meaning NoRouteAvailable
== unsupported PURL type. NoRouteAvailable
is also used in package_versions.py
, which is called by our new versions
command. Perhaps NoRouteAvailable
could instead be updated to behave like the test UnsupportedPackageType
exception, or we could add the new UnsupportedPackageType
exception to fetchcode and use it directly in info()
.
For testing right now, when I run an unsupported PURL type with metadata
, the new UnsupportedPackageType
exception is raised and identifies the problematic PURL:
. . .
File "/home/jmh/dev/nexb/purldb/purldb-toolkit/src/purldb_toolkit/purlcli.py", line 160, in collect_metadata
raise UnsupportedPackageType(purl)
purldb_toolkit.purlcli.UnsupportedPackageType: pkg:nginx/nginx@0.8.9
I also added a simple test that seems to work:
def test_collect_metadata_unsupported_type(self):
purl = "pkg:nginx/nginx@0.8.9"
with pytest.raises(purlcli.UnsupportedPackageType, match=purl):
metadata_collection = purlcli.collect_metadata(purl)
@pombredanne I'm cleaning up purlcli.py
, and then the 2 test files, and will commit and push when done.
I realize you've had no time to read and reply to my question yesterday about the implementation of your example class UnsupportedPackageType(Exception)
. I'd implemented one approach of that in the metadata
command to handle PURL types that are not supported by an underlying command (e.g., metadata
and versions
, which will encounter a NoneType error for unsupported PURL types if not handled).
However, since this might not be what you have in mind, I am going to remove that for metadata
(and the test I added as well), and also remove my own way of handling that for versions
-- for now this will allow the NoneType error to bubble up for unsupported PURL types for the metadata
and versions
commands.
As discussed in this morning's huddle, purl2url.py
has no such exception handling and thus for the urls
command returns empty values for PURLs like pkg:/johnmhoran/hello
rather than raising an exception. metadata
calls info()
in fetchcode's package.py
, and versions
calls versions()
in fetchcode's package_versions.py
. Each handles non-supported types indirectly in the same way, responding to our command queries with a NoneType exception when the PURL type is not supported.
def info(url):
"""
Return data according to the `url` string
`url` string can be purl too
"""
if url:
try:
return router.process(url)
except NoRouteAvailable:
return
and
def versions(purl):
"""Return all version for a PURL."""
if purl:
try:
return router.process(purl)
except NoRouteAvailable:
return
@johnmhoran re: https://github.com/nexB/purldb/pull/436#issuecomment-2153323420
I realize you've had no time to read and reply to my question yesterday about the implementation of your example
class UnsupportedPackageType(Exception)
. I'd implemented one approach of that in themetadata
command to handle PURL types that are not supported by an underlying command (e.g.,metadata
andversions
, which will encounter a NoneType error for unsupported PURL types if not handled).
This works, but a simpler and better option in the CLI is to implement a callback instead that will parse the purl and raise an exception IFF the type is not in a supported list of types. See https://github.com/nexB/purldb/blob/4a9ba34901ac494d063d786daf3a5e002abcf9a9/purldb-toolkit/src/purldb_toolkit/purlcli.py#L1045 for an example of such a callback.
However, since this might not be what you have in mind, I am going to remove that for
metadata
(and the test I added as well), and also remove my own way of handling that forversions
-- for now this will allow the NoneType error to bubble up for unsupported PURL types for themetadata
andversions
commands.As discussed in this morning's huddle,
purl2url.py
has no such exception handling and thus for theurls
command returns empty values for PURLs likepkg:/johnmhoran/hello
rather than raising an exception.metadata
callsinfo()
in fetchcode'spackage.py
, andversions
callsversions()
in fetchcode'spackage_versions.py
. Each handles non-supported types indirectly in the same way, responding to our command queries with a NoneType exception when the PURL type is not supported.
Using a callback enable an early validation and is a better option for the command line.
def info(url): """ Return data according to the `url` string `url` string can be purl too """ if url: try: return router.process(url) except NoRouteAvailable: return
and
def versions(purl): """Return all version for a PURL.""" if purl: try: return router.process(purl) except NoRouteAvailable: return
So the behavior is to return None in these cases. This looks fine to me. You could test if you have a value if needed. But for now, keep things simple, let exceptions bubble up.
@pombredanne I did not follow all that you wrote but for now I am not making any more changes along these lines until we can discuss and I can ask questions real time. At this point my goal is to clean up the logging/messaging structure and related tests I added in purldb-toolkit and then fix the fecthcode cocoapods logging/messaging -- then we can discuss improvements going forward and I'll be eager to learn and implement.
@pombredanne I've finished removing the PURL CLI logging/messaging structure I'd added, merged the most recent main
as well as your latest commit to this PR, resolved conflicts, and otherwise updated/fixed the tests. Please take a look when you have the chance.
I've just committed and pushed my remaining updates for the metadata command in purlcli.py. Waiting for the GH checks to run.
@pombredanne @keshav-space @TG1999 All GitHub checks have passed. This PR is ready for review at your convenience.
Please note that the purlcli code relies in part on updates I've recently pushed to PRs for fetchcode (https://github.com/nexB/fetchcode/pull/119) and packageurl-python (https://github.com/package-url/packageurl-python/pull/153), which also need review when time permits.
Reference: https://github.com/nexB/purldb/issues/365