clearlydefined / service

The service side of
MIT License

PyPi Search #159

Open Teju-Manchenella opened 6 years ago

Teju-Manchenella commented 6 years ago

Implement PyPi search functionality. Currently, in originPyPi.js, the route '/:name' looks for a package whose name exactly matches the request parameter 'name'. If such a package exists, it is returned.

Instead, this route needs to return all packages whose names contain the request parameter 'name', not just the single package that is an exact match. From my understanding, the PyPi API does not provide this functionality.
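
To make the difference concrete, here is a minimal sketch of the two behaviours over an in-memory list of names. The helper names (`exactMatch`, `containsMatch`) and the sample list are made up for illustration; the real route calls the PyPi API rather than filtering a local array.

```javascript
// Hypothetical helpers, not part of the ClearlyDefined service.

// Current behaviour: a result only when the name matches exactly.
function exactMatch(names, name) {
  return names.filter(candidate => candidate === name).map(id => ({ id }))
}

// Requested behaviour: every package whose name contains the search term.
function containsMatch(names, name) {
  const term = name.toLowerCase()
  return names.filter(candidate => candidate.toLowerCase().includes(term)).map(id => ({ id }))
}

// Made-up package names, for illustration only.
const sampleNames = ['nginx', 'example-nginx-tools', 'requests']
console.log(exactMatch(sampleNames, 'nginx'))
console.log(containsMatch(sampleNames, 'nginx'))
```

The case-insensitive comparison is a design choice in this sketch, not something the issue specifies.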

jeffmendoza commented 4 years ago

We should check whether this is still an issue. If it is a limitation of the underlying PyPi service, we should document that limitation in ClearlyDefined's API and close this.

nellshamrell commented 3 years ago

Adding some notes as I investigate this:

Here's the relevant code in OriginPyPi.js


  asyncMiddleware(async (request, response) => {
    const { name } = request.params
    const url = `${name}/json`
    const answer = await requestPromise({ url, method: 'GET', json: true })
    const result = answer && answer.info ? [{ id: answer.info.name }] : []
    return response.status(200).send(result)

Looking at the PyPi search page, there are several packages with nginx in the name (along with one package that has the exact name nginx).

I tried running a curl with the same url that would be used in OriginPyPi.js, but received a 301 in response.

nells@campusnell:~$ curl
<html><head><title>301 Moved Permanently</title></head><body><center><h1>301 Moved Permanently</h1></center></body></html>

nellshamrell commented 3 years ago

I can get the information about the package when I use


This gives me information about the exact match package name

{"info":{"author":"tphp","author_email":"","bugtrack_url":null,"classifiers":[],"description":"time and path tool","description_content_type":"","docs_url":null,"download_url":"","downloads":{"last_day":-1,"last_month":-1,"last_week":-1},"home_page":"","keywords":"pip,pathtool,timetool,magetool,mage","license":"MIT Licence","maintainer":"","maintainer_email":"","name":"nginx","package_url":"","platform":"any","project_url":"","project_urls":{"Homepage":""},"release_url":"","requires_dist":null,"requires_python":"","summary":"time and path tool","version":"0.0.1","yanked":false,"yanked_reason":null},"last_serial":7924201,"releases":{"0.0.1":[{"comment_text":"","digests":{"md5":"bc830a301bf3d07cf2e30bb564f3ff11","sha256":"9a52060402cdb9418c41656a553611f5f352d23811c5a4edfa3c9c9772c157a3"},"downloads":-1,"filename":"nginx-0.0.1.tar.gz","has_sig":false,"md5_digest":"bc830a301bf3d07cf2e30bb564f3ff11","packagetype":"sdist","python_version":"source","requires_python":null,"size":1029,"upload_time":"2020-08-10T10:05:29","upload_time_iso_8601":"2020-08-10T10:05:29.246992Z","url":"","yanked":false,"yanked_reason":null}]},"urls":[{"comment_text":"","digests":{"md5":"bc830a301bf3d07cf2e30bb564f3ff11","sha256":"9a52060402cdb9418c41656a553611f5f352d23811c5a4edfa3c9c9772c157a3"},"downloads":-1,"filename":"nginx-0.0.1.tar.gz","has_sig":false,"md5_digest":"bc830a301bf3d07cf2e30bb564f3ff11","packagetype":"sdist","python_version":"source","requires_python":null,"size":1029,"upload_time":"2020-08-10T10:05:29","upload_time_iso_8601":"2020-08-10T10:05:29.246992Z","url":"","yanked":false,"yanked_reason":null}]}

But it does not give me information about any of the other packages with 'nginx' in their name.
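
Put differently, a lookup built on this JSON response can only ever yield zero or one entries. A minimal sketch, assuming the response shape shown above (a top-level `info` object with a `name` field); `toPyPiResult` is a hypothetical helper name, not a function in the service:

```javascript
// Hypothetical helper: map a PyPi JSON answer to a list of { id } entries.
// Assumes the answer has the shape { info: { name: ... }, ... }.
function toPyPiResult(answer) {
  return answer && answer.info ? [{ id: answer.info.name }] : []
}

// A trimmed-down version of the response above, and a missing answer.
console.log(toPyPiResult({ info: { name: 'nginx', version: '0.0.1' } }))
console.log(toPyPiResult(undefined))
```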

nellshamrell commented 3 years ago

These were the only places I could find semi-recent information about the PyPi API (other than very old Stack Overflow questions).

This does, indeed, appear to be a limitation of the PyPi API.

@jeffmcaffer @jeffmendoza where would you suggest we document this limitation of the API when it comes to Python packages?

nellshamrell commented 3 years ago

In comparison, looking at OriginRubyGems.js


  asyncMiddleware(async (request, response) => {
    const { name } = request.params
    const url = `${name}`
    const answer = await requestPromise({ url, method: 'GET', json: true })
    const result = answer.map(entry => {
      return { id: entry.name }
    })
    return response.status(200).send(result)

There are also quite a few Ruby gems with nginx in their name, along with one gem with the exact name nginx.

If I run the API call as defined in the above code:


Then I get quite a few gems - all with `nginx` somewhere in their name

[{"documentation_uri":"","metadata":{},"homepage_uri":"","funding_uri":null,"bug_tracker_uri":null,"project_uri":"","version":"0.0.2","sha":"33b4c47704d802c88891f6f062888a8f90f483f55d51f627bf59aad34ceb1521","platform":"ruby","changelog_uri":null,"source_code_uri":null,"licenses":null,"gem_uri":"","downloads":21293,"mailing_list_uri":null,"name":"nginx","wiki_uri":null,"version_downloads":21285,"info":"Small gem to manage nginx configuration","authors":"Kirill Radzikhovskyy"},{"documentation_uri":"","metadata":{},"homepage_uri":"","funding_uri":null,"bug_tracker_uri":null,"project_uri":"","version":"3.0.4","sha":"e7b1d85494f47f66a9574c44c951e04a61fb956e270bb22d79e89ec10eaed17c","platform":"ruby","changelog_uri":null,"source_code_uri":null,"licenses":["MIT"],"gem_uri":"","downloads":264945,"mailing_list_uri":null,"name":"capistrano3-nginx","wiki_uri":null,"version_downloads":39328,"info":"Adds suuport to nginx for Capistrano 3.x","authors":"Juan Ignacio Donoso, treenewbee"},{"documentation_uri":"","metadata":{},"homepage_uri":"","

jeffmcaffer commented 3 years ago

Hmmm, this is less than optimal. I'm guessing that this shows up for users when they are looking for a component in the UI? We could show them all the components we already know about. Not sure how to "document" this in a way that the user would see other than putting a little bit of descriptive text somewhere near the search box.
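
Roughly what I have in mind for "show them the components we already know about", as a sketch only. `suggestComponents` and its arguments are hypothetical, and where the list of known names would come from is left open:

```javascript
// Hypothetical sketch: combine the origin's exact-match answer with
// component names we already know about locally.
function suggestComponents(term, exactHits, knownNames) {
  const needle = term.toLowerCase()
  const local = knownNames.filter(name => name.toLowerCase().includes(needle))
  // Exact hits from the origin come first; a Set drops duplicates
  // while preserving insertion order.
  const ids = new Set([...exactHits.map(hit => hit.id), ...local])
  return [...ids].map(id => ({ id }))
}

// Made-up names, for illustration only.
const suggestions = suggestComponents('nginx', [{ id: 'nginx' }], ['example-nginx-tools', 'nginx', 'requests'])
console.log(suggestions)
```

This would only surface packages that have already been seen, so it narrows the gap with PyPi's own search page rather than closing it.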

brainwane commented 3 years ago

Is relevant to this issue?