Open GeorgeFischhof opened 2 years ago
I'd be in favor of checking to see if there's an exact match for the canonical project name (skipping our search index) and highlighting that before the results if there is.
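A minimal sketch of what that exact-match check could look like, assuming names are normalized per PEP 503. The helper `with_exact_match_first` and the `known_projects` set are hypothetical stand-ins for whatever lookup the real code would use, not existing Warehouse functions:

```python
import re


def canonicalize(name: str) -> str:
    """Normalize a project name following the PEP 503 rule."""
    return re.sub(r"[-_.]+", "-", name).lower()


def with_exact_match_first(query, known_projects, search_results):
    """Put the canonical exact match (if any) ahead of the index results.

    known_projects: set of canonical project names (hypothetical lookup,
    standing in for a database query that skips the search index).
    search_results: project names as returned by the search index.
    """
    canonical = canonicalize(query.strip())
    if canonical in known_projects:
        # Drop the duplicate from the index results so it is listed only once.
        rest = [r for r in search_results if canonicalize(r) != canonical]
        return [canonical] + rest
    return list(search_results)
```

Note that this only helps when the query normalizes to an existing name; a query that matches no project exactly falls through to the normal index results.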
It is good :) In the meantime I was searching for another package, selfupdate, and searched for "self-update". The dashed phrase behaves similarly: if I write it with a dash, the package without the dash is not shown on the first few pages.
(I am GeorgeFischhof; this is just my company account.)
I wasn't able to reproduce this behavior in the development environment, as we don't have the same packages there, and I struggled a bit to create same-named ones, so I found zero-downtime-migrations instead.
Searching for a double-hyphenated package produces the desired results: https://pypi.org/search/?q=zero-downtime-migrations&o=
We can see other packages like zero, django-downtime, and migrations later on in the results, but the closest match surfaces first.
Here's the JSON payload we generate and submit to the ES service:
Notably, we're using the normalized_name value in the search, which is boosted 10x relative to the other fields, so it makes sense that a match on this field would produce the desired results.
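For readers unfamiliar with field boosting, here is a rough sketch of how such a query body can be built. This is not the actual payload referred to above: the field list and the `build_query` helper are assumptions for illustration, and only the 10x boost on normalized_name comes from this thread. The `field^boost` notation is standard Elasticsearch `multi_match` syntax.

```python
import json

# Hypothetical field list; only the normalized_name boost is taken from the thread.
SEARCH_FIELDS = ["normalized_name", "summary", "description", "keywords"]
SEARCH_BOOSTS = {"normalized_name": 10}


def build_query(terms: str) -> dict:
    """Build a multi_match query body with per-field boosts."""
    fields = [
        f"{field}^{SEARCH_BOOSTS[field]}" if field in SEARCH_BOOSTS else field
        for field in SEARCH_FIELDS
    ]
    return {"query": {"multi_match": {"query": terms, "fields": fields}}}


if __name__ == "__main__":
    print(json.dumps(build_query("zero-downtime-migrations"), indent=2))
```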
Results (lengthy):
In development, the explicit match comes back first, with a score of 196.8072, and the next result, migrations, only scores 95.39637 - so it's curious how this kind of query would perform against the production Elasticsearch service.
One thought is that the age of the reported package may be relevant: it was last published in 2017, and the mechanism for reindexing on changes was switched to incremental index updates around 2018. Is it possible this package's search metadata isn't up to date? I tried to find evidence of a periodic reindex "sweep", but couldn't find anything concrete - and without being able to investigate the production index behavior, I'm a little stuck. 😁
I'm noting my issue was merged into this one, where I made a new package that does not come up in search results when the package (which is tagged with its name) is searched for. The search results have only 2 items.
It sounds possible this issue is consolidating three different problems, or at least manifestations of problems, under one hood.
EDIT: Additional information: the first revision of the package uploaded had neither a description nor tags. These were added after the first revision, which I have since unfortunately deleted.
EDIT2: I tried this again 7 days later, and my particular package showed fine. I'm imagining that my specific instance was likely just a need for reindexing.
OK, so my issue was marked as a duplicate, but to be more specific than the other issues mentioned: the top result is the inverted name, not the exact match.
When searching for latex2mathml, the first result is mathml2latex instead of latex2mathml.
As discussed in https://github.com/pypi/warehouse/issues/14738, perhaps when there is one space in the search, it should be searched both with the space kept (as a phrase, not an OR) and with the space replaced by a hyphen, with canonical matches added to the top of the results.
Perhaps also notify the user after a search containing a space that the space is treated as an OR, and that a quoted search avoids that.
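To make the suggestion concrete, here is a small sketch of how the two variants could be generated. The function `query_variants` is hypothetical and only illustrates the idea; it is not part of the existing search code:

```python
def query_variants(query: str) -> list[str]:
    """Return the search variants for a query containing exactly one space.

    For a two-word query, search the quoted phrase (so the space is not
    treated as an OR) and the hyphenated form. Any other query is passed
    through unchanged.
    """
    words = query.split()
    if len(words) != 2:
        return [query]
    phrase = f'"{words[0]} {words[1]}"'  # quoted, so the words stay together
    hyphenated = "-".join(words)         # space replaced by a hyphen
    return [phrase, hyphenated]
```

For example, `query_variants("info monitor")` yields the quoted phrase and `info-monitor`, while a single-word query such as `latex2mathml` is returned as-is.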
Describe the bug
I checked my package by typing the exact package name into the search box: pluggable-info-monitor. The first 3 pages of the results do not contain my project. (I did not check all 500 pages.) The project exists at this URL: https://pypi.org/project/pluggable-info-monitor/
Expected behavior
The project is listed in first position, because the default relevance order is used.
To Reproduce
Type into the search box: pluggable-info-monitor
In another tab or window, go to the URL: https://pypi.org/project/pluggable-info-monitor/
My Platform
Windows 8.1 and Windows 10, latest version of Firefox
BR, George