pypi / warehouse

The Python Package Index
https://pypi.org
Apache License 2.0

Explicitly typed in project name is not the first in relevance order #10718

Open GeorgeFischhof opened 2 years ago

GeorgeFischhof commented 2 years ago

Describe the bug

I checked my package by typing the exact package name into the search box: pluggable-info-monitor, and the first 3 pages of the results do not contain my project. (I did not check all 500 pages.) The project exists at this URL: https://pypi.org/project/pluggable-info-monitor/

Expected behavior

The project is listed in first position, because the default relevance order is used.

To Reproduce

Type into the search box: pluggable-info-monitor

In another tab or window, go to the URL: https://pypi.org/project/pluggable-info-monitor/

My Platform

Windows 8.1 and Windows 10, latest version of Firefox

BR, George

di commented 2 years ago

I'd be in favor of checking to see if there's an exact match for the canonical project name (skipping our search index) and highlighting that before the results if there is.
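The suggestion above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Warehouse's actual code: `find_exact_match`, `search_with_exact_match`, the in-memory `known` set (standing in for a database lookup), and the `fake_index` callable (standing in for the search index) are all hypothetical. The normalization follows the PEP 503 rule: lowercase, with runs of `-`, `_`, and `.` collapsed to a single `-`.

```python
import re
from typing import Callable, List, Optional, Set, Tuple


def canonicalize(name: str) -> str:
    """Normalize a project name following the PEP 503 rule."""
    return re.sub(r"[-_.]+", "-", name.strip()).lower()


def find_exact_match(query: str, known_projects: Set[str]) -> Optional[str]:
    """Return the canonical name if the query names an existing project."""
    candidate = canonicalize(query)
    return candidate if candidate in known_projects else None


def search_with_exact_match(
    query: str,
    known_projects: Set[str],
    index_search: Callable[[str], List[str]],
) -> Tuple[Optional[str], List[str]]:
    """Look up an exact match first, then run the normal index search.

    The exact match (if any) is returned separately so that it can be
    highlighted before the results, and it is dropped from the regular
    results to avoid listing it twice.
    """
    exact = find_exact_match(query, known_projects)
    results = [hit for hit in index_search(query) if hit != exact]
    return exact, results


# Hypothetical data: a tiny project table and a stubbed search index.
known = {"pluggable-info-monitor", "zero-downtime-migrations"}


def fake_index(query: str) -> List[str]:
    return ["info-monitor", "pluggable-info-monitor", "monitor"]


exact, rest = search_with_exact_match("Pluggable_Info.Monitor", known, fake_index)
print(exact)  # the project to highlight, or None
print(rest)   # the remaining relevance-ordered hits
```

Because the exact-match lookup never touches the search index, it would not be affected by how the index scores hyphenated names.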

dil-gfischhof commented 2 years ago

That sounds good :) In the meantime I was searching for another package, selfupdate, and searched for "self-update". The problem with this hyphenated phrase is similar: if I type the name with a hyphen, the package whose name has no hyphen is not shown on the first few pages.

(I am GeorgeFischhof; this is just my company account.)
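The `self-update` versus `selfupdate` case differs from the exact-match case: PEP 503 normalization collapses separators to a single `-` but does not remove them, so the two spellings remain distinct canonical names. A small sketch makes this visible. The `squash` helper is purely hypothetical (nothing in this thread says Warehouse uses such a key); it only illustrates what a looser comparison would need to do.

```python
import re


def canonicalize(name: str) -> str:
    """PEP 503 normalization: lowercase, separator runs become one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def squash(name: str) -> str:
    """Hypothetical looser key: drop separators entirely."""
    return re.sub(r"[-_.]+", "", name).lower()


# The canonical names differ, so an exact-match lookup alone would not help here.
same_canonical = canonicalize("self-update") == canonicalize("selfupdate")

# With separators removed, both spellings map to the same key.
same_squashed = squash("self-update") == squash("selfupdate")

print(same_canonical, same_squashed)
```

A looser key like this would also merge names that are legitimately different projects, so it would fit a "did you mean" hint better than a guaranteed first result.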

miketheman commented 2 years ago

I wasn't able to reproduce this behavior in the development environment, as we don't have the same packages there, and I struggled a bit trying to create a same-named one, so I used zero-downtime-migrations instead.

Searching for a double-hyphenated package produces the desired results: https://pypi.org/search/?q=zero-downtime-migrations&o= We can see other packages like zero, django-downtime, and migrations later on in the results, but the best-matching one surfaces first.

Here's the JSON payload we generate and submit to the ES service:

```json
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "multi_match": {
                  "fields": [
                    "author",
                    "author_email",
                    "description^5",
                    "download_url",
                    "home_page",
                    "keywords^5",
                    "license",
                    "maintainer",
                    "maintainer_email",
                    "normalized_name^10",
                    "platform",
                    "summary^5"
                  ],
                  "query": "zero-downtime-migrations",
                  "type": "best_fields"
                }
              }
            ]
          }
        },
        {
          "prefix": {
            "normalized_name": "zero-downtime-migrations"
          }
        }
      ]
    }
  },
  "suggest": {
    "name_suggestion": {
      "text": "zero-downtime-migrations",
      "term": {
        "field": "name"
      }
    }
  }
}
```

Notable is that we're using the normalized_name value in the search, boosted 10x (more than any other field), so it makes sense that a match on this field would produce the desired results.
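For experimenting with other queries, the payload above can be rebuilt programmatically. The following is a sketch that reproduces the same structure as a plain Python dict; `build_search_body` and `SEARCH_FIELDS` are names made up for this example and are not taken from Warehouse's code. Only the field list, boosts, and clause layout come from the payload shown in this comment.

```python
import json
from typing import Any, Dict

# Field list copied from the payload above. The "^N" suffix is
# Elasticsearch's per-field boost syntax for multi_match queries.
SEARCH_FIELDS = [
    "author",
    "author_email",
    "description^5",
    "download_url",
    "home_page",
    "keywords^5",
    "license",
    "maintainer",
    "maintainer_email",
    "normalized_name^10",
    "platform",
    "summary^5",
]


def build_search_body(query: str) -> Dict[str, Any]:
    """Build a request body with the same shape as the payload above."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Full-text match across all fields, with boosts.
                    {
                        "bool": {
                            "must": [
                                {
                                    "multi_match": {
                                        "fields": SEARCH_FIELDS,
                                        "query": query,
                                        "type": "best_fields",
                                    }
                                }
                            ]
                        }
                    },
                    # Extra credit for names that start with the query text.
                    {"prefix": {"normalized_name": query}},
                ]
            }
        },
        "suggest": {
            "name_suggestion": {"text": query, "term": {"field": "name"}}
        },
    }


body = build_search_body("pluggable-info-monitor")
print(json.dumps(body, indent=2))
```

Swapping in the query from the original report makes it possible to compare the generated body against what production would receive, assuming production builds its query the same way.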

Results (lengthy):

```json { "took": 8, "timed_out": false, "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 903, "relation": "eq" }, "max_score": 196.8072, "hits": [ { "_index": "development-9b8152bf04", "_type": "_doc", "_id": "zero-downtime-migrations", "_score": 196.8072, "_source": { "name": "zero-downtime-migrations", "normalized_name": "zero-downtime-migrations", "version": [ "0.3", "0.1.12", "0.1.11", "0.1.9", "0.1.7", "0.1.6", "0.1.3", "0.1.2", "0.1.1", "0.0.6", "0.0.5", "0.0.4", "0.0.2", "0.0.1" ], "latest_version": "0.3", "summary": "django migrations without long locks", "description": "", "author": "Vladimir Koljasinskij", "author_email": "smosker@gmail.com", "maintainer": "", "maintainer_email": "", "home_page": "https://github.com/Smosker/zero-downtime-migrations", "download_url": "", "keywords": "", "platform": "", "created": "2018-01-12T12:11:08.887530", "classifiers": [ "Topic :: Software Development :: Libraries :: Python Modules", "Operating System :: OS Independent", "Programming Language :: Python", "License :: OSI Approved :: BSD License", "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Framework :: Django", "Programming Language :: Python :: 2.7", "Framework :: Django :: 1.8", "Programming Language :: Python :: 3.5", "Framework :: Django :: 1.9", "Framework :: Django :: 1.10", "Framework :: Django :: 1.11", "Framework :: Django :: 2.0" ] } }, { "_index": "development-9b8152bf04", "_type": "_doc", "_id": "migrations", "_score": 95.39637, "_source": { "name": "migrations", "normalized_name": "migrations", "version": [ "0.0.3", "0.0.0" ], "latest_version": "0.0.3", "summary": "Yet another Python migration tool", "description": "Migrations\n==========\n\nSimple, cross-database migration tool for Python applications.\nInspired by `node migrations `_.\n\nStatus\n------\nThe project is in alpha now. 
Bugs and breaking changes will occur.\n\nRequirements\n------------\nOnly Python 3 is supported for now.\n\nInstallation\n------------\n.. code-block:: bash\n\n $ pip install migrations\n\nNotice, this distribution provides package and executable\nscript named :code:`migrate`, so check if it does not mess with\nexisting packages/scripts. Generally, you should neither install\nthis tool globally, nor install several migration tools for one project.\n\nFeatures\n--------\nTBD\n\nUsage\n-----\n.. code-block::\n\n usage: migrate [options] [action]\n\n actions:\n up [-h] [NAME|COUNT] (default) perform COUNT migrations or till\n given NAME (by default perform all available)\n down [-h] [NAME|COUNT] revert COUNT migrations or till\n given NAME (by default revert one)\n create [-h] NAME create new migration file\n\n show [-h] print all migrations in chronological order\n\n options:\n -h, --help show this help message and exit\n -v, --version show version and exit\n -d PATH, --migrations-dir PATH\n directory where migrations are stored\n -s PATH, --state-file PATH\n location of file which stores database state\n -t PATH, --template-file PATH\n location of template file for new migrations\n\nEach migration file must define functions :code:`up()` and :code:`down()`\nwithout required arguments.\n\nSimple migration example:\n\n.. code-block:: python\n\n import redis\n\n db = redis.Redis(host='localhost', port=6379)\n\n def up():\n db.rpush('used_libraries', 'migrations')\n\n def down():\n db.rpop('used_libraries', 'migrations')\n\nA bit more complex example. Let's assume that in current\nworking directory we have module named :code:`db`, which contains\nsingleton object responsible for DB connection, for example\n`PyMySQL `_ Connection object.\nCurrent working directory is the first place to be scanned for\nmodules to import.\n\n.. 
code-block:: python\n\n from db import connection\n\n def manage_cursor(action):\n def wrap():\n with connection.cursor() as cursor:\n action(cursor)\n connection.commit()\n return wrap\n\n @manage_cursor\n def up(cursor):\n cursor.execute(\n \"INSERT INTO used_libraries (`name`) VALUES ('migrations')\"\n )\n\n @manage_cursor\n def down(cursor):\n cursor.execute(\n \"DELETE FROM used_libraries WHERE `name`='migrations'\"\n )", "author": "Andriy Maletsky", "author_email": "andriy.maletsky@gmail.com", "maintainer": "", "maintainer_email": "", "home_page": "https://github.com/and800/migrations", "download_url": "", "keywords": "migration", "platform": "UNKNOWN", "created": "2016-10-01T18:28:57.260741", "classifiers": [ "Development Status :: 3 - Alpha", "Environment :: Console", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3.5", "Topic :: Software Development :: Version Control" ] } }, { "_index": "development-9b8152bf04", "_type": "_doc", "_id": "tensorflow-zero", "_score": 78.4184, "_source": { "name": "tensorflow-zero", "normalized_name": "tensorflow-zero", "version": [ "0.0.4", "0.0.3", "0.0.2" ], "latest_version": "0.0.4", "summary": "TensorFlow op that takes a tensor of int32s and outputs a copy with all but the first element set to zero", "description": "", "author": "Shkarupa Alex", "author_email": "shkarupa.alex@gmail.com", "maintainer": "", "maintainer_email": "", "home_page": "https://github.com/shkarupa-alex/zero", "download_url": "", "keywords": "tensorflow custom op", "platform": "", "created": "2018-07-26T15:07:57.295020", "classifiers": [ "Topic :: Scientific/Engineering", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Software Development", "Topic :: Software Development :: Libraries", "Topic :: Software Development :: Libraries :: Python Modules", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Development Status :: 4 - 
Beta", "Intended Audience :: Developers", "Intended Audience :: Education", "Programming Language :: Python :: 2", "Programming Language :: Python :: 3" ] } }, { "_index": "development-9b8152bf04", "_type": "_doc", "_id": "libsoc-zero", "_score": 78.4184, "_source": { "name": "libsoc_zero", "normalized_name": "libsoc-zero", "version": [ "0.0.2", "0.0.1.dev1" ], "latest_version": "0.0.2", "summary": "For using GPIO on 96boards", "description": "==================================\nLinker Mezzanine card for 96boards\n==================================\nThis is repository containing useful Python code examples of how to\ninteract with the `Linker mezzanine card starter kit for 96Boards`_.\n\nCreated by `Barry Byford`_ with hopefully other contributors.\n\n.. image:: http://linksprite.com/wiki/images/0/0f/1-5.png\n :target: http://linksprite.com/wiki/images/0/0f/1-5.png\n :alt: Linker Mezzanine card for 96boards\n\n\nDocumentation\n=============\n\nThis documentation is available http://pythonhosted.org/libsoc_zero/\n\nIf there are issues with the documentation then get involved and help make it better at\nhttps://github.com/DBOpenSource/linker_starter_kit/\n\n\nDevelopment\n===========\n\nThis project is being developed on `GitHub`_ so join in:\n\n* Provide suggestions, report bugs and ask questions as `issues`_\n* Provide examples we can use\n* Contribute to the code\n\n\nContributors\n============\n\n- `Barry Byford`_ (project maintainer)\n\n\n\n.. _Linker mezzanine card starter kit for 96Boards: http://www.96boards.org/products/mezzanine/linker-mezzanine-starter-kit/\n.. _GitHub: https://github.com/DBOpenSource/linker_starter_kit\n.. _issues: https://github.com/DBOpenSource/linker_starter_kit/issues\n.. 
_Barry Byford: https://github.com/ukBaz", "author": "Barry Byford", "author_email": "barry_byford@yahoo.co.uk", "home_page": "https://github.com/DBOpenSource/linker_starter_kit", "download_url": "UNKNOWN", "keywords": "GPIO 96boards development", "platform": "UNKNOWN", "created": "2016-05-05T19:36:18.483733", "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: Education", "License :: OSI Approved :: BSD License", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Topic :: Education", "Topic :: Home Automation", "Topic :: Software Development :: Embedded Systems", "Topic :: System :: Hardware" ] } }, { "_index": "development-9b8152bf04", "_type": "_doc", "_id": "faq-migrations", "_score": 73.428925, "_source": { "name": "faq-migrations", "normalized_name": "faq-migrations", "version": [ "1.0" ], "latest_version": "1.0", "summary": "", "description": "# FAQ Alembic Git Migration\n\n> This module created as wrapper for Alembic and the main idea is \n> attaching real git branch, monitoring heads and auto-recommendations \n> for merging when developer creates new migration.\n\n# How to install\n```bash\npip install git+https://github.com/symstu/git-alembic.git\n```\n\n# Add CLI to your project\n> As standalone manager\n\n```python\nfrom faq_migrations.cli import migrations\n\n\nif __name__ == '__main__':\n migrations()\n```\n\n> As sub-group of click\n```python\nimport click\nfrom faq_migrations.cli import migrations\n\n\ncli = click.Group()\ncli.add_command(migrations)\n\n\nif __name__ == '__main__':\n cli()\n```\n\n> and run cli\n\n```bash\npython your_manager.py migrations --help\n```\n\n# List of commands:\n```\nUsage: manager.py migrations [OPTIONS] COMMAND [ARGS]...\n\nCreating of new migrations and upgrading database\n\nOptions:\n --help Show this message and exit.\n\nCommands:\n compare_history Compare local and remote history\n create Create new migration for current branch\n 
current Show current migration revision\n heads Show current heads\n history Show last migration, limit=20, upper=True\n init Initialize new alembic directory\n last_revision Show previous migration\n merge Merge branches or heads\n migrate Upgrade to head\n upgrade_migrations Show not yet applied migrations\n```\n\n# Config settings\n```python\nfrom faq_migrations.settings import config\n\n\n# Path to your directory with alembic.ini\nconfig.config_file_path = 'faq_migrations/migrations/' \n\n# Path to templates directory with alembic.ini and mako files\nconfig.template_path = 'faq_migrations/templates/'\n\n# Default template name\nconfig.template_name = 'git-generic'\n\n# Path to your directory with migrations\nconfig.alembic_dir = 'migrations/'\n\n# You can setup database url in this param or in alembic.ini.\n# This parameter has higher priority\nconfig.database_url = 'driver://username:pass@host:port/db_name'\n```\n> Before initializing new directory with migrations you must setup config \n> params.", "author": "Maksym Stukalo", "author_email": "stukalo.maksym@gmail.com", "maintainer": "", "maintainer_email": "", "home_page": "", "download_url": "", "keywords": "", "platform": "", "created": "2018-07-11T14:47:13.194604" } }, { "_index": "development-9b8152bf04", "_type": "_doc", "_id": "migrations4neo", "_score": 73.428925, "_source": { "name": "migrations4neo", "normalized_name": "migrations4neo", "version": [ "0.3", "0.2", "0.1" ], "latest_version": "0.3", "summary": "Easy neo4j migrations", "description": "", "author": "turkus", "author_email": "wojciechrola@wp.pl", "maintainer": "", "maintainer_email": "", "home_page": "https://github.com/turkus/migrations4neo", "download_url": "https://github.com/turkus/migrations4neo/tarball/0.1", "keywords": "neo4j,migrations", "platform": "UNKNOWN", "created": "2015-12-13T21:25:24.751025" } }, { "_index": "development-9b8152bf04", "_type": "_doc", "_id": "shpkpr", "_score": 71.99985, "_source": { "name": "shpkpr", 
"normalized_name": "shpkpr", "version": [ "1.0.0" ], "latest_version": "1.0.0", "summary": "shpkpr is a command-line tool designed to manage applications running on Marathon", "description": "===============================\nshpkpr\n===============================\n\n.. image:: https://img.shields.io/travis/shopkeep/shpkpr.svg\n :target: https://travis-ci.org/shopkeep/shpkpr\n\n.. image:: https://readthedocs.org/projects/shpkpr/badge/?version=latest\n :target: https://readthedocs.org/projects/shpkpr/?badge=latest\n :alt: Documentation Status\n\n\nshpkpr is a tool for controlling and observing applications/tasks running on Marathon and Chronos. shpkpr is designed to provide a simple command-line interface to Marathon and Chronos (similiar to the ``heroku`` command-line tool) for use both manually and with CI tools like jenkins.\n\n* Free software: MIT license\n* Documentation: https://shpkpr.readthedocs.org.\n\nFeatures\n--------\n\n* List/show detailed application info\n* Deploy applications (using [Jinja2](http://jinja.pocoo.org/docs/2.9/) templates)\n* Zero-downtime application deploys when used with [Marathon-LB](https://github.com/mesosphere/marathon-lb)\n* List/show detailed cron task info\n* Deploy cron tasks (using [Jinja2](http://jinja.pocoo.org/docs/2.9/) templates)", "author": "ShopKeep.com Inc.", "author_email": "developers@shopkeep.com", "home_page": "https://github.com/shopkeep/shpkpr", "download_url": "UNKNOWN", "keywords": "shpkpr mesos marathon chronos", "platform": "UNKNOWN", "created": "2017-05-03T15:17:53.040227", "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python 
:: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: Implementation :: PyPy" ] } }, { "_index": "development-9b8152bf04", "_type": "_doc", "_id": "wade", "_score": 61.81451, "_source": { "name": "wade", "normalized_name": "wade", "version": [ "0.0.1.dev4" ], "latest_version": "0.0.1.dev4", "summary": "Web Application Downtime Estimation", "description": "Web Application Downtime Estimation.\n\n\n", "author": "wizardbyron", "author_email": "wizard0530@gmail.com", "maintainer": "", "maintainer_email": "", "home_page": "https://github.com/wizardbyron/wade", "download_url": "", "keywords": "url,redirection,redirect,verify,test,tests", "platform": "", "created": "2018-07-30T09:05:56.235699", "classifiers": [ "Topic :: Utilities", "Topic :: Software Development :: Testing", "Topic :: Software Development :: Testing :: Traffic Generation", "Topic :: Internet :: WWW/HTTP", "Topic :: Internet :: WWW/HTTP :: Site Management", "Operating System :: MacOS", "Operating System :: POSIX :: Linux", "Programming Language :: Python", "License :: OSI Approved :: MIT License", "Intended Audience :: System Administrators", "Development Status :: 3 - Alpha", "Environment :: Console", "Programming Language :: Python :: 3" ] } }, { "_index": "development-9b8152bf04", "_type": "_doc", "_id": "django-migrations-plus", "_score": 59.68495, "_source": { "name": "django-migrations-plus", "normalized_name": "django-migrations-plus", "version": [ "0.1.0", "0.1.0.dev7.g3a49502" ], "latest_version": "0.1.0", "summary": "Provides a method to run raw SQL in migrations with multiple databases", "description": "django-migrations-plus\n======================\n\nMigrations Plus provides a method to run raw SQL in Django migrations with multiple DB connections.\n\nInstall\n-------\n\nUsing pip::\n \n $ pip install django-migrations-plus\n\nAPI\n-----\n``RunSQL(sql, reverse_sql=None, state_operations=None, db='default')``\n\nAllows running of arbitrary SQL on the database - 
useful for more advanced features of database backends that Django doesn’t support directly, like partial indexes.\n\nsql, and reverse_sql if provided, should be strings of SQL to run on the database. On most database backends (all but PostgreSQL), Django will split the SQL into individual statements prior to executing them. This requires installing the sqlparse Python library.\n\nThe state_operations argument is so you can supply operations that are equivalent to the SQL in terms of project state; for example, if you are manually creating a column, you should pass in a list containing an AddField operation here so that the autodetector still has an up-to-date state of the model (otherwise, when you next run makemigrations, it won’t see any operation that adds that field and so will try to run it again).\n\ndb should be a string with the name of the connection from your settings you want to run your SQL on.\n\nExample\n-------\n.. code-block:: python\n\n from django.db import migrations\n import migrations_plus\n\n\n class Migration(migrations.Migration):\n\n operations = [\n migrations_plus.RunSQL('DROP TABLE Students;') # Runs only against connection 'default'\n migrations_plus.RunSQL('DROP TABLE OtherStudents;', db='other') # Runs only against connection 'other'\n ]", "author": "Diego Lorden", "author_email": "diego.lorden@livelovely.com", "home_page": "UNKNOWN", "download_url": "UNKNOWN", "platform": "UNKNOWN", "created": "2014-10-27T22:50:13.614823", "classifiers": [ "Development Status :: 4 - Beta", "Framework :: Django", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 3.4", "Topic :: Database" ] } }, { "_index": "development-9b8152bf04", "_type": "_doc", "_id": "django-elastic-migrations", "_score": 59.68495, "_source": { "name": "django-elastic-migrations", "normalized_name": "django-elastic-migrations", "version": [ "0.8.2", 
"0.8.1", "0.8.0", "0.7.8.post2", "0.7.8.post1", "0.7.8", "0.7.7-1", "0.7.7" ], "latest_version": "0.8.2", "summary": "Manage Elasticsearch Indexes in Django", "description": "Django Elastic Migrations\n=========================\n\n`django-elastic-migrations`_ is a Django app for creating, indexing and changing schemas of Elasticsearch indexes.\n\n\n.. image:: https://travis-ci.com/HBS-HBX/django-elastic-migrations.svg?branch=master\n :target: https://travis-ci.com/HBS-HBX/django-elastic-migrations\n :alt: Build Status\n\n\n.. image:: https://codecov.io/gh/HBS-HBX/django-elastic-migrations/branch/master/graph/badge.svg\n :target: https://codecov.io/gh/HBS-HBX/django-elastic-migrations\n :alt: codecov\n\n.. _django-elastic-migrations: https://pypi.org/project/django-elastic-migrations/\n\nOverview\n--------\n\nElastic has given us basic python tools for working with its search indexes:\n\n* `elasticsearch-py`_, a python interface to elasticsearch's REST API\n* `elasticsearch-dsl-py`_, a Django-esque way of declaring Elasticsearch schemas,\n built upon `elasticsearch-py`_\n\nDjango Elastic Migrations adapts these tools into a Django app which also:\n\n* Provides Django management commands for ``list``\\ ing indexes, as well as performing\n ``create``, ``update``, ``activate`` and ``drop`` actions on them\n* Implements concurrent bulk indexing powered by python ``multiprocessing``\n* Gives Django test hooks for Elasticsearch\n* Records a history of all actions that change Elasticsearch indexes\n* Supports AWS Elasticsearch 6.0, 6.1 (6.2 TBD; see `#3 support elasticsearch-dsl 6.2`_)\n* Enables having two or more servers share the same Elasticsearch cluster\n\n.. _elasticsearch-py: https://github.com/elastic/elasticsearch-py\n.. _elasticsearch-dsl-py: https://github.com/elastic/elasticsearch-dsl-py\n.. 
_#3 support elasticsearch-dsl 6.2: https://github.com/HBS-HBX/django-elastic-migrations/issues/3\n\n\nModels\n^^^^^^\n\nDjango Elastic Migrations provides comes with three Django models:\n**Index**, **IndexVersion**, and **IndexAction**:\n\n* \n **Index** - a logical reference to an Elasticsearch index.\n Each ``Index`` points to multiple ``IndexVersions``, each of which contains\n a snapshot of that ``Index`` schema at a particular time. Each ``Index`` has an\n *active* ``IndexVersion`` to which all actions are directed.\n\n* \n **IndexVersion** - a snapshot of an Elasticsearch ``Index`` schema at a particular\n point in time. The Elasticsearch index name is the name of the *Index* plus the\n primary key id of the ``IndexVersion`` model, e.g. ``movies-1``. When the schema is\n changed, a new ``IndexVersion`` is added with name ``movies-2``, etc.\n\n* \n **IndexAction** - a record of a change that impacts an ``Index``, such as updating\n the index or changing which ``IndexVersion`` is active in an ``Index``.\n\nManagement Commands\n^^^^^^^^^^^^^^^^^^^\n\nUse ``./manage.py es --help`` to see the list of all of these commands.\n\nRead Only Commands\n~~~~~~~~~~~~~~~~~~\n\n\n* ``./manage.py es_list``\n\n * help: For each *Index*\\ , list activation status and doc\n count for each of its *IndexVersions*\n * usage: ``./manage.py es_list``\n\nAction Commands\n~~~~~~~~~~~~~~~\n\nThese management commands add an Action record in the database,\nso that the history of each *Index* is recorded.\n\n\n* ``./manage.py es_create`` - create a new index.\n* ``./manage.py es_activate`` - *activate* a new ``IndexVersion``. 
all\n updates and reads for that ``Index`` by will then go to that version.\n* ``./manage.py es_update`` - update the documents in the index.\n* ``./manage.py es_clear`` - remove the documents from an index.\n* ``./manage.py es_drop`` - drop an index.\n* ``./manage.py es_dangerous_reset`` - erase elasticsearch and reset the\n Django Elastic Migrations models.\n\nFor each of these, use ``--help`` to see the details.\n\nUsage\n^^^^^\n\nInstallation\n~~~~~~~~~~~~\n\n#. ``pip install django-elastic-migrations``; see `django-elastic-migrations`_ on PyPI\n#. Put a reference to this package in your ``requirements.txt``\n#. Ensure that a valid ``elasticsearch-dsl-py`` version is accessible, and configure\n the path to your configured Elasticsearch singleton client in your django settings:\n ``DJANGO_ELASTIC_MIGRATIONS_ES_CLIENT = \"tests.es_config.ES_CLIENT\"``.\n There should only be one ``ES_CLIENT`` instantiated in your application.\n#. Add ``django_elastic_migrations`` to ``INSTALLED_APPS`` in your Django\n settings file\n#. Add the following information to your Django settings file:\n ::\n\n DJANGO_ELASTIC_MIGRATIONS_ES_CLIENT = \"path.to.your.singleton.ES_CLIENT\"\n # optional, any unique number for your releases to associate with indexes\n DJANGO_ELASTIC_MIGRATIONS_GET_CODEBASE_ID = subprocess.check_output(['git', 'describe', \"--tags\"]).strip()\n # optional, can be used to have multiple servers share the same \n # elasticsearch instance without conflicting\n DJANGO_ELASTIC_MIGRATIONS_ENVIRONMENT_PREFIX = \"qa1_\"\n\n#. Create the ``django_elastic_migrations`` tables by running ``./manage.py migrate``\n#. 
Create an ``DEMIndex``:\n ::\n\n from django_elastic_migrations.indexes import DEMIndex, DEMDocType\n from .models import Movie\n from elasticsearch_dsl import Text\n\n MoviesIndex = DEMIndex('movies')\n\n\n @MoviesIndex.doc_type\n class MovieSearchDoc(DEMDocType):\n text = TEXT_COMPLEX_ENGLISH_NGRAM_METAPHONE\n\n @classmethod\n def get_queryset(self, last_updated_datetime=None):\n \"\"\"\n return a queryset or a sliceable list of items to pass to\n get_reindex_iterator\n \"\"\"\n qs = Movie.objects.all()\n if last_updated_datetime:\n qs.filter(last_modified__gt=last_updated_datetime)\n return qs\n\n @classmethod\n def get_reindex_iterator(self, queryset):\n return [\n MovieSearchDoc(\n text=\"a little sample text\").to_dict(\n include_meta=True) for g in queryset]\n\n\n#. Add your new index to DJANGO_ELASTIC_MIGRATIONS_INDEXES in settings/common.py\n\n#. Run ``./manage.py es_list`` to see the index as available:\n ::\n\n ./manage.py es_list\n\n Available Index Definitions:\n +----------------------+-------------------------------------+---------+--------+-------+-----------+\n | Index Base Name | Index Version Name | Created | Active | Docs | Tag |\n +======================+=====================================+=========+========+=======+===========+\n | movies | | 0 | 0 | 0 | Current |\n | | | | | | (not |\n | | | | | | created) |\n +----------------------+-------------------------------------+---------+--------+-------+-----------+\n Reminder: an index version name looks like 'my_index-4', and its base index name\n looks like 'my_index'. Most Django Elastic Migrations management commands\n take the base name (in which case the activated version is used)\n or the specific index version name.\n\n\n#. 
Create the ``movies`` index in elasticsearch with ``./manage.py es_create movies``:\n ::\n\n $> ./manage.py es_create movies\n The doc type for index 'movies' changed; created a new index version\n 'movies-1' in elasticsearch.\n $> ./manage.py es_list\n\n Available Index Definitions:\n +----------------------+-------------------------------------+---------+--------+-------+-----------+\n | Index Base Name | Index Version Name | Created | Active | Docs | Tag |\n +======================+=====================================+=========+========+=======+===========+\n | movies | movies-1 | 1 | 0 | 0 | 07.11.005 |\n | | | | | | -93-gd101 |\n | | | | | | a1f |\n +----------------------+-------------------------------------+---------+--------+-------+-----------+\n\n Reminder: an index version name looks like 'my_index-4', and its base index name \n looks like 'my_index'. Most Django Elastic Migrations management commands \n take the base name (in which case the activated version is used) \n or the specific index version name.\n\n#. Activate the ``movies-1`` index version, so all updates and reads go to it.\n ::\n\n ./manage.py es_activate movies\n For index 'movies', activating 'movies-1' because you said so.\n\n#. 
Assuming you have implemented ``get_reindex_iterator``, you can call\n ``./manage.py es_update`` to update the index.\n ::\n\n $> ./manage.py es_update movies\n\n Handling update of index 'movies' using its active index version 'movies-1'\n Checking the last time update was called: \n - index version: movies-1\n - update date: never \n Getting Reindex Iterator...\n Completed with indexing movies-1\n\n $> ./manage.py es_list\n\n Available Index Definitions:\n +----------------------+-------------------------------------+---------+--------+-------+-----------+\n | Index Base Name | Index Version Name | Created | Active | Docs | Tag |\n +======================+=====================================+=========+========+=======+===========+\n | movies | movies-1 | 1 | 1 | 3 | 07.11.005 |\n | | | | | | -93-gd101 |\n | | | | | | a1f |\n +----------------------+-------------------------------------+---------+--------+-------+-----------+\n\nDeployment\n^^^^^^^^^^\n\n\n* Creating and updating a new index schema can happen before you deploy.\n For example, if your app servers are running with the ``movies-1`` index activated, and you\n have a new version of the schema you'd like to pre-index, then log into another\n server and run ``./manage.py es_create movies`` followed by\n ``./manage.py es_update movies --newer``. This will update documents in all ``movies``\n indexes that are newer than the active one.\n* After deploying, you can run\n ``./manage.py es_activate movies`` to activate the latest version. Be sure to cycle your\n gunicorn workers to ensure the change is caught by your app servers.\n* During deployment, if ``get_reindex_iterator`` is implemented in such a way as to respond\n to the datetime of the last reindex date, then you can call\n ``./manage.py es_update movies --resume``, and it will index *only those documents that have\n changed since the last reindexing*. 
This way you can do most of the indexing ahead of time,\n and only reindex a portion at the time of the deployment.\n\nDjango Testing\n^^^^^^^^^^^^^^\n\n\n#. (optional) update ``DJANGO_ELASTIC_MIGRATIONS_ENVIRONMENT_PREFIX`` in\n your Django settings. The default test prefix is ``test_``. Every\n test will create its own indexes.\n ::\n\n if 'test' in sys.argv:\n DJANGO_ELASTIC_MIGRATIONS_ENVIRONMENT_PREFIX = 'test_'\n\n#. Override TestCase - ``test_utilities.py``\n\n .. code-block::\n\n from django_elastic_migrations import DEMIndexManager\n\n class MyTestCase(TestCase):\n\n def _pre_setup(self):\n DEMIndexManager.test_pre_setup()\n super(MyTestCase, self)._pre_setup()\n\n def _post_teardown(self):\n DEMIndexManager.test_post_teardown()\n super(MyTestCase, self)._post_teardown()\n\nExcluding from Django's ``dumpdata`` command\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nWhen calling `django's dumpdata command `_\\,\nyou likely will want to exclude the database tables used in this app:\n\n::\n\n from django.core.management import call_command\n params = {\n 'database': 'default',\n 'exclude': [\n # we don't want to include django_elastic_migrations in dumpdata, \n # because it's environment specific\n 'django_elastic_migrations.index',\n 'django_elastic_migrations.indexversion',\n 'django_elastic_migrations.indexaction'\n ],\n 'indent': 3,\n 'output': 'path/to/my/file.json'\n }\n call_command('dumpdata', **params)\n\nAn example of this is included with the\n`moviegen management command`_.\n\n.. _moviegen management command: https://github.com/HBS-HBX/django-elastic-migrations/blob/master/tests/management/commands/moviegen.py\n\nTuning Bulk Indexing Parameters\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nBy default, ``/.manage.py es_update`` will divide the result of \n``DEMDocType.get_queryset()`` into batches of size ``DocType.BATCH_SIZE``. \nOverride this number to change the batch size. 
\n\nThere are many configurable paramters to Elasticsearch's `bulk updater `_.\nTo provide a custom value, override ``DEMDocType.get_bulk_indexing_kwargs()``\nand return the kwargs you would like to customize.\n\nDevelopment\n-----------\n\nThis project uses ``make`` to manage the build process. Type ``make help``\nto see the available ``make`` targets.\n\nElasticsearch Docker Compose\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n``docker-compose -f local.yml up``\n\n`See docs/docker_setup for more info <./docs/docker_setup.rst>`_\n\nRequirements\n^^^^^^^^^^^^\nThis project uses `pip-tools`_. The ``requirements.txt`` files are generated\nand pinned to latest versions with ``make upgrade``:\n\n* run ``make requirements`` to run the pip install.\n\n* run ``make upgrade`` to upgrade the dependencies of the requirements to the latest\n versions. This process also excludes ``django`` and ``elasticsearch-dsl``\n from the ``requirements/test.txt`` so they can be injected with different\n versions by tox during matrix testing.\n\n.. _pip-tools: https://github.com/jazzband/pip-tools\n\n\nPopulating Local ``tests_movies`` Database Table With Data\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nIt may be helpful for you to populate a local database with Movies test\ndata to experiment with using ``django-elastic-migrations``. First,\nmigrate the database:\n\n``./manage.py migrate --run-syncdb --settings=test_settings``\n\nNext, load the basic fixtures:\n\n``./manage.py loaddata tests/100films.json``\n\nYou may wish to add more movies to the database. A management command\nhas been created for this purpose. Get a `Free OMDB API key here `_\\ ,\nthen run a query like this (replace ``MYAPIKEY`` with yours):\n\n.. code-block::\n\n $> ./manage.py moviegen --title=\"Inception\" --api-key=\"MYAPIKEY\"\n {'actors': 'Leonardo DiCaprio, Joseph Gordon-Levitt, Ellen Page, Tom Hardy',\n 'awards': 'Won 4 Oscars. 
Another 152 wins & 204 nominations.',\n 'boxoffice': '$292,568,851',\n 'country': 'USA, UK',\n 'director': 'Christopher Nolan',\n 'dvd': '07 Dec 2010',\n 'genre': 'Action, Adventure, Sci-Fi',\n 'imdbid': 'tt1375666',\n 'imdbrating': '8.8',\n 'imdbvotes': '1,721,888',\n 'language': 'English, Japanese, French',\n 'metascore': '74',\n 'plot': 'A thief, who steals corporate secrets through the use of '\n 'dream-sharing technology, is given the inverse task of planting an '\n 'idea into the mind of a CEO.',\n 'poster': 'https://m.media-amazon.com/images/M/MV5BMjAxMzY3NjcxNF5BMl5BanBnXkFtZTcwNTI5OTM0Mw@@._V1_SX300.jpg',\n 'production': 'Warner Bros. Pictures',\n 'rated': 'PG-13',\n 'ratings': [{'Source': 'Internet Movie Database', 'Value': '8.8/10'},\n {'Source': 'Rotten Tomatoes', 'Value': '86%'},\n {'Source': 'Metacritic', 'Value': '74/100'}],\n 'released': '16 Jul 2010',\n 'response': 'True',\n 'runtime': 148,\n 'title': 'Inception',\n 'type': 'movie',\n 'website': 'http://inceptionmovie.warnerbros.com/',\n 'writer': 'Christopher Nolan',\n 'year': '2010'}\n\nTo save the movie to the database, use the ``--save`` flag. Also useful is\nthe ``--noprint`` option, to suppress json. Also, if you add\n``OMDB_API_KEY=MYAPIKEY`` to your environment variables, you don't have\nto specify it each time:\n\n.. code-block::\n\n $ ./manage.py moviegen --title \"Closer\" --noprint --save\n Saved 1 new movie(s) to the database: Closer\n\nNow that it's been saved to the database, you may want to create a fixture,\nso you can get back to this state in the future.\n\n.. code-block::\n\n $ ./manage.py moviegen --makefixture=tests/myfixture.json\n dumping fixture data to tests/myfixture.json ...\n [...........................................................................]\n\nLater, you can restore this database with the regular ``loaddata`` command:\n\n.. 
code-block::\n\n $ ./manage.py loaddata tests/myfixture.json\n Installed 101 object(s) from 1 fixture(s)\n\nThere are already 100 films available using ``loaddata`` as follows:\n\n.. code-block::\n\n $ ./manage.py loaddata tests/100films.json\n\nRunning Tests Locally\n^^^^^^^^^^^^^^^^^^^^^\n\nRun ``make test``. To run all tests and quality checks locally,\nrun ``make test-all``.\n\nTo just run linting, ``make quality``. Please note that if any of the\nlinters return a nonzero code, it will give an ``InvocationError`` error\nat the end. See `tox's documentation for InvocationError`_ for more information.\n\nWe use ``edx_lint`` to compile ``pylintrc``. To update the rules,\nchange ``pylintrc_tweaks`` and run ``make pylintrc``.\n\n.. _tox's documentation for InvocationError: https://tox.readthedocs.io/en/latest/example/general.html#understanding-invocationerror-exit-codes\n\nCutting a New Version\n^^^^^^^^^^^^^^^^^^^^^\n\n* optional: run ``make update`` to update dependencies\n* bump version in `django_elastic_migrations/__init__.py `_.\n* update `CHANGELOG.rst `_.\n* ``make clean``\n* ``python3 setup.py sdist bdist_wheel``\n* ``twine check dist/django-elastic-migrations-*.tar.gz`` to see if there are any syntax mistakes before tagging\n* submit PR bumping the version\n* ensure test matrix is passing on travis and merge PR\n* pull changes to master\n* ``make clean``\n* ``python3 setup.py sdist bdist_wheel``\n* ``twine check dist/django-elastic-migrations-*.tar.gz`` to see if there are any syntax mistakes before tagging\n* ``twine upload -r testpypi dist/django-elastic-migrations-*.tar.gz``\n* `Check it at https://test.pypi.org/project/django-elastic-migrations/ `_\n* ``python3 setup.py tag`` to tag the new version\n* ``twine upload -r pypi dist/django-elastic-migrations-*.tar.gz``\n* `Update new release at https://github.com/HBS-HBX/django-elastic-migrations/releases `_\n\n\nChangelog\n---------\n\n0.8.2 (2018-11-20)\n^^^^^^^^^^^^^^^^^^\n* fix `#59 twine check error 
in 0.8.1 `_\n\n0.8.1 (2018-11-19)\n^^^^^^^^^^^^^^^^^^\n* fix `#50 add test coverage for es_list `_\n* fix `#58 ignore indexes with a dot in name in es_list --es-only and es_dangerous_reset `_\n\n0.8.0 (2018-11-13)\n^^^^^^^^^^^^^^^^^^\n* fix `#6 support Django 2 `_\n* fix `#43 remove es_deactivate `_\n* fix `#44 add django 1.10 and 1.11 to test matrix `_\n* fix `#45 remove support for python 2 `_\n* In practice, Python 2 may work, but it is removed from the test matrix and won't be updated\n\n0.7.8 (2018-11-13)\n^^^^^^^^^^^^^^^^^^\n* fix `#7 Convert Readme to rst for pypi `_\n* first release on PyPI\n* update project dependencies\n\n0.7.7 (2018-09-17)\n^^^^^^^^^^^^^^^^^^\n* fix `#41 stack trace when indexing in py3 `_\n\n0.7.6 (2018-09-11)\n^^^^^^^^^^^^^^^^^^\n* fix `#36 es_update --start flag broken `_\n\n0.7.5 (2018-08-20)\n^^^^^^^^^^^^^^^^^^\n* fix `#35 open multiprocessing log in context handler `_\n\n0.7.4 (2018-08-15)\n^^^^^^^^^^^^^^^^^^\n* fix `#33 error when nothing to resume using --resume `_\n\n0.7.3 (2018-08-14)\n^^^^^^^^^^^^^^^^^^\n* fix #31 es_update movies --newer --workers does not store worker information\n\n0.7.2 (2018-08-13)\n^^^^^^^^^^^^^^^^^^\n* fix #21 wrong batch update total using multiprocessing in 0.7.1\n* fix #23 KeyError _index_version_name in es_update --newer\n* address #25 use pks for queryset inside workers #29\n\n0.7.1 (2018-08-07)\n^^^^^^^^^^^^^^^^^^\n* fixed gh #8 es_dangerous_reset --es-only to sync database to ES\n* fixed gh #17 make es_dangerous_reset remove dem models\n* improved test coverage\n* added tests for ``es_create --es-only``\n* added ``IndexVersion.hard_delete()`` (not called by default)\n* added ``hard_delete`` flag to ``DropIndexAction``\n* added ``hard_delete`` flag to ``DEMIndexManager.test_post_teardown()``\n* updated ``__str__()`` of ``IndexAction`` to be more descriptive\n\n0.7.0 (2018-08-06)\n^^^^^^^^^^^^^^^^^^\n* fixed gh #5: \"add python 3 support and tests\"\n\n0.6.1 (2018-08-03)\n^^^^^^^^^^^^^^^^^^\n* 
fixed gh #9: \"using elasticsearch-dsl 6.1, TypeError in DEMIndex.save\"\n\n0.6.0 (2018-08-01)\n^^^^^^^^^^^^^^^^^^\n* Added test structure for py2 - GH #2\n* Renamed default log handler from ``django-elastic-migrations`` to ``django_elastic_migrations``\n\n0.5.3 (2018-07-23)\n^^^^^^^^^^^^^^^^^^\n* First basic release", "author": "Harvard Business School, HBX Department", "author_email": "pnore@hbs.edu", "maintainer": "", "maintainer_email": "", "home_page": "https://github.com/HBS-HBX/django-elastic-migrations", "download_url": "", "keywords": "Django Elasticsearch", "platform": "", "created": "2018-11-19T15:18:05.291293", "classifiers": [ "License :: OSI Approved :: MIT License", "Natural Language :: English", "Development Status :: 4 - Beta", "Intended Audience :: Developers", "Framework :: Django", "Programming Language :: Python :: 3", "Framework :: Django :: 1.9", "Programming Language :: Python :: 3.6", "Framework :: Django :: 1.10", "Framework :: Django :: 1.11", "Framework :: Django :: 2.0" ] } } ] }, "suggest": { "name_suggestion": [ { "text": "zero", "offset": 0, "length": 4, "options": [] }, { "text": "downtime", "offset": 5, "length": 8, "options": [] }, { "text": "migrations", "offset": 14, "length": 10, "options": [] } ] } } ```

In development, the exact match comes back first with a score of 196.8072, while the next result, migrations, scores only 95.39637, so I'm curious how this kind of query performs against the production Elasticsearch service.

One thought: the age of the reported package may be relevant. It was last published in 2017, and the reindexing mechanism switched to incremental index updates around 2018. Is it possible this package's search metadata isn't up to date? I tried to find evidence of a periodic reindex "sweep" but couldn't find anything concrete, and without being able to investigate the production index behavior, I'm a little stuck. 😁

xloem commented 2 years ago

Noting that my issue was merged into this one. I published a new package that does not come up in the search results when I search for its name (the package is also tagged with its name). The search returns only 2 items.

It seems possible that this issue is consolidating three different problems, or at least three different manifestations of problems, under one umbrella.

EDIT: Additional information: the first uploaded revision of the package had neither a description nor tags. These were added after the first revision, which I have unfortunately since deleted.

EDIT2: I tried this again 7 days later, and my particular package showed up fine. I imagine my specific case most likely just needed a reindex.

roniemartinez commented 1 year ago

OK, so my issue was marked as a duplicate, but mine is more specific than the other issues mentioned: the top result is the inverted name rather than the exact match.

When searching for latex2mathml, the first result is mathml2latex instead of latex2mathml.


PanderMusubi commented 8 months ago

As discussed in https://github.com/pypi/warehouse/issues/14738, perhaps when the search contains a single space, search for the phrase with the space intact (not as an OR) and also with the space replaced by a hyphen, then add canonical matches to the top of the results.
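To make the idea concrete, here is a minimal sketch (not Warehouse's actual code) of deriving canonical-name candidates from a query and pinning exact matches ahead of the relevance-ordered results. `canonicalize` follows the PEP 503 normalization rule; `candidate_names` and `pin_exact_matches` are hypothetical helpers, and `known_names` stands in for whatever project-name lookup the real service would use. The separator-free variant is an extra assumption, meant to cover the earlier "self-update" vs. "selfupdate" report.

```python
import re


def canonicalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def candidate_names(query: str) -> list[str]:
    """Return canonical names the query might refer to (hypothetical helper)."""
    query = query.strip()
    if not query:
        return []
    # Treat whitespace as a separator, so "zero downtime" yields "zero-downtime".
    hyphenated = canonicalize(re.sub(r"\s+", "-", query))
    # Assumption: also try the name without separators ("self-update" -> "selfupdate").
    squashed = hyphenated.replace("-", "")
    candidates = []
    for name in (hyphenated, squashed):
        if name and name not in candidates:
            candidates.append(name)
    return candidates


def pin_exact_matches(query, results, known_names):
    """Put exact canonical matches first, keeping the rest in their original order."""
    pinned = [name for name in candidate_names(query) if name in known_names]
    rest = [name for name in results if name not in pinned]
    return pinned + rest
```

With this, a search for latex2mathml whose index results come back as mathml2latex, latex2mathml would be reordered so the exact name is listed first, without touching the scoring inside the search index.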

PanderMusubi commented 8 months ago

Perhaps also show a notice after a search containing a space, explaining that the space is treated as an OR and that quoting the search avoids this.
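A rough sketch of when such a notice could be triggered, assuming (as described above, not verified against the search backend) that unquoted whitespace-separated terms are combined with OR. `should_show_or_notice` is a hypothetical helper name, not something that exists in Warehouse.

```python
def should_show_or_notice(query: str) -> bool:
    """Return True when an unquoted query contains several whitespace-separated terms."""
    stripped = query.strip()
    # A query wrapped in double quotes is treated as a single phrase.
    is_quoted = len(stripped) >= 2 and stripped[0] == '"' and stripped[-1] == '"'
    return not is_quoted and len(stripped.split()) > 1
```

The view rendering the results page could call this on the raw query string and, when it returns True, display a short hint about quoting.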