readthedocs / readthedocs.org

The source code that powers readthedocs.org
https://readthedocs.org/
MIT License

Search: remove django-elasticsearch-dsl dependency #10730

Open stsewd opened 1 year ago

stsewd commented 1 year ago

What's the problem this feature will solve?

django-elasticsearch-dsl is useful if you have an exact mapping between a model and one or several indexes in ES. But now that we are no longer tracking all HTML files in the DB, there is no need for that package: we never made use of the signals feature, we always have one index per model, and we index just two types of documents, HTMLFile and Project.

Our code base could be simplified a lot if we just used https://elasticsearch-py.readthedocs.io/ directly (django-es-dsl is just an abstraction of that package). We could also delete several fields from the HTMLFile/ImportedFile model that are used only for search.

Describe the solution you'd like

Just rely on the features that https://elasticsearch-py.readthedocs.io/ provides, which are really similar to what we have with django-es-dsl, since django-es-dsl mainly subclasses the other package to make it work with Django models.
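
As a rough sketch of what indexing and searching without a Django model could look like, the snippet below builds plain dictionaries that the low-level client accepts. The index name, field names, and the `bulk_actions` / `search_body` helpers are hypothetical and only for illustration; they are not the ones in our code base.

```python
from typing import Any, Dict, Iterable, Iterator

# Hypothetical index name and fields, for illustration only.
INDEX_NAME = "page"

# Mapping described as a plain dict instead of a Django-bound document class.
PAGE_MAPPING: Dict[str, Any] = {
    "properties": {
        "project": {"type": "keyword"},
        "version": {"type": "keyword"},
        "path": {"type": "keyword"},
        "title": {"type": "text"},
        "content": {"type": "text"},
    }
}


def bulk_actions(
    pages: Iterable[Dict[str, Any]], index: str = INDEX_NAME
) -> Iterator[Dict[str, Any]]:
    """Yield one action per page, in the dict format used by elasticsearch.helpers.bulk."""
    for page in pages:
        yield {
            "_op_type": "index",
            "_index": index,
            # Deterministic ID, so re-indexing a page overwrites the old document.
            "_id": f"{page['project']}:{page['version']}:{page['path']}",
            "_source": page,
        }


def search_body(text: str, project: str) -> Dict[str, Any]:
    """Build a full-text query restricted to a single project."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"multi_match": {"query": text, "fields": ["title^2", "content"]}}
                ],
                "filter": [{"term": {"project": project}}],
            }
        }
    }
```

With a client instance, the actions would be passed to `elasticsearch.helpers.bulk(client, bulk_actions(pages))`. None of this needs a model row per document; the pages can come from wherever the HTML files are read.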

Alternative solutions

Keep using django-es-dsl, which requires us to keep a model, with attributes we don't need, for each document that we have in ES.

Additional context

miguelgrinberg commented 6 months ago

Hello from Elastic! I wanted to clarify something that may not be well understood.

The https://django-elasticsearch-dsl.readthedocs.io/ package is a wrapper around https://elasticsearch-dsl.readthedocs.io/, which Elastic maintains as an official high-level client.

You mention above that you would like to remove django-elasticsearch-dsl and go directly to https://elasticsearch-py.readthedocs.io/, which is the official low-level Elasticsearch client we maintain.

An alternative option that you have is to remove django-elasticsearch-dsl but keep using https://elasticsearch-dsl.readthedocs.io/. With this you would remove the association between Django models and Elasticsearch, but you can continue using a class-based approach to searching your Elasticsearch indexes.

If you have any questions about the two official Elastic clients, please reach out!

stsewd commented 6 months ago

@miguelgrinberg hi! thanks for the clarification. Yeah, we would just use elasticsearch-dsl.