datalab-org / datalab

datalab is a place to store experimental data and the connections between them.
https://docs.datalab-org.io
MIT License
MongoDB free text search requires indexes to be set up #56

Closed ml-evs closed 2 years ago

ml-evs commented 3 years ago

We should create these text indexes on API start-up.

https://docs.mongodb.com/v5.0/core/index-text/
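A minimal sketch of what creating the index on start-up could look like, assuming PyMongo. The function name `ensure_text_index` and the index name `items_full_text` are hypothetical and not part of datalab. Creating an index whose specification already exists should be a no-op in MongoDB, so calling this on every start-up ought to be safe. If a different text index already exists on the collection, the call is expected to fail instead.

```python
def ensure_text_index(collection, index_name="items_full_text"):
    """Create a wildcard text index on `collection` and return its name.

    `collection` is expected to behave like a pymongo Collection.
    The function name and default index name are hypothetical.
    """
    # "$**" covers every string field; "text" is the value of pymongo.TEXT.
    return collection.create_index([("$**", "text")], name=index_name)


# Possible usage at API start-up (requires a running MongoDB instance):
#
#   from pymongo import MongoClient
#   from pydatalab.config import CONFIG
#
#   client = MongoClient(CONFIG.MONGO_URI)
#   ensure_text_index(client.datalabvue.items)
```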

ml-evs commented 3 years ago

A limitation of MongoDB Community Edition is that each collection can have only one text index. We could work around this with an auxiliary text field that concatenates all of the fields currently covered by the free-text search. With a sensible string format, we can keep the same display format in the web app.
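The auxiliary-field idea could be sketched roughly as follows. The field names in `SEARCH_FIELDS`, the separator, and the `search_text` field name are all assumptions made for illustration; the real set of searched fields would come from the existing free-text search.

```python
# Hypothetical list of fields to fold into the auxiliary search field.
SEARCH_FIELDS = ("item_id", "name", "description", "chemform")
SEPARATOR = " | "


def build_search_text(item, fields=SEARCH_FIELDS, separator=SEPARATOR):
    """Concatenate the non-empty searchable fields of `item` into one string."""
    parts = [str(item[field]) for field in fields if item.get(field)]
    return separator.join(parts)


def with_search_text(item):
    """Return a copy of `item` with the auxiliary `search_text` field set."""
    updated = dict(item)
    updated["search_text"] = build_search_text(item)
    return updated


# The single text index would then target only the auxiliary field, e.g.:
#   collection.create_index([("search_text", "text")])
```

With a fixed separator, the web app could split `search_text` back into its parts for display, at the cost of keeping the field in sync whenever an item is saved.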

jdbocarsly commented 3 years ago

Ah yes, I realized that in #47 I forgot to include the "script" that I used to build the index. Here it is:

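from pymongo import MongoClient

from pydatalab.config import CONFIG

# Connect using the URI from the pydatalab configuration.
client = MongoClient(CONFIG.MONGO_URI)
db = client.datalabvue

# Build a wildcard text index over every string field in the items collection.
# A list of (key, type) pairs is accepted by all PyMongo versions, whereas a dict is not.
response = db.items.create_index([("$**", "text")])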

Regarding #57, the index here is set up to run over all string fields in the db, so it will return results where the match is in the item_id, name, description, chemical formula, etc. I believe you can also fine-tune this by specifying weights for the different fields, though that may be an Atlas-only feature.
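Field weighting appears to be available through the `weights` option of ordinary text indexes, which PyMongo's `create_index` accepts as a keyword argument. The sketch below assumes that and uses hypothetical field names, weights, and index name. Because a collection can hold only one text index, an existing wildcard index would presumably have to be dropped before creating this one.

```python
def create_weighted_text_index(collection, weights, index_name="weighted_text"):
    """Create a text index over the fields in `weights` with per-field weights.

    `collection` is expected to behave like a pymongo Collection.
    `weights` maps field names to integer weights; higher weights score higher.
    """
    # Every weighted field becomes part of the compound text index.
    keys = [(field, "text") for field in weights]
    return collection.create_index(keys, weights=dict(weights), name=index_name)


# Possible usage (hypothetical fields and weights):
#   create_weighted_text_index(db.items, {"item_id": 10, "name": 5, "description": 1})
```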

jdbocarsly commented 3 years ago

There is apparently something wrong with this script on the server: the text index only covers the name field, not all string fields as it should.

ml-evs commented 2 years ago

> There is apparently something wrong with this script on the server: the text index only covers the name field, not all string fields as it should.

This is #65; the indexes themselves seem to be fine.