open-contracting / pelican-backend

Measures the quality of OCDS data
https://pelican-backend.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Extract `data_item.data ->> 'date'` to `release_date` column #131

Open jpmckinney opened 1 month ago

jpmckinney commented 1 month ago

Similar to the index added in Kingfisher Process.

Affects pelican-backend/workers/extract/dataset_filter.py and pelican-frontend/backend/api/views.py

Older note in 001_base.sql:

-- data_item is the largest and most frequently queried table, so rarely used indexes are avoided. The queries in
-- workers/extract/dataset_filter.py and pelican-frontend/backend/api/views.py are rarely run, so we don't add the
-- indexes for data->>'date', data->'buyer'->>'name' and data->'tender'->'procuringEntity'->>'name' (text_pattern_ops).

The query in views.py is now called for each job in the registry, so it might make sense to add the index after all. Per the comment, though, we should first check whether populating the index slows down data processing.
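
For reference, the index under discussion could look roughly like the sketch below. It assumes PostgreSQL and that `data_item.data` is a `jsonb` column; the index name is hypothetical and not taken from the repository.

```sql
-- Hypothetical index name; assumes data_item.data is jsonb.
-- The extra parentheses are required because the indexed value is an
-- expression rather than a plain column.
CREATE INDEX data_item_data_date_idx ON data_item ((data ->> 'date'));
```

Timing the load of a large dataset with and without this index would show whether populating it slows processing noticeably.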

jpmckinney commented 1 month ago

Alternatively, add the release date as a separate column.
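
A minimal sketch of that alternative, assuming PostgreSQL, a `jsonb` `data` column and a `text` type for the new column (the index name is hypothetical). The value is kept as text here on the assumption that the dates are stored as strings and may not all be well-formed.

```sql
-- Assumption: release_date is stored as text, mirroring data ->> 'date'.
ALTER TABLE data_item ADD COLUMN release_date text;

-- Backfill existing rows from the JSON document.
UPDATE data_item SET release_date = data ->> 'date';

-- Hypothetical index name; a plain column index, no expression needed.
CREATE INDEX data_item_release_date_idx ON data_item (release_date);
```

With this approach, whichever code writes to `data_item` would also need to set `release_date` for new rows, and the queries in dataset_filter.py and views.py would filter on the column instead of the JSON expression.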