Open jqnatividad opened 3 years ago
:+1: to using the data dictionary interface to enable indexes and having the per-column indexes disabled by default
Let's:
datastore_create
to default to not add any indexes when the index
list of fields is empty or not provided.
CKAN version 2.9.1
Describe the bug The postgres datastore creates several indices when a resource is uploaded.
A unique index for the
_id
internal field, an FTS index using the_full_text
internal field (which stores all the column values in one field for FTS searches, effectively doubling the width of the table), and one index per text column.Though this is great for ad-hoc queries using
datastore_search
anddatastore_search_sql
, and for interactive filtering on the UI, it greatly increases the datastore's storage requirements.For example, a table with 1.37 million rows and 9 columns is:
_id
, 1 fts index, and 6 indices for the 6 text columns. It did not create indices for the one timestamp and two numeric columns)I got this info while testing the expanded
datastore_info
PR.Expected behavior The team should consider giving finer granular control to the CKAN administrator beyond specifying the index method.
Some indexing "knobs" to consider:
datastore_info
), and let the DB/CKAN administrator use robust tools like PgAdmin to manage datastore indices manually. This is consistent with the keeping the Datastore API "as thin as possible to allow you to use the features you would expect from a powerful database management system."