CentreForDigitalHumanities / I-analyzer

The great textmining tool that obviates all others
https://ianalyzer.hum.uu.nl
MIT License

Index version management from interface #1007

Open lukavdplas opened 2 years ago

lukavdplas commented 2 years ago

#1004 made me realise that we have versioned index names in production, so indexing corpora from the interface (#985) should account for version management.

The minimal implementation would be that curators do not see the different versions at all, and the application just quietly updates the alias. It would be good to have an environment setting for production (i.e. versioned index names and aliases) vs. development (none of that, clear and overwrite the index with every update) - I don't think this ever needs to be set per corpus or, as we do now, per indexing action.
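
To make the "quietly updates the alias" idea concrete, here is a rough sketch. It assumes versioned indices are named `<corpus>-<n>` and that the alias is the corpus name itself; neither is confirmed above. `next_index_name` and `alias_swap_actions` are hypothetical helpers, not existing I-analyzer functions. The returned dict follows the `actions` body that Elasticsearch's update-aliases API accepts, so removing the old alias and adding the new one happens in one request.

```python
import re
from typing import Dict, List, Optional


def next_index_name(corpus: str, existing: List[str], versioned: bool) -> str:
    """Pick the name of the index to write to.

    Development (versioned=False): reuse the corpus name, so the index
    is cleared and overwritten on every update.
    Production (versioned=True): create a new '<corpus>-<n>' name
    (assumed naming scheme).
    """
    if not versioned:
        return corpus
    pattern = re.compile(rf'^{re.escape(corpus)}-(\d+)$')
    versions = [
        int(match.group(1))
        for name in existing
        if (match := pattern.match(name))
    ]
    return f'{corpus}-{max(versions, default=0) + 1}'


def alias_swap_actions(alias: str, new_index: str,
                       current: Optional[str]) -> Dict:
    """Build a request body that moves `alias` from `current` to `new_index`."""
    actions = []
    if current is not None:
        actions.append({'remove': {'index': current, 'alias': alias}})
    actions.append({'add': {'index': new_index, 'alias': alias}})
    return {'actions': actions}
```

With a single environment setting supplying `versioned`, neither the corpus nor the individual indexing action needs to carry that choice.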

In this case, the older index versions could be useful when a curator contacts us about an issue. Still, it would be nice if they could restore older versions themselves. This would save unnecessary duplicates.

The indexing menu (step 3 in #982) could show a list of all indices matching the corpus name, with the option to delete inactive indices, or switch which version is currently active.
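
A possible shape for the data behind that list, as a sketch only. It assumes the same `<corpus>-<n>` naming and that the active version is the one carrying the corpus name as alias. The input is a plain mapping of index name to its aliases (which the backend would have to collect from Elasticsearch); `IndexVersion`, `list_versions` and `deletable` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class IndexVersion:
    name: str
    version: int
    active: bool  # True if the corpus alias currently points here


def list_versions(corpus: str,
                  index_aliases: Dict[str, Set[str]]) -> List[IndexVersion]:
    """Return all '<corpus>-<n>' indices, oldest first."""
    pattern = re.compile(rf'^{re.escape(corpus)}-(\d+)$')
    found = []
    for name, aliases in index_aliases.items():
        match = pattern.match(name)
        if match:
            found.append(
                IndexVersion(name, int(match.group(1)), corpus in aliases)
            )
    return sorted(found, key=lambda v: v.version)


def deletable(versions: List[IndexVersion]) -> List[str]:
    """Only inactive versions may be offered for deletion."""
    return [v.name for v in versions if not v.active]
```

Sorting on the numeric version rather than the name keeps `times-10` after `times-2` in the menu.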

However, old indices may not be compatible with the current corpus definition. Ideally, the application will save a "snapshot" of the corpus at the time of indexing (doable with the export option #981), and restoring an old index also means restoring the (relevant) corpus settings.

lukavdplas commented 3 weeks ago

These issues cover preparations to database models, etc.

When these are done, we can create an API and an interface in the frontend. This should be much more streamlined; proper index management requires admin privileges.

@JeltevanBoheemen and I discussed this and made a rough outline. We're envisioning these functions for the user:

If a corpus has an index, other steps in the form will still be available, but fields that affect the index will be marked with a warning sign. (Or something like that.) When the user changes those fields and hits save, they'll get a confirmation window. The existing index will become invalid, and they'll have to create a new index, or undo their changes, before they can make the corpus public again.
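
A sketch of the check behind the warning sign and the confirmation window. Which settings actually affect the index is not decided here, so `INDEX_AFFECTING_FIELDS` is a placeholder, and the function names are hypothetical.

```python
from typing import Any, Dict, List

# Placeholder: the real set of index-affecting settings is still to be decided.
INDEX_AFFECTING_FIELDS = ('fields', 'languages', 'source_files')


def index_affecting_changes(saved: Dict[str, Any],
                            edited: Dict[str, Any]) -> List[str]:
    """Names of index-affecting settings that differ between two versions."""
    return [
        key for key in INDEX_AFFECTING_FIELDS
        if saved.get(key) != edited.get(key)
    ]


def needs_confirmation(has_index: bool, saved: Dict[str, Any],
                       edited: Dict[str, Any]) -> bool:
    """Ask for confirmation only if saving would invalidate an existing index."""
    return has_index and bool(index_affecting_changes(saved, edited))


def can_publish(has_index: bool, index_valid: bool) -> bool:
    """A corpus can be made public only with a valid index."""
    return has_index and index_valid
```

Undoing the changes makes `index_affecting_changes` return an empty list again, which matches the "or undo their changes" route back to a publishable corpus.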

Notes: