ArchitecturalKnowledgeAnalysis / EmailIndexer

Utility for generating Lucene indexes for collections of emails.
MIT License

Feature request: Improved tag management #2

Closed wmeijer221 closed 2 years ago

wmeijer221 commented 2 years ago

Currently, the tags you can select from are the ones already present in the dataset (i.e. the ones you have already applied). I believe that managing the set of possible tags separately from the applied tags would improve the usability of the tool. At the moment, I'm forced to hack this behavior by assigning all possible classes to the very first email in the dataset (see image). In the long term this seems unreasonable, as you'll never be able to properly classify this "dummy email".

Separating the two would probably also reduce the amount of redundant data in the database, since tags are currently stored once for every time they are applied. This system could be replaced by a proper linking table that links each mail id to its respective tag ids.
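
To make the linking-table idea concrete, here is a rough sketch using Python's built-in `sqlite3` module. The table names (`email`, `tag`, `email_tag`), the column names, and the helper functions are all hypothetical and only illustrate the idea; the tool's real schema and database may look quite different.

```python
import sqlite3


def create_schema(conn):
    """Create hypothetical tables: tags exist on their own, apart from emails."""
    conn.executescript("""
        CREATE TABLE email (
            id INTEGER PRIMARY KEY,
            subject TEXT NOT NULL
        );
        CREATE TABLE tag (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        -- Linking table: one row per (email, tag) pair, no repeated tag names.
        CREATE TABLE email_tag (
            email_id INTEGER NOT NULL REFERENCES email(id),
            tag_id INTEGER NOT NULL REFERENCES tag(id),
            PRIMARY KEY (email_id, tag_id)
        );
    """)


def define_tag(conn, name):
    """Make a tag selectable without applying it to any email."""
    conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (name,))


def available_tags(conn):
    """All tags that can be selected, whether or not they are in use."""
    return [row[0] for row in conn.execute("SELECT name FROM tag ORDER BY name")]


def apply_tag(conn, email_id, tag_name):
    """Link an email to an existing tag; applying it twice has no extra effect."""
    conn.execute(
        "INSERT OR IGNORE INTO email_tag (email_id, tag_id) "
        "SELECT ?, id FROM tag WHERE name = ?",
        (email_id, tag_name),
    )


def tags_for_email(conn, email_id):
    """Names of the tags applied to one email."""
    rows = conn.execute(
        "SELECT t.name FROM tag t "
        "JOIN email_tag et ON et.tag_id = t.id "
        "WHERE et.email_id = ? ORDER BY t.name",
        (email_id,),
    )
    return [row[0] for row in rows]
```

With something like this, the list of selectable tags comes from the `tag` table alone, so no "dummy email" is needed to keep unused tags around.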

image

andrewlalis commented 2 years ago

I originally intended to make the schema as simple as I possibly could, in the hopes that it could be used in other applications if needed, without a lot of reliance on my own code. However, you bring up a valid point; there currently isn't anything special about tags, but it would make more sense to extend this schema and turn tags into first-class entities.

I will see if I can address this issue this afternoon/evening, but I can't promise a quick fix because it requires preparing a strategy for migrating old datasets to a new schema.
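
For illustration only, a migration of this kind might look roughly like the sketch below, written with Python's built-in `sqlite3` module. Everything named here is an assumption: the legacy table `legacy_email_tag(email_id, tag)`, the new `tag` and `email_tag` tables, and the use of SQLite's `PRAGMA user_version` as a dataset version marker. None of it reflects the tool's actual schema or database.

```python
import sqlite3

# Hypothetical version number for the dataset layout with first-class tags.
TARGET_VERSION = 2


def migrate(conn):
    """Move per-application tag names into a tag table plus a linking table.

    Returns True if a migration ran, False if the dataset was already current.
    """
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    if version >= TARGET_VERSION:
        return False

    conn.executescript(f"""
        BEGIN;
        CREATE TABLE tag (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE email_tag (
            email_id INTEGER NOT NULL,
            tag_id INTEGER NOT NULL REFERENCES tag(id),
            PRIMARY KEY (email_id, tag_id)
        );
        -- Each distinct legacy tag name becomes exactly one tag row.
        INSERT INTO tag (name)
            SELECT DISTINCT tag FROM legacy_email_tag;
        -- Re-express every legacy row as an (email, tag id) link.
        INSERT OR IGNORE INTO email_tag (email_id, tag_id)
            SELECT l.email_id, t.id
            FROM legacy_email_tag l
            JOIN tag t ON t.name = l.tag;
        DROP TABLE legacy_email_tag;
        PRAGMA user_version = {TARGET_VERSION};
        COMMIT;
    """)
    return True
```

Recording a version number in the dataset lets the same check decide whether an old dataset still needs upgrading, so running the migration a second time does nothing.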

andrewlalis commented 2 years ago

Closing this since I've added schema changes. Still need to add dataset versioning and a strategy for upgrading old datasets, but I'll track that via separate issues.