Closed by BalduinLandolt 2 years ago
I can see this quickly getting out of hand if we store every search operation as a subcorpus in the data handler. How about a corpus-building pipeline that allows the results of different search operations to be combined, e.g. via a little plus button at the bottom that adds the results to a corpus in the session state? The corpus could then be saved as a file/pickle and loaded back into the handler later, so it's not permanently in memory. If that sounds like what you had in mind, I'll get started on it.
I'm not sure we really want to make the subcorpora persistent, at least at first... And if they only last for the runtime, we don't need to worry about things getting out of hand right away. This would allow us to implement a nice prototype that we can then discuss with team meckern/product owners. ;)
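To make the pipeline idea concrete, here is a minimal sketch. It assumes search results are collections of manuscript IDs and uses a plain dict as a stand-in for the session state; the names `add_to_corpus`, `save_corpus` and `load_corpus` are hypothetical and not part of the existing data handler.

```python
import pickle
import tempfile
from pathlib import Path


def add_to_corpus(session_state: dict, name: str, results) -> set:
    """Merge the results of one search into the named corpus (the 'plus button')."""
    corpora = session_state.setdefault("corpora", {})
    corpus = corpora.setdefault(name, set())
    corpus.update(results)
    return corpus


def save_corpus(session_state: dict, name: str, path: Path) -> None:
    """Pickle a corpus to disk so it does not have to stay in memory."""
    with open(path, "wb") as fh:
        pickle.dump(session_state["corpora"][name], fh)


def load_corpus(session_state: dict, name: str, path: Path) -> set:
    """Load a pickled corpus back into the session state."""
    with open(path, "rb") as fh:
        corpus = pickle.load(fh)
    session_state.setdefault("corpora", {})[name] = corpus
    return corpus


# Usage: combine two searches, save the result, then restore it in a fresh session.
state = {}
add_to_corpus(state, "my corpus", ["ms-1", "ms-2"])
add_to_corpus(state, "my corpus", ["ms-2", "ms-3"])

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "my_corpus.pickle"
    save_corpus(state, "my corpus", target)
    restored = load_corpus({}, "my corpus", target)
```

Storing IDs rather than full records keeps the pickles small, but that is a design assumption. Note also that pickle files should only be loaded from trusted sources.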
(also, I dislike the term "subcorpus" more and more. maybe we could come up with something better... maybe "group" or so? Do you have better ideas?)
The way I envisioned it was roughly as follows:
- a "store results as group" button that then lets you enter a name for the group
- a "group" page, where users get a list for each group there is, plus options for removing and editing groups

Does this make sense to you?
In terms of architecture, I think it should be a class that holds
The thing I'm least sure about is which entities we allow in groups: only manuscripts, or also persons and texts?
Long story short: let me know if you plan on working on that, or if I should! Would be nice to get this done as soon as possible...
Groupings, as retrieved by searches, should be stored in subcorpora.
All set-theory operations (union, intersection, ...) should be possible on subcorpora.
Metadata should be retrievable on the basis of subcorpora.
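The three requirements above could be sketched roughly as follows. This is only an illustration under assumptions: the class name `Group`, its fields, and the `metadata` lookup dict are all hypothetical, and members are assumed to be string IDs (e.g. manuscript IDs).

```python
from dataclasses import dataclass


@dataclass
class Group:
    """A named subcorpus: a set of entity IDs retrieved by one or more searches."""
    name: str
    members: frozenset = frozenset()

    def __post_init__(self):
        # Accept any iterable of IDs (e.g. a list of search results).
        self.members = frozenset(self.members)

    def __or__(self, other: "Group") -> "Group":
        """Union: everything in either group."""
        return Group(f"({self.name} | {other.name})", self.members | other.members)

    def __and__(self, other: "Group") -> "Group":
        """Intersection: only what both groups share."""
        return Group(f"({self.name} & {other.name})", self.members & other.members)

    def __sub__(self, other: "Group") -> "Group":
        """Difference: what is in this group but not in the other."""
        return Group(f"({self.name} - {other.name})", self.members - other.members)

    def metadata(self, lookup: dict) -> dict:
        """Return the metadata entries for this group's members.

        `lookup` is a hypothetical mapping from ID to a metadata record.
        """
        return {m: lookup[m] for m in sorted(self.members) if m in lookup}


# Usage: two searches stored as groups, then combined.
a = Group("search A", ["ms-1", "ms-2"])
b = Group("search B", ["ms-2", "ms-3"])
shared = a & b
all_hits = a | b
```

Using Python's set operators keeps combined groups as ordinary `Group` objects, so they can be stored, renamed or combined further like any other group.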