DISSINET / InkVisitor

An open-source, browser-based front-end application for the collection of complex structured data from textual resources in history and the social sciences into a RethinkDB database for further analysis.
BSD 3-Clause "New" or "Revised" License

Discuss how to make InkVisitor into a full-text manager and (actual) annotation tool #1531

Closed davidzbiral closed 1 year ago

davidzbiral commented 1 year ago

Currently, InkV does not store or manage full texts, and the statement-level Text field is edited manually. But we absolutely need this feature.
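To make the idea more concrete, here is a minimal sketch of how a statement's Text field could be derived from a stored full text through character offsets instead of being typed by hand. All names (`FullText`, `TextAnchor`, `resolveAnchor`) and the sample sentence are hypothetical and not taken from the InkVisitor codebase; the offset-based design is only one possible approach, assumed here for illustration.

```typescript
// Hypothetical types for illustration only; not part of InkVisitor.
interface FullText {
  id: string;
  content: string; // the stored corpus text
}

interface TextAnchor {
  statementId: string;
  fullTextId: string;
  start: number; // inclusive character offset into FullText.content
  end: number; // exclusive character offset
}

// Return the passage a statement points to, read from the stored full text.
function resolveAnchor(text: FullText, anchor: TextAnchor): string {
  if (anchor.fullTextId !== text.id) {
    throw new Error("anchor points to a different full text");
  }
  const inRange =
    anchor.start >= 0 &&
    anchor.start <= anchor.end &&
    anchor.end <= text.content.length;
  if (!inRange) {
    throw new Error("anchor offsets are out of range");
  }
  return text.content.slice(anchor.start, anchor.end);
}

// Made-up sample data.
const sampleText: FullText = {
  id: "T1",
  content: "Petrus came to the house. He greeted the heretics.",
};
const passage = "He greeted the heretics.";
const start = sampleText.content.indexOf(passage);
const sampleAnchor: TextAnchor = {
  statementId: "S2",
  fullTextId: "T1",
  start,
  end: start + passage.length,
};

console.log(resolveAnchor(sampleText, sampleAnchor));
```

Keeping the link outside the text (stand-off) means the corpus itself stays untouched, but the offsets would have to be updated whenever the full text is edited.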

"There will be a thorough development in this, because it is almost certain that InkV will need to become a full-text manager too, and so it will eventually really become what people read it as - an annotation tool, without losing any strength, but rather acquiring further strengths - storing the corpus directly in it, and connecting the Text field of Statement with the actual full text through markup in the background."

New additions:

tomaham commented 1 year ago

@davidzbiral We discussed this partially during the prep for hck2 - there was an idea for a solution in which the full text would be part of the territory-statement composition, but you/we rejected it, if I recall correctly, because of the complexities involved.

I would hold our horses and really discuss it first. So can we change the title of this issue to "Discuss..." from "Make...." ?

davidzbiral commented 1 year ago

To be sure, I only rejected it as something that could realistically be done before Hackathon 2. I absolutely think we should move towards having full texts in InkV in the longer run. In any case, don't worry - the focal points of 1.4, and whether this is one of them, will be properly discussed. Name changed.

davidzbiral commented 1 year ago

Moving some info from one (previously too focused) milestone here so that it does not get lost:

Highly relevant for the DH community and digital history, as well as for qualitative sociology (CAQDAS users). It could even go beyond support for segmented TXT files, for instance by accepting TEI/XML and ignoring its encoding (because TEI/XML contains annotation that is, rather, an alternative to the knowledge graph constructed in InkV).

Important for:

Concerning the relevance of this area: for example, a review from the DH community of a CASTEMO tutorial proposal says: "It's not clear what kinds of text can be used as input to Castemo (could it for example handle a standard TEI document, or output from treetagger?), or whether it assumes that the text will be entered by hand."
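As a rough illustration of what "ignoring the coding" of a TEI document could mean on import, the sketch below drops all tags and keeps only the text. It is a deliberately naive, regex-based assumption rather than a real TEI parser: it does not decode entities, it keeps any `teiHeader` text, and it would break on attribute values containing `>`. The function name `stripMarkup` and the sample fragment are made up for this example.

```typescript
// Naive sketch: remove markup and keep only the text content.
function stripMarkup(xml: string): string {
  return xml
    .replace(/<!--[\s\S]*?-->/g, "") // drop XML comments
    .replace(/<[^>]+>/g, "") // drop element tags, keep the text between them
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim();
}

// Made-up TEI-like fragment.
const teiFragment =
  '<p>On that day <persName ref="#petrus">Petrus</persName> came to <placeName>Tolosa</placeName>.</p>';

console.log(stripMarkup(teiFragment));
// prints "On that day Petrus came to Tolosa."
```

A production import would more likely use a proper XML parser and decide explicitly which elements (header, notes, apparatus) to skip.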