DISSINET / InkVisitor

An open-source, browser-based front-end application for the collection of complex structured data from textual resources in history and the social sciences into a RethinkDB database for further analysis.
BSD 3-Clause "New" or "Revised" License

InkVisitor as a corpus manager with corpus queries #1892

Open davidzbiral opened 10 months ago

davidzbiral commented 10 months ago

Since our texts will be hosted in InkVisitor, it would be great if we could take an existing open corpus manager and hook it into InkVisitor. This would enable queries over the corpus, with a selection of Rs and/or Ts to search, without users needing a separate tool. Corpus queries would also help selective coding, i.e. finding relevant keywords, marking them with anchors, and CASTEMO-coding the passages that contain them.

Concerning the selection of full-texts through Ts and Rs: users could select either Ts or Rs. Selecting by Rs is simple. Selecting by Ts works as follows: read every R connected to any selected T (parent) or to any of its sub-Ts, list those Rs, and let the user uncheck a checkbox to exclude a specific R (e.g. if two versions of the same full-text are linked and the user wants to search in only one of them).
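The T-based selection described above could be sketched roughly as follows. This is only an illustration, not InkVisitor code: the `Territory` class, its fields, and the function names `collect_resources` and `apply_exclusions` are all hypothetical, and the real data model in RethinkDB will differ.

```python
from dataclasses import dataclass, field


@dataclass
class Territory:
    """Hypothetical stand-in for a T: an id, its sub-Ts, and linked R ids."""
    id: str
    children: list["Territory"] = field(default_factory=list)
    resource_ids: list[str] = field(default_factory=list)


def collect_resources(selected: list[Territory]) -> list[str]:
    """List every R linked to the selected Ts or any of their sub-Ts.

    Rs are returned once each, in the order they are first encountered
    during a depth-first walk of the territory tree.
    """
    found: dict[str, None] = {}   # dict keeps insertion order, drops duplicates
    visited: set[str] = set()     # guards against a T being reachable twice
    stack = list(reversed(selected))
    while stack:
        territory = stack.pop()
        if territory.id in visited:
            continue
        visited.add(territory.id)
        for resource_id in territory.resource_ids:
            found.setdefault(resource_id, None)
        # Push children reversed so they are visited in their listed order.
        stack.extend(reversed(territory.children))
    return list(found)


def apply_exclusions(candidates: list[str], unchecked: set[str]) -> list[str]:
    """Drop the Rs whose checkbox the user has unchecked."""
    return [r for r in candidates if r not in unchecked]


# Example: two versions of the same full-text (R2, R2b) linked under one sub-T.
sub_a = Territory("T1.1", resource_ids=["R2", "R2b"])
sub_b = Territory("T1.2", resource_ids=["R1", "R3"])
root = Territory("T1", children=[sub_a, sub_b], resource_ids=["R1"])

candidates = collect_resources([root])
search_scope = apply_exclusions(candidates, unchecked={"R2b"})
```

Here `candidates` is the list shown to the user with checkboxes, and `search_scope` is what would be passed on to the corpus query once the user has unchecked the duplicate version.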

@GideonK with @Ptrhnk: could you look for open options of this kind whose licence allows inclusion in InkVisitor, and find out what acknowledgement is needed? When: as a next step, after the basic text management is implemented.

davidzbiral commented 10 months ago

From Gideon: "The Kontext paper mentions that it can take decades to develop a good corpus manager, so we should definitely use something that is available." I.e. let us by no means develop our own; we need to find one that we can use.

davidzbiral commented 4 months ago

Develop with this in mind as a more distant future goal: #2115.