ajmacdonald closed this issue 3 years ago
What's the use case for having multiple documents? I thought DToC was oriented around a single volume.
Hi @sgsinclair. The CWRC-integrated version of DToC is likely to be used by people who want to group different sets of documents into a variety of corpora. For example, a project that collected all the journalistic contributions of Francophone women journalists in Canada may want to create one corpus gathering all contributions of a single journalist, and another including all the WWI-related contributions. The same document could be included in both corpora (if it was written by that journalist and deals with WWI), hence the need to support multi-document corpora as much as possible.
Thanks. If there's a need to combine multiple documents together then I think the user should do that, using TEICorpus, for instance. We could theoretically wrap documents for the user, but there are a lot of things that can go wrong, with namespaces, processing instructions, etc.
If it still seems important we can do it (I think), but the option would likely only appear in the DToC interface and any documentation should have blinking lights warning about the perils of automated document wrapping.
I thought it was initially designed to work with multiple documents--my memory may be faulty but your comment is a surprise to me, Stefan, since I have a pretty clear memory of the Voyant version working to combine, for instance, multiple Shakespeare plays from different files into a single DToC edition.
I'm not sure what TEICorpus is. When you say the DToC interface do you mean the CWRC interface?
So, for context, teiCorpus is an alternative root element: a TEI file could contain a teiCorpus root with multiple TEI children. But it's in the process of being deprecated; see the discussion at http://tei-l.970651.n3.nabble.com/Nesting-TEI-and-deprecation-of-teiCorpus-td4032022.html
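To make the wrapping idea concrete, here is a minimal sketch in Python (not how DToC or Trombone does anything) of putting several TEI documents under one teiCorpus root. The function name `wrap_in_tei_corpus` is hypothetical, and the corpus-level header is deliberately bare-bones, so it is probably not schema-valid as written. It also shows one of the pitfalls mentioned above: parsing each file this way silently drops processing instructions and comments that sit outside the root element.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements with a default namespace instead of an ns0: prefix.
ET.register_namespace("", TEI_NS)


def wrap_in_tei_corpus(documents, corpus_title):
    """Wrap TEI documents (given as XML strings) in a single teiCorpus root.

    Hypothetical helper for illustration only. Anything outside each
    document's root element (processing instructions, comments, DOCTYPE)
    is lost when the document is parsed.
    """
    corpus = ET.Element(f"{{{TEI_NS}}}teiCorpus")

    # Minimal corpus-level header; a real one needs more than a title.
    header = ET.SubElement(corpus, f"{{{TEI_NS}}}teiHeader")
    file_desc = ET.SubElement(header, f"{{{TEI_NS}}}fileDesc")
    title_stmt = ET.SubElement(file_desc, f"{{{TEI_NS}}}titleStmt")
    ET.SubElement(title_stmt, f"{{{TEI_NS}}}title").text = corpus_title

    for xml_text in documents:
        root = ET.fromstring(xml_text)
        # Refuse anything that is not a namespaced TEI root rather than guess.
        if root.tag != f"{{{TEI_NS}}}TEI":
            raise ValueError(f"expected a TEI root element, got {root.tag}")
        corpus.append(root)

    return ET.tostring(corpus, encoding="unicode")
```

Calling `wrap_in_tei_corpus([play_one_xml, play_two_xml], "Two plays")` would return one XML string with both documents as children of teiCorpus, where the two input variables stand for file contents read elsewhere.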
Currently the index can only be found if a single XML file was uploaded. This needs to be reworked to support the case where multiple XML files are uploaded.
https://github.com/sgsinclair/trombone/blob/262431223f202abc525e54bc5ca21a2cf63af69f/src/main/java/org/voyanttools/trombone/tool/corpus/DtocIndex.java#L99-L103
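As a rough illustration of the rework described here, the lookup could scan every uploaded document instead of assuming there is exactly one. This is a sketch in Python, not the Java code linked above, and it makes assumptions: the function name `find_index` is hypothetical, and so is the default path, which guesses that the index is marked up as a TEI `div` with `type="index"`.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Hypothetical location of the index; the real markup may differ.
INDEX_PATH = f".//{{{TEI_NS}}}div[@type='index']"


def find_index(documents, index_path=INDEX_PATH):
    """Return (position, element) for the first document containing an index.

    `documents` is a sequence of XML strings, one per uploaded file.
    Returns None when no document matches `index_path`.
    """
    for position, xml_text in enumerate(documents):
        root = ET.fromstring(xml_text)
        match = root.find(index_path)
        if match is not None:
            return position, match
    return None
```

Returning the document position along with the element keeps track of which upload the index came from, which matters once there is more than one file.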