Closed by ctschroeder 9 years ago
Can you double check the ANNIS metadata here? I'm seeing 1Cor_01 returning a corpus metadata value of "sahidica.mark": http://corpling.uis.georgetown.edu/annis-service/annis/meta/doc/sahidica.1corinthians/1Cor_01
oh yeah lookie lookie there
wait -- reassigned too soon -- what's going on with the Mark chapters 12-16 of the unedited sahidica.nt corpus? Why are they there?
What I think is happening is that multiple corpora contain documents with the same name, and then they both get each other's metadata. For example, both sahidica.nt and sahidica.mark contain a document called Mark_16. But since they come from different corpora, these documents should be kept distinct, and their metadata is not the same (the sahidica.nt version was not annotated by Rebecca Krawiec, but sahidica.mark was).
The ingested documents should be stored with the entire path they were brought from (sahidica.mark > Mark > Mark_16).
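To make the suspected collision concrete, here is a minimal sketch in Python. It is not the actual ingest code: the record layout, the metadata field names (`corpus`, `annotation`), and the helper functions are all hypothetical, and the sahidica.nt path is assumed to mirror the sahidica.mark one. It only shows why keying documents by name alone lets two corpora share an entry, and how keying by the full path keeps them apart.

```python
from typing import Dict, List, Tuple

Path = Tuple[str, ...]

# Hypothetical ingest records: (full path, metadata). Field names are assumptions.
records: List[Tuple[Path, Dict[str, str]]] = [
    (("sahidica.nt", "Mark", "Mark_16"),
     {"corpus": "sahidica.nt", "annotation": "none"}),
    (("sahidica.mark", "Mark", "Mark_16"),
     {"corpus": "sahidica.mark", "annotation": "Rebecca Krawiec"}),
]


def index_by_name(recs: List[Tuple[Path, Dict[str, str]]]) -> Dict[str, Dict[str, str]]:
    """Suspected buggy behaviour: key on the document name only."""
    store: Dict[str, Dict[str, str]] = {}
    for path, meta in recs:
        # Both Mark_16 documents land in one entry, so their metadata mixes
        # and the last record ingested wins for every shared field.
        store.setdefault(path[-1], {}).update(meta)
    return store


def index_by_path(recs: List[Tuple[Path, Dict[str, str]]]) -> Dict[Path, Dict[str, str]]:
    """Proposed behaviour: key on the entire path the document came from."""
    return {path: dict(meta) for path, meta in recs}


def documents_in_corpus(store: Dict[Path, Dict[str, str]], corpus: str) -> List[Path]:
    """List the documents whose path starts with the given corpus name."""
    return sorted(path for path in store if path[0] == corpus)


by_name = index_by_name(records)
by_path = index_by_path(records)
```

With the path as the key, filtering for sahidica.mark in this toy data returns only the sahidica.mark copy of Mark_16, and each copy keeps its own metadata.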
I have a fix in place for the Mark issues--rerunning the ingest now.
@amir-zeldes 1Cor chapters 1-9 are ready for republication; I fixed the metadata field in the 1 Cor 1 document. Oddly, the incorrect field was the corpus field, and it was in the corpus-level metadata rather than the document-level metadata.
Filtering for the corpus sahidica.mark should list only chapters 1-6 and 12-16 of the Gospel of Mark from the manually edited Mark corpus. It does not: it also lists one chapter of 1Cor from the manually edited 1Cor corpus and Mark chapters 12-16 from the unedited sahidica.nt corpus.
There seems to be a basic problem with the metadata ingest here.