icaruseu / mom-ca

Monasterium.net (http://www.monasterium.net/mom) - repository and collaborative archive
https://github.com/icaruseu/mom-ca/wiki
GNU General Public License v3.0

Use all text content of tei:persName for index entries #1136

NTsch closed 1 year ago

NTsch commented 1 year ago

Looks at all child text nodes of tei:persName to create the display frames in the index.

Closes #1135.

StephanMa commented 1 year ago

Is it correct that you will have all names without whitespaces or comma separation?

NTsch commented 1 year ago

> Is it correct that you will have all names without whitespaces or comma separation?

The commas are removed, as is the case now, but the regular whitespace should remain, like in this example: [image]

StephanMa commented 1 year ago

Here I tried to visualize what I meant


let $doc := <text><persName>Stephan von Berlin,</persName><persName>Daniel von und zu Wien</persName></text>
let $names := replace(string-join($doc//persName/text()), ',', '')
return $names

results in "Stephan von BerlinDaniel von und zu Wien"

So it strongly depends on how the persons are written... Or am I getting something terribly wrong?
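One possible way around the missing separator, shown here only as a minimal sketch on the same sample document and not as the change made in this pull request, is to pass a space as the second argument of string-join():

```xquery
xquery version "3.1";

(: Same sample document as in the comment above. :)
let $doc :=
    <text><persName>Stephan von Berlin,</persName><persName>Daniel von und zu Wien</persName></text>

(: The second argument of string-join() is placed between the joined
   text nodes, so the two names no longer run together. The commas are
   stripped afterwards, as before. :)
let $names := replace(string-join($doc//persName/text(), ' '), ',', '')

return $names
(: results in "Stephan von Berlin Daniel von und zu Wien" :)
```

This still merges all persons into one string; whether one string or one string per tei:persName is wanted depends on where the value is used.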

@yngwi any thoughts?

yngwi commented 1 year ago

Hi, this index entry (for instance) contains commas, and there are many similar cei:persName entries throughout the database. Shouldn't these be kept? Could you, @NTsch, please explain (sorry) what the intention of the change is with regard to the linked issue #1135? I'm not sure how it is related. Thank you!

NTsch commented 1 year ago

> Hi, this index entry (for instance) contains commas, there are many similar cei:persName entries throughout the database. Shouldn't these be kept?

@yngwi The difference between keeping and not keeping the comma is that when it is not kept, as is currently the case, the name that is displayed in the breadcrumb and the headline is "Alfonso Fernández de Godoy", and when the comma is kept, it is "Alfonso Fernández de Godoy,". So I assume it's currently removed to prevent that trailing comma. But in both cases, the entry works.

> Could you, @NTsch, please explain (sorry) what the intention of the change is w/r to the linked issue https://github.com/icaruseu/mom-ca/issues/1135 as I'm not sure how it is related.

Sorry for the confusion, I should have described the issue more clearly. If I understand correctly, $persname is used to create the name that is displayed in <span class="breadcrumb"> and <div class="glossary-entry">. For that, it uses tei:persName/text(). However, this does not work for entries like those we have in the Fontenay Person Index, for example:

<tei:persName>
   <tei:name>Henricus</tei:name>
   <tei:roleName role="filius">filius Odonis ducis Burgundiae</tei:roleName>
   <tei:roleName role="archidiaconus">archidiaconus
      <tei:placeName>Eduensis</tei:placeName></tei:roleName>
   <tei:roleName role="episcopus">episcopus
      <tei:placeName/>Eduensis</tei:roleName>
   <tei:date from="1148" to="1171"><!-- dates à ajouter, notamment jsutification 1171 --></tei:date>
</tei:persName>

This causes an error. By instead using string-join() with tei:persName//text(), I once again get a single string that can be used as the name for the breadcrumb and headline. In my example case, I then get "Henricus filius Odonis ducis Burgundiae archidiaconus Eduensis episcopus Eduensis". However, @StephanMa makes a very good point: I believe the spaces there are created from the line breaks, and if the file were written differently, we would instead get something like "Henricusfilius Odonis ducis BurgundiaearchidiaconusEduensisepiscopusEduensis". I'll have to rework my solution.
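To illustrate the whitespace concern, here is a hedged sketch of one way a display name could be built so that it does not depend on line breaks in the source. The helper local:display-name is hypothetical and not taken from the mom-ca code base; the sample is a shortened, single-line variant of the Fontenay entry.

```xquery
xquery version "3.1";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Hypothetical helper, not part of mom-ca: joins all descendant text
   nodes with a space, then collapses runs of whitespace and trims the
   ends. Comment nodes are not selected by text(), so they are ignored. :)
declare function local:display-name($persName as element(tei:persName)) as xs:string {
    normalize-space(string-join($persName//text(), ' '))
};

(: The entry written without any line breaks between the elements. :)
let $compact :=
    <tei:persName><tei:name>Henricus</tei:name><tei:roleName role="filius">filius Odonis ducis Burgundiae</tei:roleName><tei:roleName role="archidiaconus">archidiaconus<tei:placeName>Eduensis</tei:placeName></tei:roleName></tei:persName>

return local:display-name($compact)
(: results in "Henricus filius Odonis ducis Burgundiae archidiaconus Eduensis" :)
```

A known trade-off of this approach: a space is also inserted where an inline element splits a single word, so markup inside a name part would need separate handling.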

yngwi commented 1 year ago

@NTsch thank you for clarifying. I wonder how else these long names could be tackled for breadcrumbs. I think in this context the whole name is maybe overkill, as the breadcrumb is mainly there to quickly see where you are, so maybe it could be abbreviated somehow? Of course, the question would be "how?". Your above example doesn't seem to have an easy solution...

GVogeler commented 1 year ago

Why not take persName[1]/normalize-space() and use only the first 15(?) characters plus "..." for the breadcrumb?
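A minimal sketch of that kind of truncation follows. The function name local:abbreviate and its $max parameter are assumptions made for illustration; the limit of 15 is the tentative value from the comment above.

```xquery
xquery version "3.1";

(: Hypothetical helper: keeps at most $max characters of the
   whitespace-normalized name and appends '...' only when something
   was actually cut off. :)
declare function local:abbreviate($name as xs:string, $max as xs:integer) as xs:string {
    let $clean := normalize-space($name)
    return
        if (string-length($clean) gt $max)
        then concat(normalize-space(substring($clean, 1, $max)), '...')
        else $clean
};

local:abbreviate('Henricus filius Odonis ducis Burgundiae', 15)
(: results in "Henricus filius..." :)
```

Names of 15 characters or fewer are returned unchanged, so short entries do not get a trailing ellipsis.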

yngwi commented 1 year ago

@GVogeler That's something that I thought of, but I wasn't sure if this kind of abbreviation isn't "too" simple to be usable. It might be a good compromise though.

NTsch commented 1 year ago

I've made some changes: long breadcrumbs are now shortened and end with '...' as suggested above, and I've made adjustments in some places so that complex TEI without line breaks still results in spaces between words in the breadcrumb, the description, and the list of names. Any thoughts? @yngwi @StephanMa

StephanMa commented 1 year ago

lgtm