Open ninpnin opened 2 months ago
I think wrapping speeches in divs is short-sighted for a "living" resource: we don't know how many things like this will be tagged, or whether everything we might want to tag will fit the hierarchical structure that XML divs require. Metadata blocks would allow tagging any feature independently of other tagged features, without creating a bottomless pit of divs.
Yes. This is a really good point, it's a more future-proof approach.
I just realized we could use the n attributes that are available for all elements. From the documentation,
n (number) gives a number (or other label) for an element, which is not necessarily unique within the document.
Then, we would just include the ID in all u elements that belong to the speech. E.g., for the following speech with the ID i-AzXa4EUmTu6mz8YQsCpizb:
<note type="speaker" xml:id="i-G36fJpDJFVqwFFQjbknRq2">
Herr ERIKSSON i Bäeckmora (cp):
</note>
<u xml:id="i-3KxGSd288AdTa9bfy9BtMv" xml:n="i-AzXa4EUmTu6mz8YQsCpizb" next="i-QzTu4nNrn4q8kU1N1u4xZC" who="i-7CXHDen9y2qKcYDisT3zjQ">
<seg xml:id="i-EV3wMeu3xQ8QuzNwmvWbjM">
Herr talman! I det som statsrådet Palme sade nu fanns väl egentligen
[...]
som en stor del av svenska folket bestämt önskar få ändring i.
</seg>
</u>
<u xml:id="i-QzTu4nNrn4q8kU1N1u4xZC" xml:n="i-AzXa4EUmTu6mz8YQsCpizb" prev="i-3KxGSd288AdTa9bfy9BtMv" who="i-7CXHDen9y2qKcYDisT3zjQ" next="i-CUrwEDJ9XoTNrw9wfqWrYb">
<seg xml:id="i-4mZ33Z1km8JtDLieMQPm5Q">
Jag anser att statsrådet Palme på denna punkt också skulle uppta
allvar-
</seg>
</u>
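To make the idea concrete, here is a minimal sketch of how a reader of the corpus could collect the u elements that belong to one speech from the shared label. It assumes the attribute is written as `xml:n`, as in the fragment above, and that the file has no default TEI namespace; the `group_by_speech` helper and the trimmed-down `SAMPLE` are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# ElementTree expands the built-in "xml:" prefix to this namespace URI.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# Trimmed-down version of the fragment above (seg content omitted).
# Assumption: no default TEI namespace on the elements.
SAMPLE = """<div>
  <u xml:id="i-3KxGSd288AdTa9bfy9BtMv" xml:n="i-AzXa4EUmTu6mz8YQsCpizb"
     next="i-QzTu4nNrn4q8kU1N1u4xZC"/>
  <u xml:id="i-QzTu4nNrn4q8kU1N1u4xZC" xml:n="i-AzXa4EUmTu6mz8YQsCpizb"
     prev="i-3KxGSd288AdTa9bfy9BtMv"/>
</div>"""


def group_by_speech(root):
    """Map each speech ID to the IDs of its u elements, in document order."""
    speeches = defaultdict(list)
    for u in root.iter("u"):
        speech_id = u.get(XML_NS + "n")
        if speech_id is not None:  # skip utterances without a speech label
            speeches[speech_id].append(u.get(XML_NS + "id"))
    return dict(speeches)


speeches = group_by_speech(ET.fromstring(SAMPLE))
```

With this layout a speech is recovered in one pass over the u elements, without following the prev/next chain.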
What happens when this same fragment gets tagged as multiple things? Do we have multiple n attribs, or multiple IDs in the n?
Which fragment are you referring to?
We should follow the TEI guidelines. n should be used for page number. https://www.tei-c.org/release/doc/tei-p5-doc/en/html/TS.html#TSBAUT https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-u.html
Where does it say that? Here it says
(number) gives a number (or other label) for an element, which is not necessarily unique within the document.
It might be a page number for pb elements, but for u elements I find no such description.
@ninpnin I refer to the fragment you posted as an example. It's tagged as a speech with ID, but down the line it may be tagged with other things... an interpellation debate, or some other type of sectioning that may or may not coincide exactly with the speech itself. So how does the approach you describe handle multiple possible xml:n values?
@BobBorges Debates are more suited for div-wrapping, along with any non-overlapping sectioning. But if we have other possibly overlapping things, I unfortunately have no solution for that.
It seems like putting these things as element lists in the TEI header would be the most flexible option, and the cleanest when a human has to look at the XML.
Does the schema allow for that?
I agree with Bob, that for now the solution "List speeches in the metadata block" sounds like the best one. Would that work with the TEI schema?
I also added a third option: make each speech a single u block, with paragraph breaks inside each utterance. It is semantically closer to the TEI schema than our current solution (and the ID of the u block would be the speech ID). Still, I think the first solution is best.
There are a few options (parlaclarin):
• <TEI><standOff>
contains all kinds of stuff that could be useful here
• <teiHeader><profileDesc><textDesc>
has domain, interaction, purpose
• <teiHeader>
has a \
From the examples given in parlaclarin, standOff looks like the closest to what we want, but we could also consider something like listGrp with a type attribute, an ID, and sub-elements that each contain a referring ID for the segs we want to label.
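As a rough illustration of the stand-off idea, the sketch below lists labels outside the text and points at utterances by ID, so one u can carry several possibly overlapping labels. The layout is an assumption: the `spanGrp`/`span` elements with a `target` attribute, the `debate-1` ID, and the `labels_per_utterance` helper are all hypothetical and have not been checked against the ParlaClarin schema.

```python
import xml.etree.ElementTree as ET

XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# Hypothetical stand-off layout: each span names a label and points at
# the u elements it covers. "debate-1" is a made-up ID for illustration.
SAMPLE = """<TEI>
  <standOff>
    <spanGrp type="speech">
      <span xml:id="i-AzXa4EUmTu6mz8YQsCpizb"
            target="#i-3KxGSd288AdTa9bfy9BtMv #i-QzTu4nNrn4q8kU1N1u4xZC"/>
    </spanGrp>
    <spanGrp type="debate">
      <span xml:id="debate-1" target="#i-QzTu4nNrn4q8kU1N1u4xZC"/>
    </spanGrp>
  </standOff>
  <text><body>
    <u xml:id="i-3KxGSd288AdTa9bfy9BtMv"/>
    <u xml:id="i-QzTu4nNrn4q8kU1N1u4xZC"/>
  </body></text>
</TEI>"""


def labels_per_utterance(root):
    """Map each referenced u ID to a list of (label type, label ID) pairs."""
    labels = {}
    for grp in root.iter("spanGrp"):
        kind = grp.get("type")
        for span in grp.iter("span"):
            label = (kind, span.get(XML_NS + "id"))
            # target holds space-separated "#id" pointers
            for ref in span.get("target", "").split():
                labels.setdefault(ref.lstrip("#"), []).append(label)
    return labels


labels = labels_per_utterance(ET.fromstring(SAMPLE))
```

Here the second utterance ends up with both a speech label and a debate label, which is the overlapping case that a single n value per element cannot express.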
Current options