clarin-eric / ParlaMint

ParlaMint: Comparable Parliamentary Corpora
https://clarin-eric.github.io/ParlaMint/

NL feedback (TEI version) #501

Closed matyaskopp closed 1 year ago

matyaskopp commented 1 year ago

@RubenvanHeusden, nice work. I like that you have been able to encode ministries fully. Still, I have a few remarks:

Surprisingly small number of members of parliament

annotation of notes

I guess these types of notes can be encoded as head: https://github.com/RubenvanHeusden/ParlaMint/blob/8821a7bbcd479585197c99805a5e1c5138c93399/Data/ParlaMint-NL/ParlaMint-NL_2021-02-11-tweedekamer-5.xml#L84

<note>Wetsvoorstellen tot samenvoeging en herindeling van gemeenten</note>

can be:

<head>Wetsvoorstellen tot samenvoeging en herindeling van gemeenten</head>

and stenographers' notes should be typed as comments: https://github.com/RubenvanHeusden/ParlaMint/blob/8821a7bbcd479585197c99805a5e1c5138c93399/Data/ParlaMint-NL/ParlaMint-NL_2021-02-11-tweedekamer-5.xml#L91

<note>De algemene beraadslaging wordt geopend.</note>

should be

<note type="comment">De algemene beraadslaging wordt geopend.</note>

missing speaker notes

Session or sitting?

I don't understand the NL conditions. Can you check whether the definition of session: https://github.com/RubenvanHeusden/ParlaMint/blob/8821a7bbcd479585197c99805a5e1c5138c93399/Data/ParlaMint-NL/ParlaMint-NL_2021-02-11-tweedekamer-5.xml#L14

<meeting ana="#parla.session" corresp="#TK" n="5">Session 5</meeting>

corresponds to the taxonomy?

If it is correct, you are missing #parla.sitting, which corresponds to one sitting day. The lowest level of the hierarchy should also be encoded in TEI/@ana, e.g.:

<?xml version="1.0" encoding="UTF-8"?>
<!-- add #parla.sitting or #parla.meeting to the @ana value below -->
<TEI xmlns="http://www.tei-c.org/ns/1.0"
     ana="#covid"
     xml:lang="nl"
     xml:id="ParlaMint-NL_2021-02-11-tweedekamer-5">

taxonomies translation

This will be needed in v3.1, but you can do it now.

matyaskopp commented 1 year ago

@RubenvanHeusden, I have found one more issue. The meeting/@ana in component files should indicate which house's proceedings (#parla.lower or #parla.upper) the file contains.

This is done in the same way as in the root file:

            <meeting n="28-lower" ana="#parla.lower #parla.term">28ste Tweede Kamer</meeting>
            <meeting n="29-lower" ana="#parla.lower #parla.term">29ste Tweede Kamer</meeting>
            <meeting n="34-upper" ana="#parla.upper #parla.term">34ste Eerste Kamer</meeting>
            <meeting n="35-upper" ana="#parla.upper #parla.term">35ste Eerste Kamer</meeting>
            <meeting n="36-upper" ana="#parla.upper #parla.term">36ste Eerste Kamer</meeting>

RubenvanHeusden commented 1 year ago

@matyaskopp

Thanks for your feedback, I will change this in the component files.

Just to clarify, I should change

<meeting ana="#parla.term #TK.29" corresp="#TK" n="29-lower">Meeting of the 29th Tweede Kamer</meeting>

to

<meeting ana="#parla.lower #parla.term #TK.29" corresp="#TK" n="29-lower">Meeting of the 29th Tweede Kamer</meeting>

right? (And #parla.upper for our upper house, of course.)

matyaskopp commented 1 year ago

right ?

Yes, and the same goes for every other meeting element in the header (not only the one that describes the term). This is documented in the second sample here: https://clarin-eric.github.io/ParlaMint/#sec-titleStmt

so I expect:

            <meeting ana="#parla.lower #parla.meeting.regular" corresp="#TK" n="56">Meeting 56</meeting>
            <meeting ana="#parla.lower #parla.session" corresp="#TK" n="5">Session 5</meeting>
            <meeting ana="#parla.lower #parla.term #TK.29" corresp="#TK" n="29-lower">Meeting of the 29th Tweede Kamer</meeting>

and you should also add a meeting element that describes the sitting, because the file contains one sitting:

            <meeting ana="#parla.lower #parla.sitting" corresp="#TK" n="2021-02-11">Sitting 2021-02-11</meeting>

You should probably also add an xml:lang attribute to these elements, because their text is in English.
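
The house requirement above can be checked automatically. The following is only a minimal sketch, not part of the ParlaMint tooling: the function name `meetings_missing_house` and the inline `SAMPLE` document are hypothetical, and it assumes that a meeting element is acceptable as soon as its @ana lists #parla.lower or #parla.upper.

```python
import xml.etree.ElementTree as ET

# TEI namespace as used in the component files quoted in this thread.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"
# House markers named in this thread.
HOUSES = {"#parla.lower", "#parla.upper"}


def meetings_missing_house(xml_text):
    """Return the text of every <meeting> whose @ana names no house."""
    root = ET.fromstring(xml_text)
    missing = []
    for meeting in root.iter(TEI_NS + "meeting"):
        # @ana is a whitespace-separated list of pointers.
        ana = set(meeting.get("ana", "").split())
        if not ana & HOUSES:
            missing.append(meeting.text)
    return missing


# Hypothetical, heavily reduced document: one meeting with a house, one without.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <meeting ana="#parla.lower #parla.sitting" corresp="#TK" n="2021-02-11">Sitting 2021-02-11</meeting>
  <meeting ana="#parla.term #TK.29" corresp="#TK" n="29-lower">Meeting of the 29th Tweede Kamer</meeting>
</TEI>"""

print(meetings_missing_house(SAMPLE))
```

Running this over each component file (reading the file contents as bytes and passing them to the function) would list the meeting elements that still need a house marker.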

RubenvanHeusden commented 1 year ago

Thanks for clearing this up! I have now updated the files as you mentioned and made a commit with the updated files.

matyaskopp commented 1 year ago

@RubenvanHeusden, thanks. Closing this issue. Let me know when your annotated version of the sample is ready.