clarin-eric / ParlaMint

ParlaMint: Comparable Parliamentary Corpora
https://clarin-eric.github.io/ParlaMint/

ES Feedback #696

Open matyaskopp opened 1 year ago

matyaskopp commented 1 year ago

@charlicruz, @calzada

Improve note annotations

eg:

<note>Aplausos</note>

should be

<kinesic type="applause">
 <desc>Aplausos</desc>
</kinesic>

most common notes with frequencies:

  21339 <note>Aplausos</note>
   4356 <note>Rumores</note>
   3777 <note>Pausa</note>
   1629 <note>Pausa.-Una trabajadora del servicio de limpieza procede a desinfectar la tribuna de oradores</note>
    698 <note>aplausos</note>
    629 <note>EAJ-PNV</note>
    568 <note>Risas</note>
    448 <note>Protestas</note>
    326 <note>rumores</note>
    305 <note>Aplausos.-Rumores</note>
    261 <note>Rumores.-Aplausos</note>
    245 <note>Aplausos.</note>
    215 <note>La señora presidenta ocupa la Presidencia</note>
    173 <note>Continúan los rumores</note>
    161 <note>El señor vicepresidente, Prendes Prendes, ocupa la Presidencia</note>
    146 <note>Asentimiento</note>
    144 <note>Risas.-Aplausos</note>
    143 <note>Protestas.-Aplausos</note>
    136 <note>Convergència i Unió</note>
    127 <note>Risas y aplausos</note>
    123 <note>Aplausos de las señoras y los señores diputados del Grupo Parlamentario VOX, puestos en pie</note>
    119 <note>Pausa.-Una trabajadora del servicio de limpieza procede a desinfectar la tribuna de oradores.</note>
    111 <note>Prolongados aplausos</note>
    101 <note>Muestra un documento</note>
     96 <note>Aplausos.-Protestas</note>
     92 <note>La señora vicepresidenta, Navarro Garzón, ocupa la Presidencia</note>
     90 <note>La señora vicepresidenta, Villalobos Talero, ocupa la Presidencia</note>
     88 <note>El señor presidente ocupa la Presidencia</note>
     82 <note>Varios señores diputados: ¡Muy bien!-Aplausos</note>
     80 <note>Rumores y protestas</note>
     78 <note>risas</note>
     78 <note>protestas</note>
     63 <note>nueva</note>
     63 <note>Aplausos de las señoras y los señores diputados del Grupo Parlamentario Confederal de Unidos Podemos-En Comú Podem-En Marea, puestos en pie</note>
     62 <note>Una trabajadora del servicio de limpieza procede a desinfectar la tribuna de oradores</note>
     60 <note>Rumores.-Protestas</note>
     59 <note>El señor vicepresidente, Rodríguez Gómez de Celis, ocupa la Presidencia</note>
     57 <note>Aplausos.-Varios señores diputados: ¡Muy bien!</note>
     52 <note>Risas.-Rumores</note>
     51 <note>Muestra un gráfico</note>
     47 <note>PNV</note>
     46 <note>Pausa.</note>
     46 <note>muestra un documento</note>
     46 <note>Aplausos de las señoras y los señores diputados del Grupo Parlamentario Ciudadanos, puestos en pie</note>
     41 <note>Muestra una fotografía</note>
     40 <note>Varias señoras y señores diputados: ¡Muy bien!-Aplausos</note>
     40 <note>Rumores.-Risas</note>
     38 <note>Aplausos de las señoras y los señores diputados del Grupo Parlamentario Socialista, puestos en pie</note>
     37 <note>Un señor diputado: ¡Muy bien!-Aplausos</note>
     37 <note>nuevo</note>
     37 <note>Aplausos.-Un señor diputado: ¡Muy bien!</note>
     35 <note>Aplausos de las señoras y los señores diputados del Grupo Parlamentario Popular en el Congreso, puestos en pie</note>
     34 <note>Democràcia i Llibertat</note>
     32 <note>Continúan las protestas</note>
     29 <note>El señor vicepresidente, Barrero López, ocupa la Presidencia</note>
     29 <note>CONVERGÈNCIA I UNIÓ</note>
     28 <note>La señora vicepresidenta, Montserrat Montserrat, ocupa la Presidencia</note>
     27 <note>La señora vicepresidenta, Elizo Serrano, ocupa la Presidencia</note>
     26 <note>Pausa. Una trabajadora del servicio de limpieza procede a desinfectar la tribuna de oradores</note>
     26 <note>Denegación</note>
     25 <note>Un señor diputado pronuncia palabras que no se perciben</note>
     25 <note>La señora vicepresidenta, Romero Sánchez, ocupa la Presidencia</note>
     23 <note>Pronuncia palabras en catalán</note>
     23 <note>muestra un gráfico</note>
     23 <note>Aplausos.-Risas</note>
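
One way to apply this to the frequent notes listed above is a small lookup table. The following is only a sketch: `NOTE_MAP` and `convert_note` are hypothetical names, and apart from `Aplausos` becoming `<kinesic type="applause">` (taken from the example above), the element/type pairs are assumptions that would need checking against the ParlaMint schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: only "aplausos" -> kinesic/applause comes from this issue,
# the other element/type pairs are assumptions.
NOTE_MAP = {
    "aplausos": ("kinesic", "applause"),
    "risas": ("vocal", "laughter"),
    "rumores": ("vocal", "murmuring"),
    "protestas": ("vocal", "shouting"),
    "pausa": ("incident", "pause"),
}


def convert_note(note):
    """Return a typed element for a known <note>, or the note itself if unknown."""
    text = (note.text or "").strip()
    # Ignore case and a trailing period, so "aplausos" and "Aplausos." match.
    key = text.rstrip(".").lower()
    if key not in NOTE_MAP:
        return note
    tag, note_type = NOTE_MAP[key]
    elem = ET.Element(tag, {"type": note_type})
    desc = ET.SubElement(elem, "desc")
    desc.text = text  # keep the original wording in <desc>
    return elem
```

Combined notes such as `Aplausos.-Rumores` are left untouched by this sketch; they would have to be split before the lookup.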

Missing who when chair

Missing who attribute https://github.com/matyaskopp/PARLAMINT-ES-MC/blob/4dc6c5f53597e2bdc3b3925a4424cb38764a4931/ParlaMint.sample/ParlaMint-ES_2015-01-20-CD150120.xml#L100-L103

<u xml:id="ParlaMint-ES_2015-01-20-CD150120.u1" ana="#chair">
  <seg xml:id="ParlaMint-ES_2015-01-20-CD150120.u1.1">Se abre la sesión.</seg>
  <seg xml:id="ParlaMint-ES_2015-01-20-CD150120.u1.2">Convalidación o derogación del Real Decreto-ley 15/2014, de 19 de diciembre, de modificación del Régimen Económico y Fiscal de Canarias. Para presentar el real decreto-ley, tiene la palabra en nombre del Gobierno el ministro de Hacienda y Administraciones Públicas.</seg>
</u>

source: https://github.com/matyaskopp/PARLAMINT-ES-MC/blob/4dc6c5f53597e2bdc3b3925a4424cb38764a4931/CD.sample/CD150120.xml#L57-L76

<speaker>
<name>UNKNOWN</name>
<birth_date>UNKNOWN</birth_date>
<birth_place country="ES">UNKNOWN</birth_place>
<status>NA</status>
<gender>UNKNOWN</gender>
<institution>
<ni country="ES">CD</ni>
</institution>
<constituency country="ES" region="UNKNOWN"/>
<affiliation>
<national_party>UNKNOWN</national_party>
<cd group="UNKNOWN"/>
</affiliation>
<post>PRESIDENTE</post>
</speaker>
<speech id="spXY" language="ES">
Se abre la sesión. 
Convalidación o derogación del Real Decreto-ley 15/2014, de 19 de diciembre, de modificación del Régimen Económico y Fiscal de Canarias. Para presentar el real decreto-ley, tiene la palabra en nombre del Gobierno el ministro de Hacienda y Administraciones Públicas. 
</speech>

chairman name is present in source file: https://github.com/matyaskopp/PARLAMINT-ES-MC/blob/4dc6c5f53597e2bdc3b3925a4424cb38764a4931/CD.sample/CD150120.xml#L52

<body>
  <chair who="JESÚS POSADA MORENO">
    <!-- all speeches -->
  </chair>
</body>

list of chairs with frequencies:

cat CD/*.xml|grep '<chair'|sed 's/^ *//;s/\r//'|sort|uniq -c|sort -nr
    208 <chair who="MERITXELL BATET LAMAÑA">
    161 <chair who="ANA MARÍA PASTOR JULIÁN">
     56 <chair who="JESÚS POSADA MORENO">
      8 <chair who="PATXI LÓPEZ ÁLVAREZ">
      5 <chair who="NA">
      5 <chair who="ALFONSO RODRÍGUEZ GÓMEZ DE CELIS">
      3 <chair who="PATXI LÓPEZ ÁLVAREZ ">
      2 <chair who="JOSÉ IGNACIO PRENDES PRENDES">
      1 <chair who="MERITXELL BATET LAMAÑA ">
      1 <chair who="CELIA VILLALOBOS TALERO VICEPRESIDENTA PRIMERA">
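
The list shows the same name with and without trailing whitespace, one name with a role appended, and `NA`. Before the `who` value can be used to identify the chair, it probably needs normalising. A minimal sketch, where `normalize_chair`, `merge_counts` and the role-suffix pattern are assumptions based only on the variants listed above:

```python
import re
from collections import Counter

# Assumed pattern: a role such as "VICEPRESIDENTA PRIMERA" appended to the name.
ROLE_SUFFIX = re.compile(r"\s+VICEPRESIDENT[AE]\s+\w+$")


def normalize_chair(who):
    """Collapse whitespace, drop an appended role, and map NA/empty to None."""
    name = " ".join(who.split())
    name = ROLE_SUFFIX.sub("", name)
    return None if name in ("", "NA") else name


def merge_counts(counts):
    """Merge frequency counts of raw `who` values after normalisation."""
    merged = Counter()
    for who, n in counts.items():
        merged[normalize_chair(who)] += n
    return merged
```

With the frequencies above, the two `PATXI LÓPEZ ÁLVAREZ` variants would be merged into one entry, and the `NA` sessions would remain without a chair name.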

No guest speakers ???

This is a bit strange: in the ES corpus, no speaker is labelled with the guest category (ana="#guest").

Missing parliamentaryGroups

It seems that the source data contain parliamentary groups. These are now required in ParlaMint (https://clarin-eric.github.io/ParlaMint/#sec-parties); parties can be converted into groups or, better, both parties and groups can be encoded.

ParlaMint requires that a corpus must use parliamentary groups, while the use of political parties is optional. Note that if political parties are used, it is also expected to encode which political parties constitute a parliamentary group; this is encoded via the element, as further explained in the Section on Relations between organisations.

list of parliamentary groups with number of affiliated persons

cat CD/*.xml|tr '\r\n' '  ' |sed 's/<speaker>/\n<speaker>/g;s/<\/speaker>/\n/g'|grep speaker |sed 's/^.*<name>//;s@</name.*group="@\t@;s@".*$@@;'|grep -v '<'|sort|uniq|cut -f 2|sort|uniq -c
     18 GC-CiU
      1 GC-DL
     48 GCs
     47 GCUP-EC-EM
     44 GCUP-EC-GC
      5 GEH Bildu
      5 GER
     13 GIP
     37 GMx
      1 GMX
    259 GP
     13 GPlu
     15 GR
    264 GS
      7 GUPyD
     10 GV (EAJ-PNV)
      1 GVox
     54 GVOX
     96 NA
     12 UNKNOWN

Parliamentary group - party pairs:

cat CD/*.xml|tr '\r\n' '  ' |sed 's/<speaker>/\n<speaker>/g;s/<\/speaker>/\n/g'|grep speaker |sed 's/^.*<national_party>//;s@</national_party.*group="@\t@;s@".*$@@;'|grep -v '<'|sort|uniq
AMAIUR  GMx
BNG GMx
BNG GPlu
CCa-PNC GMx
CCa-PNC-NC  GMx
CC-NC-PNC   GMx
CDC GMx
CiU GC-CiU
COMPROMÍS-Q GMx
C-P-EUPV    GCUP-EC-EM
C-P-EUPV    GMx
Cs  GCs
CUP-PR  GMx
DL  GC-DL
EAJ-PNV GV (EAJ-PNV)
ECP GCUP-EC-EM
ECP GCUP-EC-GC
ECP-GUAYEM EL CANVI GCUP-EC-GC
EC-UP   GCUP-EC-GC
EH Bildu    GEH Bildu
EH Bildu    GMx
EM-P-A-EU   GCUP-EC-EM
ERC-CATSÍ   GER
ERC-RI.cat  GMx
ERC-S   GR
EUiA    GIP
EUPV    GIP
GB  GMx
 GP GP
ICV GIP
IC-V    GMX
IZQ-PLU GIP
JxCat-JUNTS GPlu
JxCat-JUNTS(Junts)  GPlu
MÁS PAÍS-EQUO   GPlu
MÉS COMPROMÍS   GPlu
NA+ GMx
NA  NA
NC-CCa-PNC  GMx
PP-EU   GP
PP-FORO GMx
PP-FORO GP
PP  GP
PP-PAR  GP
PRC GMx
PSC(PSC-PSOE)   GS
PSC-PSOE    GS
PsdeG-PSOE  GS
PSdeG-PSOE  GS
PSdG-PSOE   GS
PSE-EE-PSOE GS
PSOEdeAndalucía GS
PSOE    GS
PSOE    NA
PSOE-NCa    GS
¡Teruel Existe! GMx
UNKNOWN UNKNOWN
UP  GCUP-EC-EM
UP  GCUP-EC-GC
UPM GCUP-EC-EM
UPN GMx
UPN-PP  GMx
UPyD    GUPyD
Vox GVox
Vox GVOX
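
Both lists contain labels that differ only in letter case or surrounding whitespace (for example `GMx` / `GMX`, `GVox` / `GVOX`, `PsdeG-PSOE` / `PSdeG-PSOE`). A small sketch for spotting such variants before the labels are turned into organisation IDs; the function name `case_variants` is hypothetical:

```python
from collections import defaultdict


def case_variants(labels):
    """Group labels that are identical after whitespace and case normalisation.

    Returns only the groups that contain more than one distinct spelling.
    """
    seen = defaultdict(set)
    for label in labels:
        cleaned = " ".join(label.split())  # trim and collapse whitespace
        seen[cleaned.casefold()].add(cleaned)
    return {key: sorted(forms) for key, forms in seen.items() if len(forms) > 1}
```

Which spelling should be kept is a decision for the corpus maintainers; the sketch only reports the candidates.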

Missing translation

https://github.com/matyaskopp/ParlaMint/blob/e48f74e3c66adb5a32b8d1051be3d2ebb58c097c/Data/ParlaMint-ES/ParlaMint-taxonomy-parla.legislature.xml#L200-L207

                  <category xml:id="parla.meeting.ceremonial">
                     <catDesc xml:lang="es">
                        <term>--</term>
                     </catDesc>
                     <catDesc xml:lang="en">
                        <term>Ceremonial meeting</term>
                     </catDesc>
                  </category>
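
Placeholders like this could be found automatically. A minimal sketch, assuming that `--` (or an empty `<term>`) is the only placeholder in use; `missing_translations` is a hypothetical helper, and element names are compared by local name so that the TEI namespace does not matter:

```python
import xml.etree.ElementTree as ET

XML_NS = "{http://www.w3.org/XML/1998/namespace}"


def local(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def missing_translations(root, placeholder="--"):
    """Return (category id, language) pairs whose <term> is empty or a placeholder."""
    missing = []
    for cat in root.iter():
        if local(cat.tag) != "category":
            continue
        for desc in cat:
            if local(desc.tag) != "catDesc":
                continue
            terms = [t for t in desc if local(t.tag) == "term"]
            text = (terms[0].text or "").strip() if terms else ""
            if text in ("", placeholder):
                missing.append((cat.get(XML_NS + "id"), desc.get(XML_NS + "lang")))
    return missing
```

Run on the fragment above, it would report the Spanish description of `parla.meeting.ceremonial`.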

parliamentaryGroup affiliation overlaps

I have discovered this accidentally because it produces a different error:

Error: ERROR: multiple party statuses for MartínezMaría on 2021-01-28: Coalition Opposition

   <person xml:id="MartínezMaría">
      <persName>
         <forename>María</forename>
         <forename>Luz</forename>
         <surname>Martínez</surname>
         <surname>Seijo</surname>
      </persName>
      <sex value="F"/>
      <birth when="1968-11-10"/>
      <affiliation ref="#CD" role="member" from="2016-04-19" to="2023-02-14"/>
      <affiliation role="member" ref="#party.Cs" from="2020-02-11" to="2023-02-14"/>
      <affiliation role="member" ref="#party.PP" from="2018-06-19" to="2022-12-21"/>
      <affiliation role="member"
                   ref="#party.PSOE"
                   from="2016-04-19"
                   to="2021-12-15"/>
      <affiliation role="member" ref="#party.UP" from="2016-12-13" to="2019-02-13"/>
   </person>
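
Overlaps like the one in this record can be detected directly, before they surface as a party-status error. A sketch under the assumption that party affiliations are recognisable by the `#party.` prefix in `ref` and that a missing `from`/`to` means an open-ended span; `overlapping_affiliations` is a hypothetical helper:

```python
from datetime import date
from itertools import combinations
import xml.etree.ElementTree as ET


def overlapping_affiliations(person, prefix="#party."):
    """Return pairs of refs whose [from, to] date spans intersect."""
    spans = []
    for aff in person.iter("affiliation"):
        ref = aff.get("ref", "")
        if ref.startswith(prefix):
            # Assumption: a missing bound means the span is open on that side.
            start = date.fromisoformat(aff.get("from", "0001-01-01"))
            end = date.fromisoformat(aff.get("to", "9999-12-31"))
            spans.append((ref, start, end))
    # Two closed intervals overlap when each starts before the other ends.
    return [(a[0], b[0]) for a, b in combinations(spans, 2)
            if a[1] <= b[2] and b[1] <= a[2]]
```

For the person above, every pair of party affiliations except Cs/UP overlaps.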

for this error, there can be many reasons:

calzada commented 1 year ago

To identify member of Parliament, see .+?. Here is a list of all members of Parliament as they appear in the files:

'MINISTRA DE ASUNTOS EXTERIORES, UNIÓN EUROPEA Y COOPERACIÓN' 13 occurrences
'MINISTRA DE ASUNTOS SOCIALES Y AGENDA 2030' 1 occurrences
'MINISTRA DE CIENCIA E INNOVACIÓN' 13 occurrences
' MINISTRA DE DEFENSA' 64 occurrences
'MINISTRA DE DERECHOS SOCIALES Y AGENDA 2030' 31 occurrences
'MINISTRA DE EDUCACIÓN Y FORMACIÓN PROFESIONAL' 50 occurrences
'MINISTRA DE EXTERIORES, UNIÓN EUROPEA Y COOPERACIÓN' 3 occurrences
' MINISTRA DE HACIENDA' 74 occurrences
'MINISTRA DE HACIENDA Y FUNCIÓN PÚBLICA' 180 occurrences
' MINISTRA DE HACIENDA Y PORTAVOZ DEL GOBIERNO' 5 occurrences
' MINISTRA DE IGUALDAD' 66 occurrences
'MINISTRA DE INDUSTRIA, COMERCIO Y TURISMO' 53 occurrences
'MINISTRA DE INDUSTRIA, COMERCIO Y TURISMO ' 1 occurrences
' MINISTRA DE JUSTICIA' 64 occurrences
'MINISTRA DE POLÍTICA TERRITORIAL' 16 occurrences
'MINISTRA DE POLÍTICA TERRITORIAL Y FUNCIÓN PÚBLICA' 1 occurrences
'MINISTRA DE POLÍTICA TERRITORIAL Y PORTAVOZ DEL GOBIERNO' 1 occurrences
' MINISTRA DE SANIDAD' 81 occurrences
'MINISTRA DE TRABAJO Y ECONOMÍA SOCIAL' 6 occurrences
'MINISTRA DE TRANSPORTES, MOVILIDAD Y AGENDA URBANA' 94 occurrences
'MINISTRA DE TRANSPORTES, MOVILILIDAD Y AGENDA URBANA' 1 occurrences
'MINISTRA HACIENDA Y FUNCIÓN PÚBLICA' 1 occurrences
'MINISTRO DE AGRICULTURA, PESCA Y ALIMENTACIÓN' 42 occurrences
'MINISTRO DE ASUNTOS EXTERIORES, UNIÓN EUROPEA Y COOPERACIÓN' 46 occurrences
'MINISTRO DE CIENCIA E INNOVACIÓN' 2 occurrences
' MINISTRO DE CONSUMO' 11 occurrences
' MINISTRO DE CULTURA Y DEPORTE' 16 occurrences
'MINISTRO DE INCLUSIÓN, SEGURIDAD SOCIAL Y MIGRACIONES' 70 occurrences
' MINISTRO DE JUSTICIA' 38 occurrences
'MINISTRO DE LA PRESIDENCIA, RELACIONES CON LAS CORTES Y MEMORIA DEMOCRÁTICA' 139 occurrences
'MINISTRO DE LA PRESIDENCIA, RELACIONES CON LAS CORTES Y MEMORIA HISTÓRICA' 1 occurrences
' MINISTRO DEL INTERIOR' 184 occurrences
'MINISTRO DEL INTERIOR' 2 occurrences
' MINISTRO DE POLÍTICA TERRITORIAL Y FUNCIÓN PÚBLICA' 1 occurrences
'MINISTRO DE POLÍTICA TERRITORIAL Y FUNCIÓN PÚBLICA' 21 occurrences
'MINISTRO DE TRABAJO Y ECONOMÍA SOCIAL' 1 occurrences
' MINISTRO DE TRANSPORTES, MOVILIDAD Y AGENDA URBANA' 1 occurrences
'MINISTRO DE TRANSPORTES, MOVILIDAD Y AGENDA URBANA' 26 occurrences
' MINISTRO DE UNIVERSIDADES' 10 occurrences
' PRESIDENTE DE GOBIERNO' 1 occurrences
' PRESIDENTE DEL GOBIERNO' 343 occurrences
'PRESIDENTE DEL GOBIERNO' 7 occurrences
'VICEPRESIDENTA CUARTA DEL GOBIERNO Y MINISTRA PARA LA TRANSICIÓN ECOLÓGICA Y EL RETO DEMOGRÁFICO' 27 occurrences
'VICEPRESIDENTA CUARTA Y MINISTRA PARA LA TRANSICIÓN ECOLÓGICA Y EL RETO DEMOGRÁFICO' 6 occurrences
'VICEPRESIDENTA PRIMERA, MINISTRA DE LA PRESIDENCIA, RELACIONES CON LAS CORTES Y MEMORIA DEMOCRÁTICA' 1 occurrences
'VICEPRESIDENTA PRIMERA DEL GOBIERNO Y MINISTRA DE ASUNTOS ECONÓMICOS Y TRANSFORMACIÓN DIGITAL' 31 occurrences
'VICEPRESIDENTA PRIMERA DEL GOBIERNO Y MINISTRA DE LA PRESIDENCIA, RELACIONES CON LAS CORTES E IGUALDAD' 2 occurrences
'VICEPRESIDENTA PRIMERA DEL GOBIERNO Y MINISTRA DE LA PRESIDENCIA, RELACIONES CON LAS CORTES Y MEMORIA DEMOCRÁTICA' 54 occurrences
'VICEPRESIDENTA PRIMERA Y MINISTRA ASUNTOS ECONÓMICOS Y TRANSFORMACIÓN DIGITAL' 3 occurrences
'VICEPRESIDENTA PRIMERA Y MINISTRA DE ASUNTOS ECONÓMICOS Y TRANSFORMACIÓN DIGITAL' 191 occurrences
'VICEPRESIDENTA PRIMERA Y MINISTRA DE ASUNTOS ECONÓMICOS Y TRANSFORMACIÓN DIGITAL ' 1 occurrences
'VICEPRESIDENTA PRIMERA Y MINISTRA DE LA PRESIDENCIA, RELACIONES CON LAS CORTES Y MEMORIA DEMOCRÁTICA' 8 occurrences
' VICEPRESIDENTA SEGUNDA' 1 occurrences
'VICEPRESIDENTA SEGUNDA DEL GOBIERNO Y MINISTRA DE ASUNTOS ECONÓMICOS Y TRANSFORMACIÓN DIGITAL' 12 occurrences
'VICEPRESIDENTA SEGUNDA DEL GOBIERNO Y MINISTRA DE TRABAJO Y ECONOMÍA SOCIAL' 14 occurrences
'VICEPRESIDENTA SEGUNDA Y MINISTRA DE ASUNTOS ECONÓMICOS Y TRANSFORMACIÓN DIGITAL' 2 occurrences
'VICEPRESIDENTA SEGUNDA Y MINISTRA DE TRABAJO Y ECONOMÍA SOCIAL' 96 occurrences
'VICEPRESIDENTA TERCERA DEL GOBIERNO Y MINISTRA DE ASUNTOS ECONÓMICOS Y TRANSFORMACIÓN DIGITAL' 12 occurrences
'VICEPRESIDENTA TERCERA DEL GOBIERNO Y MINISTRA DE TRABAJO Y ECONOMÍA SOCIAL' 24 occurrences
'VICEPRESIDENTA TERCERA DEL GOBIERNO Y MINISTRA PARA LA TRANSICIÓN ECOLÓGICA Y EL RETO DEMOGRÁFICO' 13 occurrences
'VICEPRESIDENTA TERCERA Y MINISTRA DE ASUNTOS ECONÓMICOS Y TRANSFORMACIÓN DIGITAL' 1 occurrences
'VICEPRESIDENTA TERCERA Y MINISTRA DE LA TRANSICIÓN ECOLÓGICA Y EL RETO DEMOGRÁFICO' 1 occurrences
'VICEPRESIDENTA TERCERA Y MINISTRA DE TRABAJO Y ECONOMÍA SOCIAL' 7 occurrences
'VICEPRESIDENTA TERCERA Y MINISTRA PARA LA TRANSICIÓN ECOLÓGICA Y EL RETO DEMODRÁGICO' 1 occurrences
'VICEPRESIDENTA TERCERA Y MINISTRA PARA LA TRANSICIÓN ECOLÓGICA Y EL RETO DEMOGRÁFICO' 120 occurrences
'VICEPRESIDENTA TERCERA Y MINISTRA PARA LA TRANSICIÓN ECOLÓGICA Y EL RETO DEMOGRÁFICO,' 3 occurrences
'VICEPRESIDENTA TERCERA Y MINISTRA PARA LA TRANSICIÓN ECOLÓGICA Y RETO DEMOGRÁFICO' 2 occurrences
'VICEPRESIDENTA Y MINISTRA PARA LA TRANSICIÓN ECOLÓGICA Y EL RETO DEMOGRÁFICO' 1 occurrences
'VICEPRESIDENTE DEL GOBIERNO Y MINISTRO DE DERECHOS SOCIALES Y AGENDA 2030' 6 occurrences
' VICEPRESIDENTE PRIMERO' 1 occurrences
'VICEPRESIDENTE SEGUNDO DEL GOBIERNO Y MINISTRO DE DERECHO SOCIALES Y AGENDA 2030' 3 occurrences
'VICEPRESIDENTE SEGUNDO DEL GOBIERNO Y MINISTRO DE DERECHOS SOCIALES Y AGENDA 2030' 33 occurrences
'VICEPRESIDENTE SEGUNDO Y MINISTRO DE DERECHOS SOCIALES Y AGENDA 2030' 7 occurrences
'VICEPRESIENTA PRIMERA DEL GOBIERNO Y MINISTRA DE ASUNTOS ECONÓMICOS Y TRANSFORMACIÓN DIGITAL ' 1 occurrences

matyaskopp commented 1 year ago

thanks @calzada

To identify member of Parliament, see .+?.

Now I can see it, but there is no affiliation timespan. Are there changes in government during government periods? https://github.com/matyaskopp/ParlaMint/blob/e48f74e3c66adb5a32b8d1051be3d2ebb58c097c/Data/ParlaMint-ES/ParlaMint-ES-listOrg.xml#L47-L56

      <listEvent>
         <event from="2011-12-21" to="2018-06-01" xml:id="GOV.6">
            <label xml:lang="es">Séptimo Gobierno de España (21.12.2011 - 02.06-2018)</label>
            <label xml:lang="en">7th Government of Spain (21.12.2011 - 02.06-2018)</label>
         </event>
         <event from="2018-06-02" xml:id="GOV.7">
            <label xml:lang="es">Octavo Gobierno de España (02.06.2018-)</label>
            <label xml:lang="en">8th Government of Spain (02.06.2018-)</label>
         </event>
      </listEvent>

Or can the minister be affiliated for the whole period?

Is the list of ministers complete? ( = Did every minister have a speech in parliament?)

calzada commented 1 year ago

Let me just have a look this afternoon / evening. Best for now. mc


matyaskopp commented 1 year ago

I have taken a more detailed look at the content of the <post> element.

Simple post

<speaker>
<name>Pastor Julián, Ana María</name>
<birth_date>19571111</birth_date>
<birth_place country="ES">Cubillos</birth_place>
<status>NA</status>
<gender>female</gender>
<institution>
<ni country="ES">CD</ni>
</institution>
<constituency country="ES" region="Madrid"/>
<affiliation>
<national_party>PP</national_party>
<cd group="GP"/>
</affiliation>
<post> VICEPRESIDENTA</post>
</speaker>

affiliations can be represented this way:

<affiliation ref="#CD" role="member" from="2015-01-21" to="2023-02-22"/> <!-- first and last seen in parliament -->
<affiliation ref="#CD" role="deputyHead"/> <!-- first and last seen in parliament (in this role) should be added/ or do we have a better source for this? -->
<!-- and also parliamentaryGroup and optionally party should be added: -->
<affiliation role="member" ref="#group.GP"/>
<affiliation role="member" ref="#party.PP"/>

Post cumulations:

<post>VICEPRESIDENTA PRIMERA DEL GOBIERNO, MINISTRA DE LA PRESIDENCIA, RELACIONES CON LAS CORTES Y MEMORIA DEMOCRÁTICA</post>

should become (and again, an issue with unknown dates)

<affiliation ref="#GOV" role="member"/>
<affiliation ref="#GOV" role="deputyHead">
  <roleName>VICEPRESIDENTA PRIMERA DEL GOBIERNO</roleName>
</affiliation>
<affiliation ref="#GOV" role="minister">
  <roleName>MINISTRA DE LA PRESIDENCIA, RELACIONES CON LAS CORTES Y MEMORIA DEMOCRÁTICA</roleName>
</affiliation>
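
The splitting of cumulated posts could be sketched as follows. This is not the conversion script itself: `post_to_roles` is a hypothetical helper, the split pattern is derived only from the `<post>` values seen in this thread, and the `head` role for `PRESIDENTE ...` is an assumption (only `deputyHead` and `minister` appear in the example above).

```python
import re

# Split before "MINISTRA"/"MINISTRO" when it is preceded by a comma or by "Y".
SPLIT = re.compile(r"\s*(?:,|\bY\b)\s*(?=MINISTR[AO]\b)")


def post_to_roles(post):
    """Turn a <post> string into a list of (role, roleName) pairs."""
    post = " ".join(post.split()).rstrip(",")  # normalise whitespace, stray comma
    roles = []
    for part in SPLIT.split(post):
        if part.startswith("MINISTR"):
            roles.append(("minister", part))
        elif part.startswith("VICEPRESIDENT"):
            roles.append(("deputyHead", part))
        elif part.startswith("PRESIDENT"):
            roles.append(("head", part))  # assumed role name
    return roles
```

Commas and `Y` inside a ministry name are not split, because the lookahead only accepts a following `MINISTRA` or `MINISTRO`. The organisation (`#GOV` or `#CD`) and the dates still have to come from elsewhere.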
calzada commented 1 year ago

See this:

Second government of Pedro Sánchez - Wikipedia

Is there anything I have to do? Best mc

matyaskopp commented 1 year ago

Is there anything I have to do?

Gathering minister information from Wikipedia can be done with a script (I hope). @charlicruz or @matyaskopp can do it.


Another issue is to decide how to handle parliamentary groups and their possible relation with political parties. Is this information reachable?

  1. The complex solution would need:

    • parliamentary group full names - the transcriptions contain only abbreviated ones
    • the timespan of each party-group representation relation
    • finally, coalition/opposition status should express relations among parliamentary groups
  2. or we can do it the easy way (with a small lie - most ParlaMinters do it):

    • change the politicalParty role to the parliamentaryGroup role

@calzada, are you ok with the 2nd option?

calzada commented 1 year ago

Dear Matyas, please do whatever seems easier for you. In fact, changing political party to parliamentary group makes some sense, so if this is easier for you, please do it. After we finish this update, I need to talk to you about Sketch Engine: it does not select parliamentary party/group properly. But this is for later. Let me know if there is still anything I can do. Best, mc

On Mon, 26 Jun 2023 at 7:45, Matyáš Kopp wrote:

  • I personally prefer the 2nd option, because I am not sure whether @charlicruz is still with us, and changing the politicalParty role to the parliamentaryGroup role can be done without gathering any additional information. I can do this myself.

charlicruz commented 1 year ago

I can modify the politicalParty role to parliamentaryGroup for all xml files. I have uploaded the CD150120.xml example. Again, I have problems committing and pushing from GitHub Desktop, as I have no permission, so I uploaded it directly through the web page. If it is correct, we can do the same for the rest; what do you think?

Another issue is that there are so many UNKNOWN values, and I don't know how to modify them step by step. I expected to have a small xml sample working, but I have some problems after the make compilation.

Matyas, will you be available in July?

matyaskopp commented 1 year ago

I can modify the politicalParty role to parliamentaryGroup for all xml files.

This is already done:

I have uploaded the CD150120.xml example. Again, I have problems committing and pushing from GitHub Desktop, as I have no permission, so I uploaded it directly through the web page. If it is correct, we can do the same for the rest; what do you think?

I don't know what I should think: you are modifying the source CD format (https://github.com/charlicruz/PARLAMINT-ES-MC/commit/09457fddb1c93e4067f78ddfde7f456c956b8f85), which becomes invalid according to https://github.com/charlicruz/PARLAMINT-ES-MC/blob/master/CD/cd.dtd You have to discuss these changes with @calzada first. I believe the best solution is to leave the CD format as it is and just modify the conversion script, but you need to be up to date with my fork, because I have made a lot of changes in https://github.com/matyaskopp/PARLAMINT-ES-MC/blob/master/bin/cd2parmamint.xsl

Another issue is that there are so many UNKNOWN values, and I don't know how to modify them step by step. I expected to have a small xml sample working, but I have some problems after the make compilation.

The UNKNOWN party can be preserved; as you can see, the conversion does not propagate it into the TEI file: https://github.com/matyaskopp/ParlaMint/blob/a10afc44515fc57d0d46196157c0d4f8d3939afb/Data/ParlaMint-ES/ParlaMint-ES-listPerson.xml

Matyas, will you be available in July?

more or less yes


I am now implementing a script that gathers government members from Wikipedia and then integrates their affiliations into <listPerson> (I hope it will not take much time - it should be ready tomorrow).

matyaskopp commented 1 year ago

@TomazErjavec I am close to finishing all the scripts needed to produce the ParlaMint-ES corpus. Can you please take a look at the sample https://github.com/clarin-eric/ParlaMint/pull/692? I would like to know whether there is anything serious before I start processing the whole corpus.

TomazErjavec commented 1 year ago

Very nice indeed! I didn't do a formal validation, as you have probably done that but I noticed a few minor things:

matyaskopp commented 1 year ago

I am aware of that. I will preserve the wrong handle http://hdl.handle.net/11356/XXXX; it is safer to have a totally wrong handle than to point to some existing but wrong one.


  • utterances often have transcriber comments at the end, and, strictly speaking, they should go outside, i.e. just after the utterance; but in practice it doesn't much matter

But then there will be utterances without segments or notes, which we do not allow. I have discovered several utterances of this type. Source: https://www.congreso.es/public_oficiales/L14/CONG/DS/PL/DSCD-14-PL-75.PDF

CD (https://github.com/calzada/PARLAMINT-ES-MC/blob/28684ab93851880c18fda17a526f839f2ec909a1/CD/CD210202.xml#L2004-L2024)

<intervention id='in78'>
<speaker>
<name>Bassa Coll, Montserrat</name>
<birth_date>19650420</birth_date>
<birth_place country="ES">UNKNOWN</birth_place>
<status>NA</status>
<gender>female</gender>
<institution>
<ni country="ES">CD</ni>
</institution>
<constituency country="ES" region="Girona"/>
<affiliation>
<national_party>ERC-S</national_party>
<cd group="GR"/>
</affiliation>
<post>NA</post>
</speaker>
<speech id='sp78'  language="ES">
<omit type="comment">Termina su intervención en catalán.-Aplausos</omit>.
</speech>
</intervention>

result:

            <u xml:id="ParlaMint-ES_2021-02-02-CD210202.u78"
               who="#MontserratBassaColl"
               ana="#regular">
               <vocal type="clarification">
                  <desc>Termina su intervención en catalán.-Aplausos</desc>
               </vocal>
            </u>
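
Utterances like this one, which contain a transcriber comment but no `<seg>`, can be listed with a short check. A sketch only; `utterances_without_seg` is a hypothetical helper that compares local element names, so it should work with or without the TEI namespace:

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local(tag):
    """Strip an optional '{namespace}' prefix; non-element nodes yield ''."""
    return tag.rsplit("}", 1)[-1] if isinstance(tag, str) else ""


def utterances_without_seg(root):
    """Return the xml:id of every <u> that has no direct <seg> child."""
    return [u.get(XML_ID) for u in root.iter()
            if local(u.tag) == "u"
            and not any(local(child.tag) == "seg" for child in u)]
```

Moving the trailing comment out of such an utterance would leave it completely empty, which is why the comment stays inside.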

  • more of an aesthetic issue: you have IDs like "ParlaMint-ES_2023-02-23-CD230223.u1.1.s1.w1"; it would be more consistent to have "ParlaMint-ES_2023-02-23-CD230223.u1.seg1.s1.w1".

Good point, I will implement it, but I will use p prefix instead of seg (to be consistent with UA and CZ :-))
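
The renaming could look roughly like this. It is a sketch, not the actual change in the conversion script: `add_segment_prefix` is a hypothetical helper and it assumes that the segment number is always the component directly after `.uN`:

```python
import re

# ".u<N>.<M>" followed by "." or end of string: <M> is the bare segment number.
SEG_ID = re.compile(r"(\.u\d+)\.(\d+)(?=\.|$)")


def add_segment_prefix(xml_id, prefix="p"):
    """Insert a prefix before the segment number of a ParlaMint-style ID."""
    return SEG_ID.sub(lambda m: f"{m.group(1)}.{prefix}{m.group(2)}", xml_id, count=1)
```

IDs that already carry a prefix, and utterance IDs without a segment component, are left unchanged, so the function can be applied repeatedly.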

TomazErjavec commented 1 year ago

OK, good arguments for ignoring first two suggestions, and, yes, p prefix is then indeed better for the third. Good luck!