calzada / PARLAMINT-ES-MC

chairman speeches do not contain who attribute #24

Open matyaskopp opened 1 year ago

matyaskopp commented 1 year ago

The chairman's speeches do not refer to a person that speaks: https://github.com/calzada/PARLAMINT-ES-MC/blob/7d8412564b9686b396376d33f5cd9befb009f3c4/ParlaMint.sample/ParlaMint-ES_2015-01-20-CD150120.xml#L105-L108

the information is present in the source: https://github.com/calzada/PARLAMINT-ES-MC/blob/7d8412564b9686b396376d33f5cd9befb009f3c4/CD.sample/CD150120.xml#L52

cd2parmamint.xsl needs to be improved:

parlamint2root.xsl needs changes too:

reported here: https://github.com/clarin-eric/ParlaMint/issues/696#issue-1765368729

rdelibanoc commented 1 year ago

@matyaskopp we need confirmation of this. How do we proceed to fix this error: do we change it in the source, or shall we change it in the TEI XML? I think we should change it in the source. Does this mean we run the markup and annotation process from scratch?

matyaskopp commented 1 year ago

We want to change the scripts, as I described in the issue.

You can start with data that is present in CD format:


side note: The general idea of adding new sources that I used in adding government members is

rdelibanoc commented 1 year ago

As you can see from the snapshot, in the original files for the "chair" we have the POST (e.g. Presidente) but we do not have the name.

[Screenshot: CleanShot 2023-07-18 at 14 12 14@2x]

Can we make do without this information (please)?

matyaskopp commented 1 year ago

the name is in the <chair> element and is the same for the whole meeting I guess: https://github.com/calzada/PARLAMINT-ES-MC/blob/7d8412564b9686b396376d33f5cd9befb009f3c4/CD.sample/CD150120.xml#L52

calzada commented 1 year ago

Yes, but we cannot access the name, since our transcripts only say MR PRESIDENT without further details. Best, MC

matyaskopp commented 1 year ago

so I have no idea what

<chair who="JESÚS POSADA MORENO"> 

means in CD files. According to Wikipedia, Jesús Posada was a chairman...

calzada commented 1 year ago

Yes, but he starts the session as chairman and the chair may change in the middle of it. This is why it is not reliable. We could easily add this person as chair, but it may turn out that the chair is not always him. Best, MC

matyaskopp commented 1 year ago

The body element allows multiple chair elements (https://github.com/calzada/PARLAMINT-ES-MC/blob/be3e2be3cf70619c4cd8513b90b19df4d54db87d/CD.sample/cd.dtd#L26), so I expected that if the chairman changed, there would be a new chair element.

I have checked your source CD files, and only one chair element exists. So in them you are claiming that there is no chairman change. If we use the CD files as a source, it is probably better to propagate this error (if it is an error, there are no chairman changes).

If there is no chairman change in the source PDF(?) files, then you should expect that there was no chairman change in the chamber of deputies.
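
To make the rule concrete, here is a minimal Python sketch of it, not the actual XSLT: each chair speech is attributed to the most recent chair element seen in the body, falling back to "UNKNOWN" when there is none. Only `body` and `<chair who="...">` come from this thread; the `doc` wrapper, the `speech` element, its `role` attribute, the second chair and the function name are hypothetical and only there to make the example run.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified CD-like input. Only <body> and <chair who="...">
# come from the thread; <doc>, <speech> and the role attribute are invented.
SAMPLE = """<doc><body>
  <chair who="JESÚS POSADA MORENO"/>
  <speech role="chair">Se abre la sesión.</speech>
  <speech role="member">Gracias, señor presidente.</speech>
  <chair who="SECOND CHAIR (HYPOTHETICAL)"/>
  <speech role="chair">Continuamos.</speech>
</body></doc>"""


def attribute_chair_speeches(xml_text, unknown="UNKNOWN"):
    """Pair each chair speech with the most recent <chair who="..."> seen."""
    body = ET.fromstring(xml_text).find("body")
    current = unknown  # used until the first <chair> element appears
    attributed = []
    for element in body:  # direct children, in document order
        if element.tag == "chair":
            current = element.get("who", unknown)
        elif element.tag == "speech" and element.get("role") == "chair":
            attributed.append((current, (element.text or "").strip()))
    return attributed


for who, text in attribute_chair_speeches(SAMPLE):
    print(f"{who}: {text}")
```

With a single chair element per file, this reduces to using the same name for the whole meeting.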

TomazErjavec commented 1 year ago

Hi! The way it was in 2.1 is to have one chair per session; this is the condition:

https://github.com/calzada/PARLAMINT-ES-MC/blob/be3e2be3cf70619c4cd8513b90b19df4d54db87d/bin/cd2parmamint.xsl#L431

calzada commented 1 year ago

Dear Matyas, this DTD applies to all our working parliaments (CD, EP, HC). But since there are divergences between and among parliaments, the DTD must accommodate all of them. Best, MC

matyaskopp commented 1 year ago

Can you give me a sample of chairman changes? I have found only a few, where PRESIDENTA changes to PRESIDENTE, but it looks more like a typo in the source (wrong gender). E.g. in https://github.com/calzada/PARLAMINT-ES-MC/blob/be3e2be3cf70619c4cd8513b90b19df4d54db87d/CD/CD230214.xml there are these chairs:

53  PRESIDENTA,UNKNOWN
1   PRESIDENTE,UNKNOWN
16  VICEPRESIDENTA,Elizo Serrano, María Gloria
13  VICEPRESIDENTA,Pastor Julián, Ana María
12  VICEPRESIDENTE,Rodríguez Gómez de Celis, Alfonso

There is only one occurrence of PRESIDENTE, so we can probably say it is a typo and set MERITXELL BATET LAMAÑA (https://github.com/calzada/PARLAMINT-ES-MC/blob/be3e2be3cf70619c4cd8513b90b19df4d54db87d/CD/CD230214.xml#L69) as chair. But I could be wrong. The idea is to record what is in the transcriptions; we can treat that as the truth...
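
A tally like the one in the list above, plus a flag for labels that occur only once, can be sketched in a few lines of Python. This is only an illustration: the function names and the threshold are assumptions, the counts are copied from the list, and the step that extracts the (role, name) pairs from the CD file is not shown.

```python
from collections import Counter

# (role, name) pairs as they might be extracted from one CD file; the counts
# mirror the list in this comment, the extraction step itself is not shown.
pairs = (
    [("PRESIDENTA", "UNKNOWN")] * 53
    + [("PRESIDENTE", "UNKNOWN")] * 1
    + [("VICEPRESIDENTA", "Elizo Serrano, María Gloria")] * 16
    + [("VICEPRESIDENTA", "Pastor Julián, Ana María")] * 13
    + [("VICEPRESIDENTE", "Rodríguez Gómez de Celis, Alfonso")] * 12
)


def tally_chairs(pairs):
    """Count how often each (role, name) label occurs."""
    return Counter(pairs)


def rare_labels(counts, threshold=1):
    """Return labels seen at most `threshold` times, sorted for stable output."""
    return sorted(label for label, n in counts.items() if n <= threshold)


counts = tally_chairs(pairs)
for (role, name), n in counts.most_common():
    print(f"{n:<3} {role},{name}")
print("review by hand:", rare_labels(counts))
```

A single occurrence is only a hint for manual review; it does not by itself decide whether the label is a typo or a real change of chair.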

calzada commented 1 year ago

It is not a typo. It is a situation where the chair changes. Regarding the VICEPRESIDENTs, it is not certain that they take the chair, although they may have taken it. Best, MC

calzada commented 1 year ago

@matyaskopp You could equally use "UNKNOWN". This is what we did at ECPC (our research group).