calzada / PARLAMINT-ES-MC


Conflicting dates of party affiliations #19

Closed TomazErjavec closed 1 year ago

TomazErjavec commented 3 years ago

Now that we have coalition/opposition information, a new bug has appeared: five persons are members of more than one political party at certain times. These are the baddies:

  1. GarcíaJosé
  2. GonzálezMaría
  3. MartínezMaría
  4. MartínMaría
  5. RodríguezMaría

E.g. the first one has:

<affiliation role="member" ref="#party.PP"     from="2015-05-12" to="2019-02-19"/>
<affiliation role="member" ref="#party.PPFORO" from="2017-12-13" to="2018-09-25"/>
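A check for this kind of conflict can be sketched as follows. This is only an illustration, not the validation code used by ParlaMint: the `<person>` wrapper, the absence of a TEI namespace, and the function name `overlapping_memberships` are assumptions, and it expects every affiliation to carry both `from` and `to`.

```python
import xml.etree.ElementTree as ET
from datetime import date
from itertools import combinations

# The two affiliation elements quoted above, wrapped in a hypothetical
# <person> element without a namespace (real ParlaMint files use the TEI namespace).
SAMPLE = """
<person>
  <affiliation role="member" ref="#party.PP" from="2015-05-12" to="2019-02-19"/>
  <affiliation role="member" ref="#party.PPFORO" from="2017-12-13" to="2018-09-25"/>
</person>
"""

def overlapping_memberships(person_xml):
    """Return (party_a, party_b, start, end) for memberships of different parties that overlap."""
    person = ET.fromstring(person_xml)
    spans = [
        (a.get("ref"), date.fromisoformat(a.get("from")), date.fromisoformat(a.get("to")))
        for a in person.iter("affiliation")
        if a.get("role") == "member"
    ]
    clashes = []
    for (ref_a, from_a, to_a), (ref_b, from_b, to_b) in combinations(spans, 2):
        # Two closed intervals overlap when each one starts no later than the other ends.
        if ref_a != ref_b and from_a <= to_b and from_b <= to_a:
            clashes.append((ref_a, ref_b, max(from_a, from_b), min(to_a, to_b)))
    return clashes

clashes = overlapping_memberships(SAMPLE)
for ref_a, ref_b, start, end in clashes:
    print(f"{ref_a} and {ref_b} overlap from {start} to {end}")
```

For the sample, the reported overlap is the whole PPFORO period, since it lies entirely inside the PP period.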

It just might be that I have a bug in the conversion to ParlaMint, but it seems more likely that the bug is in the original data. What can we do?

I put the error log for this on https://nl.ijs.si/et/tmp/ParlaMint/Logs/ES-multi-errror.log

What do you think?

calzada commented 3 years ago

I think we should go for "Nothing", at least for the time being. We could mend some issues in v.3 in the future and fix these bugs then. What we should do is leave these issues open so that we remember the bugs in the future.

What do you think about my decision????

Best for now,

mc

TomazErjavec commented 3 years ago

I agree! But I will put it as an issue on ParlaMint as well, so it's documented in the right place. Come to think of it, I have to make an issue for each country, without one, it is not perfect :)

calzada commented 3 years ago

Good, job you are so meticulous!!!!

mc

TomazErjavec commented 3 years ago

Nice comma! Do you know the book "Eats, shoots, and leaves"? (The answer is: panda)

calzada commented 3 years ago

Hahahah. It is getting to that point when commas are failing me!!! P.S. I will remember that: the answer is panda!!!

rdelibanoc commented 1 year ago

Are we still changing this? Or shall we leave it as it is? And if we are changing it, where shall we implement the changes? Which files? @matyaskopp

matyaskopp commented 1 year ago

First, you have to investigate what happened there: whether it is a bug in the source data or some MPs really migrated between parties multiple times. Party/group affiliation is based on information from meetings: it takes the oldest and newest dates and sets the affiliation period from them. It can be done better: take each oldest-to-newest affiliation interval within which no affiliation with a different political party is present. But first we have to know whether this is the issue...
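The difference between the two ways of deriving affiliation periods can be sketched like this. It is a minimal illustration under assumptions, not the conversion code itself: the observations are made-up (meeting date, party) pairs for one hypothetical MP, and `naive_periods` and `run_periods` are hypothetical names.

```python
from datetime import date
from itertools import groupby

def naive_periods(observations):
    """Approach described above: one period per party, from its oldest to its newest date."""
    periods = {}
    for day, party in observations:
        first, last = periods.get(party, (day, day))
        periods[party] = (min(first, day), max(last, day))
    return periods

def run_periods(observations):
    """Suggested refinement: one period per uninterrupted run of the same party."""
    result = []
    # Sort by date, then group consecutive observations that share a party.
    for party, run in groupby(sorted(observations), key=lambda obs: obs[1]):
        days = [day for day, _ in run]
        result.append((party, days[0], days[-1]))
    return result

# Made-up data: the MP sits with party "A", moves to "B", then returns to "A".
observations = [
    (date(2016, 1, 10), "A"),
    (date(2016, 6, 1), "A"),
    (date(2017, 3, 5), "B"),
    (date(2017, 9, 9), "B"),
    (date(2018, 2, 2), "A"),
]

print(naive_periods(observations))  # the "A" period swallows the whole "B" period
print(run_periods(observations))    # three periods that do not overlap
```

With the naive approach the "A" period spans 2016-01-10 to 2018-02-02 and therefore contains the "B" period; the run-based version yields A, B, A as separate, non-overlapping periods.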

matyaskopp commented 1 year ago

Did anybody have time to take a look at this issue? @calzada @rdelibanoc @OceanicFlight @MonicaAlbini

This issue produces this validation error:

ERROR: multiple party statuses for MartínezMaría on 2021-01-28: Coalition Opposition

And can produce more errors when the whole corpus is validated.

PLEASE, do not touch the code or data in the repository and reply in this thread. Could you let me know what is wrong and how you suggest fixing it?

matyaskopp commented 1 year ago

Now I see the reason. Creating a unique id from first forename and first surname was not a good idea.

These

cat CD/*|sed 's/\r//'| grep 'name>Martínez.*, María'|sort|uniq -c
     36 <name>Martínez Ferro, María Valentina</name>
    145 <name>Martínez Granados, María Carmen</name>
     16 <name>Martínez Rodríguez, María Rosa</name>
     59 <name>Martínez Seijo, María Luz</name>

have been merged into one person: https://github.com/calzada/PARLAMINT-ES-MC/blob/0d77f69704b884e4fa3b4e8827daa4fdd471c14b/ParlaMint-ES.TEI/ParlaMint-ES-listPerson.xml#L6563-L6580

with: https://github.com/calzada/PARLAMINT-ES-MC/blob/0d77f69704b884e4fa3b4e8827daa4fdd471c14b/bin/cd2parmamint.xsl#L652-L660

I will fix this and we will see...

matyaskopp commented 1 year ago

The full name is needed for person identification. Currently, 930 persons in CD are merged into 890 persons (+67 government persons added -> after merging we have 902 persons in listPerson file).
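The collision is easy to reproduce outside the XSLT. The sketch below is not the actual fix: `short_id` imitates the old scheme as described in this thread, while `full_id` and its id format are hypothetical. The names are the four from the `grep` output above.

```python
def short_id(name):
    """Old scheme as described in the thread: first surname + first forename."""
    surnames, forenames = (part.split() for part in name.split(","))
    return surnames[0] + forenames[0]

def full_id(name):
    """Hypothetical full-name scheme: every surname and forename token, concatenated."""
    surnames, forenames = (part.split() for part in name.split(","))
    return "".join(surnames + forenames)

# Names in "Surnames, Forenames" form, as they appear in the CD source.
names = [
    "Martínez Ferro, María Valentina",
    "Martínez Granados, María Carmen",
    "Martínez Rodríguez, María Rosa",
    "Martínez Seijo, María Luz",
]

short_ids = {short_id(n) for n in names}
full_ids = {full_id(n) for n in names}
print(short_ids)      # one id shared by all four people
print(len(full_ids))  # one id per person
```

All four people collapse into `MartínezMaría` under the short scheme, which matches the merged entry in the listPerson file; using every name token keeps them apart.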

fix generating ids in:

calzada commented 1 year ago

I will do so, but I can only address it on Monday. Sorry, I thought it was a past issue. Best, mc


matyaskopp commented 1 year ago

This is now fixed