opensanctions / crawler-planning

Task tracking for the crawlers we're working on
https://github.com/orgs/opensanctions/projects/2

French official economic interests #165

Open jbothma opened 6 months ago

jbothma commented 6 months ago

Data URL

https://www.hatvp.fr/le-repertoire/

Publisher

HAUTE AUTORITÉ POUR LA TRANSPARENCE DE LA VIE PUBLIQUE

Publisher country/territory code

fr

Type of data

PEPs (Politically Exposed Persons)

Coverage region

region:Europe

Can you tell us more?

A new, potentially interesting list for you: all public officials (aka PEPs) in France must publish their economic interests, and the data is now available as open data: https://www.hatvp.fr/le-repertoire/ You'll find in this deposit:

This is a suggestion or request

jbothma commented 5 months ago

So I think the idea is that the top leadership of state-owned enterprises, as well as deputies and senators, would be the targets in this dataset.

Unfortunately, I don't think they're available as structured data. I only find them mentioned in free text, e.g. if I grep for "Rencontre avec" ("Meeting with"). @bgmello, did you see any structured data referring to members of government?

The structured data I find seems to cover the leaders and workers of the lobbying organisations themselves.

@pudo do we want to try and regex out the PEPs from the free text? Do we want to include the lobbyists meeting them as RCAs?
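To make the regex idea concrete, here is a minimal sketch of what pulling names out of the free text might look like. Only the "Rencontre avec" phrase comes from the data as described above; the honorific list, the name pattern, the function name `extract_meeting_names`, and the sample sentence are all assumptions made up for illustration, and real entries are likely messier than this.

```python
import re

# Assumed pattern: "Rencontre avec", an optional honorific, then two or more
# capitalised words. The honorifics and name shape are guesses, not taken
# from the actual HATVP data.
MEETING_RE = re.compile(
    r"Rencontre avec\s+(?:(?:M\.|Mme|MM\.)\s+)?"
    r"(?P<name>[A-ZÀ-Ý][\w'-]+(?:\s+[A-ZÀ-Ý][\w'-]+)+)"
)


def extract_meeting_names(text: str) -> list[str]:
    """Return candidate person names that follow 'Rencontre avec'."""
    return [match.group("name") for match in MEETING_RE.finditer(text)]


# Invented sample sentence, not a real record.
sample = "Rencontre avec Mme Jeanne Dupont, députée, au sujet du projet de loi."
print(extract_meeting_names(sample))
```

A pattern like this would miss mentions such as "Rencontre avec le cabinet du ministre", where no person is named, so it could only ever produce candidates that still need matching against a known list of officials.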


bgmello commented 5 months ago

The PEPs don't appear to be in structured data. From what I understood, this dataset is a list of self-declarations by lobbyists.

pudo commented 5 months ago

Hmmm I'm sorry about the bad research I did on this dataset initially. This would be cool to be able to parse, but I don't think we can realistically expect to do it before we've strapped an LLM to zavod somewhere. My proposal is that we abort here and move the card back to the proposals list.

Thank you for doing the work to take it this far, @bgmello!

bgmello commented 5 months ago

No problem. I can try searching for the French PEP data to see if it's available.

Arnaudschw commented 1 month ago

This is a critical piece of data for EU / FR users :) The list of lobbyists is nice, but the true heart of this data is the list of public company leaders, detailed under "dirigeants". I believe this part of the data could be parsed, as it's better structured than the rest of the file.

However, from the look of it, it might be more extensive than expected, as some 2nd- and 3rd-level executives are mentioned, and there's no easy way to remove or flag them (only the title indicates limited powers).
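As a rough illustration of flagging by title, the sketch below classifies a job title string with keyword lists. The keyword lists and the function names `normalise` and `looks_top_level` are hypothetical and chosen only for this example; they are not a vetted classification of French corporate roles, and nothing here assumes a particular layout for the "dirigeants" records.

```python
import unicodedata

# Illustrative guesses only: titles suggesting top-level leadership, and
# word prefixes suggesting a deputy or otherwise limited role.
SENIOR_KEYWORDS = ("president", "directeur general", "directrice generale")
JUNIOR_PREFIXES = ("adjoint", "delegue", "vice")


def normalise(title: str) -> str:
    """Lower-case the title, strip accents and turn hyphens into spaces."""
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold().replace("-", " ")


def looks_top_level(title: str) -> bool:
    """Heuristic: senior keyword present and no word marking a deputy role."""
    text = normalise(title)
    if any(word.startswith(JUNIOR_PREFIXES) for word in text.split()):
        return False
    return any(keyword in text for keyword in SENIOR_KEYWORDS)


for title in ("Président-directeur général", "Directeur général adjoint"):
    print(title, looks_top_level(title))
```

A heuristic like this would only support flagging entries for review; whether a given title counts as a PEP role would still need a human decision.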