opensanctions / crawler-planning

Task tracking for the crawlers we're working on
https://github.com/orgs/opensanctions/projects/2

French official economic interests #165

Open jbothma opened 2 months ago

jbothma commented 2 months ago

Data URL

https://www.hatvp.fr/le-repertoire/

Publisher

HAUTE AUTORITÉ POUR LA TRANSPARENCE DE LA VIE PUBLIQUE

Publisher country/territory code

fr

Type of data

PEPs (Politically Exposed Persons)

Coverage region

region:Europe

Can you tell us more?

A new, potentially interesting list for you: all public officials (aka PEPs) in France must publish their economic interests, and this is now available as open data: https://www.hatvp.fr/le-repertoire/ You'll find in this repository:

This is a suggestion or request

jbothma commented 1 month ago

So I think the idea is that the top leadership of state-owned enterprises, as well as deputies and senators, would be the targets in this dataset.

Unfortunately, I don't think they're available as structured data. I only find them mentioned in free text, e.g. if I grep for "Rencontre avec". @bgmello, did you see any structured data referring to members of government?

The structured data I find seems to be the leaders and workers of the lobbying organisation themselves.

@pudo do we want to try and regex out the PEPs from the free text? Do we want to include the lobbyists meeting them as RCAs?
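
To make the regex question concrete, here is a minimal sketch of what such an extraction might look like. It assumes the free text mentions officials as "Rencontre avec", an optional title, then a capitalised name; the title list, the name pattern, and the `extract_names` helper are all hypothetical and not taken from the actual data or from zavod.

```python
import re

# Hypothetical list of titles that may precede a name in the free text.
TITLE = r"(?:M\.|Mme|le ministre|la ministre|le député|la députée|le sénateur|la sénatrice)"
# Assumption: a name is two or more capitalised words (hyphens and apostrophes allowed).
NAME = r"[A-ZÀ-Ý][\w'\-]+(?:\s+[A-ZÀ-Ý][\w'\-]+)+"
PATTERN = re.compile(rf"Rencontre avec\s+(?:{TITLE}\s+)*(?P<name>{NAME})")


def extract_names(text: str) -> list[str]:
    """Return the names that follow "Rencontre avec" in a free-text field."""
    return [match.group("name") for match in PATTERN.finditer(text)]


sample = (
    "Rencontre avec M. Jean Dupont au sujet du budget. "
    "Rencontre avec Mme Marie-Claire Durand, sénatrice."
)
print(extract_names(sample))
```

A pattern like this would miss officials who are referred to only by their office, and it can pick up capitalised words that are not person names, so any matches would probably still need manual review.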


bgmello commented 1 month ago

The PEPs don't appear to be in structured data. From what I understood, this dataset is a list of self-declarations by lobbyists.

pudo commented 1 month ago

Hmmm I'm sorry about the bad research I did on this dataset initially. This would be cool to be able to parse, but I don't think we can realistically expect to do it before we've strapped an LLM to zavod somewhere. My proposal is that we abort here and move the card back to the proposals list.

Thank you for doing the work to take it this far, @bgmello!

bgmello commented 1 month ago

No problem. I can try searching for the French PEP data to see whether it's available.