Klimatbyran / garbo

Klimatkollen's data pipeline, processing company sustainability reports

Investigate ESEF format #104

Open LudwikJaniuk opened 5 months ago

LudwikJaniuk commented 5 months ago

Not strictly a code issue with this repo, but I want to put the work so far somewhere visible.

Research question: Many companies report in the ESEF format. Does ESEF contain the emissions data we want, and if so, how can we extract it reliably?

LudwikJaniuk commented 5 months ago

Example: https://www.ssab.com/en/company/investors/reports-and-presentations

Relevant links I'm reviewing:

Quote:

Software developers tend to do one of two things when first approaching XBRL. They make an assumption that it’s “just” XML and consequently underestimate the size of the task they have. Or, they get bogged down in the formal specifications and overestimate the size of the task.

Opportunity: In the best case, ESEF turns out to be a more complete and reliable machine-readable format. That would give us a more reliable and efficient way of gathering companies' emissions data without having to parse PDFs using AI. AI could most likely still help in many ways; we would simply skip a few steps.

Risks: On the other hand, the format may turn out to be impossible for us to parse in practice, companies may not actually report properly in it, or too few companies may use ESEF.
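
To make the research question concrete, here is a minimal sketch of a first probe. ESEF reports are XHTML files with inline XBRL tags, so the sketch lists every numeric fact tagged with `ix:nonFraction` and then filters on concept names that look emissions-related. Everything specific in it is an assumption for illustration: the sample document is made up, `example:Scope1Emissions` is a hypothetical concept name (whether real reports tag emissions at all is exactly what we need to find out), and the keyword list and function names are mine. It also ignores contexts, units, scale and sign, which a real extractor would have to resolve.

```python
import xml.etree.ElementTree as ET

# Namespace of Inline XBRL 1.1 elements such as ix:nonFraction.
IX_NS = "http://www.xbrl.org/2013/inlineXBRL"

# Made-up sample document. "example:Scope1Emissions" is a hypothetical
# concept name used only to exercise the filter below.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ix="http://www.xbrl.org/2013/inlineXBRL">
  <body>
    <p>Revenue: <ix:nonFraction name="ifrs-full:Revenue" contextRef="c1"
        unitRef="SEK" scale="6" decimals="-6">1,234</ix:nonFraction></p>
    <p>Scope 1: <ix:nonFraction name="example:Scope1Emissions" contextRef="c1"
        unitRef="tCO2e" decimals="0">5,678</ix:nonFraction></p>
  </body>
</html>"""


def list_numeric_facts(xhtml_text):
    """Return one dict per ix:nonFraction element found in the document."""
    root = ET.fromstring(xhtml_text)
    facts = []
    for el in root.iter(f"{{{IX_NS}}}nonFraction"):
        facts.append({
            "name": el.get("name"),           # concept QName, e.g. "ifrs-full:Revenue"
            "context": el.get("contextRef"),  # points at period/entity info (not resolved here)
            "unit": el.get("unitRef"),        # points at a unit definition (not resolved here)
            "scale": el.get("scale"),         # power-of-ten multiplier, may be None
            "text": "".join(el.itertext()).strip(),  # value as displayed in the report
        })
    return facts


def find_by_keyword(facts, keywords=("emission", "ghg", "co2")):
    """Keep facts whose concept name contains any keyword (case-insensitive)."""
    return [
        f for f in facts
        if any(k in (f["name"] or "").lower() for k in keywords)
    ]


if __name__ == "__main__":
    for fact in find_by_keyword(list_numeric_facts(SAMPLE)):
        print(fact)
```

Running this against a handful of real reports would quickly show which concept names companies actually use and whether any of them relate to emissions. If nothing matches, that points to the "companies don't actually report it in ESEF" risk rather than a parsing problem.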