Open MichaelTiemannOSC opened 2 years ago
This is a highly relevant question, which we have discussed in an expert round. After discussions, we concluded that the cleanest approach would be a clear separation within the ITR tool between the calculation engine and data manipulation modules.
The problem you mentioned would then be tackled by preprocessing modules.
A) Identify a corporate event (outlier detection module, which requires interaction with an analyst who sets a corporate event flag and provides the correct LEIs of the affected entities)
B) Process the data according to the corporate event (time-series preprocessing module)
To A: In the first step, we propose a time-series outlier detection logic. Corporate events typically come with a significant change in revenue, so a company that shows a large year-over-year %-change in revenue would be flagged. The company's sector and its peers should also be considered: we could define a sector benchmark for what counts as a "normal" %-change in a given year and then evaluate each company's deviation from that benchmark. The ITR tool uses 5 years of historical CO2 emissions data, so it would be reasonable to use a 5-year time series for the revenue data as well. This outlier detection is good for a first screening to support an analyst, but it will most likely not be able to fully automate the reliable detection of corporate events. In a second step, an analyst will have to make the final classification of whether a corporate event has occurred, in addition to providing the corporate identifiers of the affected entities.
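The screening step in A could be sketched roughly as follows. This is a minimal illustration, not code from the ITR tool: the function names, the 30% threshold, and the revenue and benchmark figures are all assumptions made up for the example.

```python
from typing import Dict, List


def yoy_pct_change(revenue: Dict[int, float]) -> Dict[int, float]:
    """Year-over-year fractional change in revenue, keyed by the later year."""
    years = sorted(revenue)
    changes: Dict[int, float] = {}
    for prev, curr in zip(years, years[1:]):
        # Skip gaps in the series and undefined ratios.
        if curr - prev != 1 or revenue[prev] == 0:
            continue
        changes[curr] = (revenue[curr] - revenue[prev]) / revenue[prev]
    return changes


def flag_candidate_events(
    revenue: Dict[int, float],
    sector_benchmark: Dict[int, float],
    threshold: float = 0.30,  # hypothetical cut-off, to be tuned per sector
) -> List[int]:
    """Return the years whose revenue change deviates from the sector
    benchmark by more than `threshold`. These are only candidates for
    analyst review, not confirmed corporate events."""
    flagged = []
    for year, change in yoy_pct_change(revenue).items():
        expected = sector_benchmark.get(year, 0.0)
        if abs(change - expected) > threshold:
            flagged.append(year)
    return flagged


# Illustrative (made-up) 5-year revenue series and sector benchmark.
revenue = {2016: 100.0, 2017: 104.0, 2018: 101.0, 2019: 99.0, 2020: 260.0}
benchmark = {2017: 0.03, 2018: 0.0, 2019: -0.02, 2020: -0.05}
print(flag_candidate_events(revenue, benchmark))
```

The flagged years would then go to an analyst, who decides whether a corporate event actually occurred and supplies the LEIs of the affected entities.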
To B: We propose to build a synthetic time series dating back 5 years. This synthetic time series should represent a hypothetical world in which the corporate event had already occurred 5 years ago. For example, for a merger, the two affected companies would add up their CO2 emissions data, revenue, etc., as if they had already merged 5 years ago. This synthetic time series would then be the input for the ITR calculation engine and be used to project trajectory data.
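For the merger case in B, a minimal sketch of the synthetic series could look like this. The data layout (year mapped to a dict of metrics), the function name, and the numbers are assumptions for illustration only; the sketch also assumes all entities report in the same units and only combines years that every entity has reported.

```python
from typing import Dict, List

YearlyData = Dict[int, Dict[str, float]]  # year -> {metric name: value}


def build_synthetic_series(entities: List[YearlyData], metrics: List[str]) -> YearlyData:
    """Sum additive metrics across the merging entities, as if they had
    been a single company over the whole historical window."""
    if not entities:
        return {}
    # Only years reported by every entity can be combined without gaps.
    common_years = set(entities[0])
    for entity in entities[1:]:
        common_years &= set(entity)
    synthetic: YearlyData = {}
    for year in sorted(common_years):
        synthetic[year] = {m: sum(e[year][m] for e in entities) for m in metrics}
    return synthetic


# Illustrative (made-up) figures for an acquirer and a target.
acquirer = {
    2018: {"co2": 10.0, "revenue": 200.0},
    2019: {"co2": 11.0, "revenue": 210.0},
}
target = {
    2018: {"co2": 30.0, "revenue": 600.0},
    2019: {"co2": 29.0, "revenue": 590.0},
}
merged = build_synthetic_series([acquirer, target], ["co2", "revenue"])
print(merged[2019])
```

Only additive quantities such as absolute emissions and revenue are summed here. A ratio such as emissions intensity would be recomputed from the summed totals rather than added up.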
In 2016, Cleveland-Cliffs controlled 54.9% of all US-based iron mining. Their two largest customers, AK Steel and ArcelorMittal, combined for nearly 60% of overall product revenue and nearly 80% of iron ore product revenue. 5 years later, Cleveland-Cliffs acquired both companies. To meaningfully project trajectory data, historical AK Steel and ArcelorMittal data from 2015-2020 needs to carry forward into the Cleveland-Cliffs data from 2020 onward, but how should this be done? What guidance should go to data preparation about getting the right LEI, ISIN, and other information to stitch together the story of the largest integrated steel producer in the US?
What features should the ITR tool or the Data Commons provide to allow for company acquisitions to deliver meaningful data across such corporate events?