transportenergy / database

Tools for accessing and maintaining the iTEM model & historical databases
https://transportenergy.rtfd.io
GNU General Public License v3.0

T001: magnitude error in raw data for CHN #57

Closed soniayeh closed 3 years ago

soniayeh commented 3 years ago

@noussan raised this issue in #32: the T001 data for China contain an error, a difference of two orders of magnitude between data up to 2001 and data after 2002.

@khaeru has applied a temporary fix in our Python code (#40), specifically in the file https://github.com/transportenergy/database/blob/master/item/historical/scripts/T001.py

The longer-term fix is to ask the data provider for T001 (https://github.com/transportenergy/metadata/blob/be0183f/historical/sources.yaml#L12-L20), our @transportenergy/itf-oecd colleagues, to correct their raw data so that we can remove our temporary Python fix.

This would be a good exercise for us.

RachelePoggi commented 3 years ago

If I understood correctly, T001 refers to coastal shipping. I think this issue has been fixed in our database. Maybe you can take a look and check whether I looked at the right variable.

soniayeh commented 3 years ago

Yes, you are correct. The source (ITF) has corrected the data. We should therefore remove the temporary fix from #40 in the T001 script (https://github.com/transportenergy/database/blob/master/item/historical/scripts/T001.py), so that the multiplier of 100 is no longer applied to data up to 2001. @khaeru @hlinero
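For anyone following along, here is a rough sketch of the kind of correction being removed. It is not the actual code in T001.py: the function name, the record layout (`country`, `year`, `value` keys) and the sample numbers are all made up for illustration. Only the factor of 100 and the 2001 cutoff come from this thread.

```python
def apply_temporary_fix(rows, country="CHN", last_bad_year=2001, factor=100):
    """Return a copy of rows with the magnitude correction applied.

    Hypothetical helper: the real T001.py may be structured differently.
    """
    fixed = []
    for row in rows:
        row = dict(row)  # copy, so the raw input is left untouched
        if row["country"] == country and row["year"] <= last_bad_year:
            row["value"] *= factor
        fixed.append(row)
    return fixed


# Illustrative values only, not taken from the ITF data set.
rows = [
    {"country": "CHN", "year": 2001, "value": 1.5},
    {"country": "CHN", "year": 2002, "value": 160.0},
    {"country": "USA", "year": 2001, "value": 40.0},
]
fixed = apply_temporary_fix(rows)
```

Once the source data are correct, keeping a step like this would inflate the values up to 2001 by a factor of 100, which is why it has to go.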

khaeru commented 3 years ago

We should try to keep these discussions streamlined, as far as possible. Currently the information about this issue is split across comments:

The rule of thumb should be to continue with the original issue as far as possible, so I'll reply there.

soniayeh commented 3 years ago

Here is what we will do to fix this issue:

  1. Update the T001 input data in https://github.com/transportenergy/metadata/tree/master/historical/input with the corrected data from https://stats.oecd.org/Index.aspx?DataSetCode=ITF_GOODS_TRANSPORT
  2. Remove the temporary fix from https://github.com/transportenergy/database/blob/master/item/historical/scripts/T001.py
  3. Regenerate the merged data file

@hlinero will take care of this and post his update here.
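Before removing the temporary fix, it may be worth confirming that the newly downloaded data no longer show the jump. A minimal sketch of such a check, assuming the China series is available as a `{year: value}` mapping with positive values; the function names, the threshold of 10 and the sample numbers are my own assumptions, not part of the repository:

```python
from statistics import median


def magnitude_jump(series, break_year=2002):
    """Ratio of the median value from break_year onward to the median before it."""
    before = [v for y, v in series.items() if y < break_year]
    after = [v for y, v in series.items() if y >= break_year]
    if not before or not after:
        raise ValueError("need data on both sides of break_year")
    return median(after) / median(before)


def looks_corrected(series, break_year=2002, max_ratio=10.0):
    """True if the two periods differ by less than a factor of max_ratio."""
    ratio = magnitude_jump(series, break_year)
    return 1 / max_ratio < ratio < max_ratio


# Illustrative values only, not taken from the ITF data set.
uncorrected = {2000: 1.0, 2001: 1.2, 2002: 130.0, 2003: 140.0}
corrected = {2000: 100.0, 2001: 120.0, 2002: 130.0, 2003: 140.0}

print(looks_corrected(uncorrected))
print(looks_corrected(corrected))
```

Using medians rather than single years keeps the check from being thrown off by one unusual value on either side of the break.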