iati-data-access / iati-flattener

Library to flatten IATI data
GNU Affero General Public License v3.0

Bug with some `value_local` budget values which don't seem to correspond to `value_usd` #9

Closed simon-20 closed 9 months ago

simon-20 commented 1 year ago

Example:

transaction-MO.csv - e.g., value_eur = 34696, value_local = 3.020125507912883e-226 (and similar for other rows):

Screenshot 2023-08-24 092456

Helpful info from @markbrough:

"We use codeforiati/imf-exchangerates which scrapes the exchange rates from the IMF site (NB we use the file "IMF Currencies converted to USD (end of month)"): https://github.com/codeforiati/imf-exchangerates"

Another issue, which may well be caused by the same bug: in some files, e.g. transaction-GF.csv, the value generated for value_local differs between rows even when value_usd, value_eur, and exchange_rate are all the same:

Screenshot 2023-08-24 091515
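A check along these lines could help find the affected rows. This is only a minimal sketch: the column names are the ones mentioned in this issue, the sample rows are made up (they are not taken from transaction-GF.csv), and `inconsistent_local_values` is a hypothetical helper, not part of iati-flattener.

```python
import csv
import io
from collections import defaultdict


def inconsistent_local_values(rows):
    """Return groups of rows that share value_usd, value_eur and
    exchange_rate but have more than one distinct value_local.

    Values are compared as the raw strings read from the CSV.
    """
    groups = defaultdict(set)
    for row in rows:
        key = (row["value_usd"], row["value_eur"], row["exchange_rate"])
        groups[key].add(row["value_local"])
    return {key: values for key, values in groups.items() if len(values) > 1}


# Made-up sample data for illustration only.
sample = io.StringIO(
    "value_usd,value_eur,exchange_rate,value_local\n"
    "100.0,90.0,1.0,100.0\n"
    "100.0,90.0,1.0,3.0e-226\n"
    "50.0,45.0,1.0,50.0\n"
)
suspect = inconsistent_local_values(csv.DictReader(sample))
for key, values in suspect.items():
    print(key, sorted(values))
```

To run it against a real output file, pass `csv.DictReader(open(path, newline=""))` instead of the in-memory sample.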

simon-20 commented 1 year ago

The bug (as would be expected) also affects the XLSX outputs. An example is the AT.xlsx file:

image

IATI identifier here: CH-FDJP-110.162.567-FIND2021

Incorrect value as it appears in the file: 4.34088451885213E-245

This IATI activity is from file: finddiagnostics-activities.xml

simon-20 commented 1 year ago

Other IATI activity identifiers which are affected by the problem:

- XM-DAC-47015-19129_Bilateral_CacaoNet-GlobalNetworkonCacaoGeneticResourcesConservationandUse_Bioversity (in cgiar-activities-grants.xml)
- XM-DAC-41116-PROJECT-01724 (in un-environment-xm-dac-41116.xml)
- GB-GOV-13-GCRF-BF-7TNK9LD-GBYPTX3 (in beis-gb-gov-13-op-costs-gcrf.xml)
- GB-GOV-13-GCRF-BF-7TNK9LD-NLFLATK (also in beis-gb-gov-13-op-costs-gcrf.xml)
- GB-GOV-13-GCRF-BF-7TNK9LD-YNLLBYF

simon-20 commented 1 year ago

The problematic values in the xlsx files (generated by flask group) derive from the problematic values in the csv files (generated by flask process).

See, for instance, file transaction-AT.csv, lines 148-149 of which correspond to an expenditure which comes from the finddiagnostics-activities.xml file. The values for value_usd and value_eur are correct: the overall budget is ~$71.1 million; 0.0001% of this is for country AT, which amounts to ~$71.19. This is split into two sectors, one with 72% and one with 28%, which produces figures of $51.25 and $19.93. That is what we see in the value_usd column for these two rows, and correspondingly for value_eur.
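As a quick sanity check on that arithmetic, the two quoted value_usd figures can be turned back into shares. This sketch uses only the numbers quoted in this comment; the variable names are illustrative.

```python
# value_usd for the two sector rows, as quoted in this comment.
reported_usd = [51.25, 19.93]

# Their sum should be close to the ~$71.19 attributed to country AT.
country_total = sum(reported_usd)

# Each row's share of the country total, rounded to two decimals.
shares = [round(value / country_total, 2) for value in reported_usd]

print(round(country_total, 2), shares)
```

The shares come out at 72% and 28%, which agrees with the sector split described above, so value_usd looks internally consistent.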

So the problem is only with value_local, which in this case is 3.04048174920934E-10. This figure is found in the csv file.

Interestingly, for this case, this should just be in EUR, so should be the same as value_eur, ~41 EUR.

But also, the value_local figures in the two rows for the sector split do not amount to a 72%/28% split. So there is definitely an anomaly in the code which produces the value_local figure, which is in the FlattenIATIData class.
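The split mismatch can be tested by comparing each column's shares. The following is a rough sketch, not code from the repository: `shares` and `same_split` are hypothetical helpers, and apart from the first value_local figure and the value_usd figures quoted above, the numbers are made up for illustration.

```python
import math


def shares(values):
    """Return each value as a fraction of the column total."""
    total = sum(values)
    return [value / total for value in values]


def same_split(column_a, column_b, tol=0.01):
    """True if both columns divide their totals in the same proportions."""
    return all(
        math.isclose(a, b, abs_tol=tol)
        for a, b in zip(shares(column_a), shares(column_b))
    )


value_usd = [51.25, 19.93]  # quoted in this issue

# First figure is quoted in this issue; the second is made up.
value_local_bad = [3.04048174920934e-10, 5.0e-11]

# Made-up figures: ~41 EUR split 72%/28%, which is what we would expect.
value_local_expected = [29.52, 11.48]

print(same_split(value_usd, value_local_bad))
print(same_split(value_usd, value_local_expected))
```

A check like this only flags rows whose proportions disagree; it says nothing about whether the absolute value_local amounts are right.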

In this particular case, we see the same value in the csv and xlsx files because there is no aggregation to do. The same bug is likely also behind cases where the figures don't match across csv and xlsx files; there the difference would be due to aggregation.

simon-20 commented 1 year ago

The bug is caused by code in FlatTransaction.make_flattened and similar code for the creation of budgets. I have fixed it and will create a PR next week.