GermanZero-de / localzero-generator-core


A command to diff two data files on a single float column. #382

Closed bgrundmann closed 9 months ago

bgrundmann commented 9 months ago

This is a quick command to compare two data files using the same diff logic we use in the end-to-end tests. It was particularly useful while evaluating the new 2021 reference data.

For example, to get an idea of which facts have changed the most:

(.venv) benediktgrundmann@Benedikts-Air localzero-generator-core % python devtool.py data diff data/public/facts/20{18,21}.csv -value 3 | head -n 10
at .Fact_I_S_miner_cementchalk_lpg_fec_2018 expected 123888.88888888889 got 107500.0 (-13.23%)
at .Fact_E_P_wind_onshore_pct_of_gep_2018 expected 0.142 got 0.15394951046044425 (8.42%)
at .Fact_I_S_chem_basic_lpg_fec_2018 expected 473888.8888888889 got 423888.8888888889 (-10.55%)
at .Fact_R_S_lpg_fec_2018 expected 10110555.555555556 got 10423333.333333334 (3.09%)
at .Fact_I_P_metal_steel_primary_CO2e_eb_HKR_2018 expected 25741696.914700545 got 26949000.0 (4.69%)
at .Fact_I_S_metal_steel_primary_coal_fec_2018 expected 86656111.1111111 got 94259722.22222222 (8.77%)
at .Fact_B_S_lpg_fec_2018 expected 3007222.222222222 got 2578333.333333333 (-14.26%)
at .Fact_I_S_chem_basic_opetpro_fec_2018 expected 9948611.11111111 got 14431388.888888888 (45.06%)
at .Fact_I_S_biomass_fec_2018 expected 31304444.444444444 got 32683055.555555556 (4.40%)
at .Fact_F_P_jetfuel_prodvol_2018 expected 5101000.0 got 2892000.0 (-43.31%)
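
The percentages in parentheses are consistent with a signed relative change, `(got - expected) / expected * 100`. The following is a minimal sketch of that arithmetic, not the code from this PR: `relative_change_pct` and `diff_values` are hypothetical names, the zero-baseline handling is an assumption, and the output format is only modelled on the lines above.

```python
def relative_change_pct(expected: float, got: float) -> float:
    """Signed change from expected to got, as a percentage of expected."""
    if expected == 0:
        # Assumption: how the real tool treats a zero baseline is not known.
        return float("inf") if got != 0 else 0.0
    return (got - expected) / expected * 100.0


def diff_values(old: dict[str, float], new: dict[str, float]) -> list[str]:
    """Report every label present in both mappings whose value differs."""
    lines = []
    for label in sorted(old.keys() & new.keys()):
        expected, got = old[label], new[label]
        if expected != got:
            pct = relative_change_pct(expected, got)
            lines.append(f"at .{label} expected {expected} got {got} ({pct:.2f}%)")
    return lines


# Two values taken from the output above.
old = {
    "Fact_F_P_jetfuel_prodvol_2018": 5101000.0,
    "Fact_E_P_wind_onshore_pct_of_gep_2018": 0.142,
}
new = {
    "Fact_F_P_jetfuel_prodvol_2018": 2892000.0,
    "Fact_E_P_wind_onshore_pct_of_gep_2018": 0.15394951046044425,
}
for line in diff_values(old, new):
    print(line)
```

Note that the sign follows the direction of the change, so a negative baseline that moves towards zero is reported as a negative percentage.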

Or, to get the output in a form that is ready to paste into Excel:

(.venv) benediktgrundmann@Benedikts-Air localzero-generator-core % python devtool.py data diff data/public/facts/20{18,21}.csv -value 3 -csv | head -n 10
.Fact_L_G_forest_CO2e_DE_2018,-66995500.0,-41408800.0,-38.191669589748564
.Fact_I_S_other_food_fueloil_fec_2018,1563055.5555555555,1470000.0,-5.9534387773236155
.Fact_I_P_chem_other_prodvol_2018,3813000.0,3870482.4120603013,1.5075376884422051
.Fact_I_S_miner_cementchalk_fueloil_fec_2018,816388.8888888889,764722.2222222222,-6.328683225586929
.Fact_A_S_lpg_fec_2018,2388888.888888889,2250000.0,-5.813953488372097
.Fact_M_CO2e_wo_lulucf_2015,904262000.0,896657872.0,-0.8409208835492369
.Fact_W_P_wastewater_prodvol_2017,1713185.0,1740556.0,1.597667502342129
.Fact_E_P_elec_prodvol_netto_2018,513326944.4444444,496699187.5,-3.239213745625617
.Fact_L_G_wetland_peat_dead_CO2e_2018,29210.0,63400.0,117.0489558370421
.Fact_I_S_other_further_heatnet_fec_2018,9374722.222222222,9036944.444444444,-3.603069720584316
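
Since `head -n 10` only shows the first ten rows, the CSV form is also handy for ranking the changes by size. Here is a rough sketch that assumes the four-column layout seen above (label, old value, new value, percentage change); `largest_changes` is a hypothetical helper and not part of `devtool.py`.

```python
import csv
import io


def largest_changes(csv_text: str, limit: int = 10) -> list[tuple[str, float, float, float]]:
    """Return the rows with the largest absolute percentage change first."""
    rows = []
    for label, expected, got, pct in csv.reader(io.StringIO(csv_text)):
        rows.append((label, float(expected), float(got), float(pct)))
    # Sort on the magnitude of the percentage so large drops rank with large rises.
    rows.sort(key=lambda row: abs(row[3]), reverse=True)
    return rows[:limit]


# Three rows copied from the output above.
sample = """\
.Fact_L_G_forest_CO2e_DE_2018,-66995500.0,-41408800.0,-38.191669589748564
.Fact_I_P_chem_other_prodvol_2018,3813000.0,3870482.4120603013,1.5075376884422051
.Fact_L_G_wetland_peat_dead_CO2e_2018,29210.0,63400.0,117.0489558370421
"""

for label, expected, got, pct in largest_changes(sample, limit=2):
    print(f"{label}: {pct:+.2f}%")
```

In practice the text would come from the command's standard output rather than an inline string.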

Ready to rock

(.venv) benediktgrundmann@Benedikts-Air localzero-generator-core % python devtool.py ready_to_rock
WARNING: there is a new pyright version available (v1.1.301 -> v1.1.351).
Please install the new version or set PYRIGHT_PYTHON_FORCE_VERSION to `latest`

No configuration file found.
pyproject.toml file found at /Users/benediktgrundmann/Programming/localzero/localzero-generator-core.
Loading pyproject.toml file at /Users/benediktgrundmann/Programming/localzero/localzero-generator-core/pyproject.toml
Assuming Python platform Darwin
Auto-excluding **/node_modules
Auto-excluding **/__pycache__
Auto-excluding **/.*
Searching for source files
Found 201 source files
pyright 1.1.301
0 errors, 0 warnings, 0 informations
Completed in 2.631sec
=========================================================== test session starts ============================================================
platform darwin -- Python 3.10.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /Users/benediktgrundmann/Programming/localzero/localzero-generator-core
plugins: anyio-3.6.2, cov-3.0.0
collected 45 items

tests/test_devtool_commands.py ..............                                                                                        [ 31%]
tests/test_end_to_end.py ......s.s.s.s.s.s                                                                                           [ 68%]
tests/test_entries.py .                                                                                                              [ 71%]
tests/test_refdata.py ....                                                                                                           [ 80%]
tests/test_tracing.py .........                                                                                                      [100%]

====================================================== 39 passed, 6 skipped in 6.09s =======================================================
Trim Trailing Whitespace.................................................Passed
Mixed line ending........................................................Passed
Check for case conflicts.................................................Passed
Check Yaml...............................................................Passed
Check for added large files..............................................Passed
Don't commit to branch...................................................Passed
black....................................................................Passed