add `prepare_weo_2022_scenario()`

RMI-PACTA / pacta.scenario.data.preparation

The goal of {pacta.scenario.data.preparation} is to prepare and format all scenario input datasets required to run the {pacta.portfolio.allocate} tool.

https://rmi-pacta.github.io/pacta.scenario.data.preparation/

add `prepare_weo_2022_scenario()` #29

Closed cjyetman closed 5 months ago

cjyetman commented 5 months ago

work towards https://github.com/RMI-PACTA/workflow.scenario.preparation/issues/9
precedes https://github.com/RMI-PACTA/workflow.scenario.preparation/pull/25

⚠️ The raw_data_from_provider/used_in_pacta.scenario_preparation/IEA-EV-dataEV salesCarsProjection-APS.csv file is malformed and needs to be modified for this to work properly.

No differences between the sorted `pacta.scenario.preparation::weo_2022` and the new `weo_2022`:

waldo::compare(
  dplyr::arrange(pacta.scenario.preparation::weo_2022, source, scenario, scenario_geography, sector, technology),
  weo_2022,
  tolerance = 1e-15
)
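For readers without the two datasets at hand, the idea behind the comparison above (put the rows in a known order, then compare with a small numeric tolerance) can be sketched in base R. This is only an illustration: it uses `all.equal()` rather than `waldo::compare()`, and the data frames, column values, and the `sort_rows()` helper are all made up for the example.

```r
# Two toy data frames holding the same rows in a different order.
# All values are invented for illustration; they are not WEO 2022 data.
old <- data.frame(
  scenario = c("APS", "STEPS", "APS"),
  technology = c("Electric", "Electric", "Hybrid"),
  value = c(0.1, 0.2, 0.3),
  stringsAsFactors = FALSE
)
new <- old[c(3, 1, 2), ]
new$value <- new$value + 1e-17 # floating-point noise well below the tolerance

# Hypothetical helper: sort by the key columns and reset the row names,
# so that row order no longer affects the comparison.
sort_rows <- function(df) {
  out <- df[order(df$scenario, df$technology), ]
  rownames(out) <- NULL
  out
}

# Without sorting, the row order alone makes the comparison fail.
raw_match <- isTRUE(all.equal(old, new))

# After sorting, the two objects agree within the tolerance.
sorted_match <- isTRUE(all.equal(sort_rows(old), sort_rows(new), tolerance = 1e-15))
```

Sorting on the key columns first is what makes a "no differences" result meaningful when the two objects were built by different code paths.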

jdhoffa commented 5 months ago

> ⚠️ The raw_data_from_provider/used_in_pacta.scenario_preparation/IEA-EV-dataEV salesCarsProjection-APS.csv file is malformed and needs to be modified for this to work properly.

When you get a chance, could you elaborate a bit on this? Are @AlexAxthelm and @Antoine-Lalechere aware?

cjyetman commented 5 months ago

> ⚠️ The raw_data_from_provider/used_in_pacta.scenario_preparation/IEA-EV-dataEV salesCarsProjection-APS.csv file is malformed and needs to be modified for this to work properly.
>
> When you get a chance, could you elaborate a bit on this?

The file looks like this: the last column (`source`) is populated only on the header line and the first data row, so every later row has 8 fields instead of 9.

region,category,parameter,mode,powertrain,year,unit,value,source
China,Projection-APS,EV sales,Cars,BEV,2020,sales,930000,https://www.iea.org/reports/global-ev-outlook-2022/executive-summary
China,Projection-APS,EV sales,Cars,PHEV,2020,sales,230000
China,Projection-APS,EV sales,Cars,BEV,2021,sales,2700000
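
A minimal base R sketch of how the ragged rows quoted above can be detected and padded so that every row has the same number of fields. This is one possible repair, not necessarily the modification applied to the file on Azure (dropping the `source` column entirely would be another); the object names `csv_lines`, `padded`, and `ev_sales` are made up for the example.

```r
# The four lines quoted above, held in memory so the sketch is self-contained.
csv_lines <- c(
  "region,category,parameter,mode,powertrain,year,unit,value,source",
  "China,Projection-APS,EV sales,Cars,BEV,2020,sales,930000,https://www.iea.org/reports/global-ev-outlook-2022/executive-summary",
  "China,Projection-APS,EV sales,Cars,PHEV,2020,sales,230000",
  "China,Projection-APS,EV sales,Cars,BEV,2021,sales,2700000"
)

# Count the comma-separated fields on each line.
# (Safe here because no field in this file contains a quoted comma.)
n_fields <- lengths(strsplit(csv_lines, ",", fixed = TRUE))

# Pad the short lines with trailing commas up to the header's field count.
expected <- n_fields[1]
padded <- paste0(csv_lines, strrep(",", expected - n_fields))

# Read the repaired text; empty fields become NA.
ev_sales <- read.csv(text = padded, stringsAsFactors = FALSE, na.strings = "")
```

After padding, `source` is `NA` on every row except the first, which keeps the provenance URL without breaking the column count.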

> Are @AlexAxthelm and @Antoine-Lalechere aware?

No, but now they are?

jdhoffa commented 5 months ago

Great. And so I understand, does "needs to be modified for this to work" mean that you will fill in that value for the rest of the rows in the dataset?

Or are you saying you are blocked somehow until that raw file is "modified" to work?

Antoine-Lalechere commented 5 months ago

I added the `source` column to keep track of our files (previously they were stored on Azure). The IEA-EV-dataEV salesCarsProjection-APS.csv file was extracted from this source: https://www.iea.org/reports/global-ev-outlook-2022. The first 8 columns were downloaded from that page last year, and I added the 9th to keep track of it. I'll let you decide whether to remove the 9th column now that storage is more efficient.

jdhoffa commented 5 months ago

Thanks @Antoine-Lalechere!

cjyetman commented 5 months ago

> Great. And so I understand, does "needs to be modified for this to work" mean that you will fill in that value for the rest of the rows in the dataset?
>
> Or are you saying you are blocked somehow until that raw file is "modified" to work?

I'm currently using a modified version of that file to test the import that I'm creating. I expect that we will want to permanently modify the "raw" file on Azure.

Additionally, a handful of the "raw" files that I have been working with are currently locked up inside pactarawdata > scenario_sources > RawScenarioData_FromDropbox.zip. I expect that those directories will be moved into pactarawdata > scenario_sources on Azure.

Both expectations need to be met for all of this to work while pulling directly from Azure, at least as it stands now.

jdhoffa commented 5 months ago

FYI @AlexAxthelm: this is necessary to get the new scenario prep workflow running on Azure data for future 2022Q4 and 2023Q4 data prep runs.

Tracked here: https://dev.azure.com/RMI-PACTA/2DegreesInvesting/_workitems/edit/10608