Open mooneyme opened 2 years ago
Great, I can handle this short-term fix by adding the missing rows to the parquet files.
The lookup with the short-term fix is now on Eagle at /shared-projects/dsgrid/tempo.lkup.parquet. The long-term fix will be addressed in issue #28.
Let's leave this open as a reminder that the long-term fix is still needed.
Update: the data is "missing" only for the smallest ~100 counties (~3% of counties), and only in the AEO Reference Case (no problems in EFS and LDV2035). I'm filling the lookups for these counties with id = NA/null. These counties total 50000 households, 0.03% of the US total. The cause is that, with the lower sampling rate and low EV adoption, the stochastic simulation often does not pick up any EVs. In the future, I will look into increasing the sample rate so that a small number of EVs will show up, but for now the load is configured to be ~0 in these counties.
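To illustrate why small counties can end up with no EVs at all, here is a rough sketch. It assumes each sampled household independently has an EV with a fixed probability, which is a simplification and not necessarily how TEMPO samples. The function name and the numbers are hypothetical, chosen only for illustration.

```python
def prob_no_ev_sampled(n_sampled_households: int, ev_adoption_rate: float) -> float:
    """Probability that none of n independently sampled households has an EV.

    Assumes independent draws with a constant adoption rate (an assumption
    made for this sketch, not TEMPO's actual sampling scheme).
    """
    return (1.0 - ev_adoption_rate) ** n_sampled_households


# Illustrative values only, not TEMPO's real sample sizes or adoption rates.
for n in (10, 50, 500):
    print(n, prob_no_ev_sampled(n, ev_adoption_rate=0.01))
```

Under these assumptions, the chance of sampling zero EVs shrinks as the number of sampled households grows, which is why a higher sample rate should make a few EVs show up even in the smallest counties.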
TEMPO is currently missing ~100 counties. These counties make up <1% of the population/households (<1.3 million households) and are scattered across the US.
This is most likely related to a bug in TEMPO. @ahcyip said "Last I remember, probably some bad join in Julia somewhere causing TEMPO to error out because some bin was blank". Apparently this was on Brian's radar, but they never got around to fixing it.
Short-term fix: We apply a post-processing fix on the load_data_lookup.parquet to add these missing counties (and their combinations of other dimensions) with NULL data_id values.
Long-term fix: @ahcyip or someone else on the TEMPO team is going to update the next version of the data handoff with this county bug fix resolved.
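The short-term fix could be sketched roughly as follows. This is a minimal illustration, not the actual post-processing script. It works on plain Python rows instead of reading and writing the parquet file, and the column names (`county`, `scenario`, `id`), the function name, and the sample values are all assumptions made for the example.

```python
from itertools import product


def fill_missing_lookup_rows(lookup_rows, all_counties, other_dims):
    """Return the lookup rows plus one row with id=None for every
    (county, other-dimension) combination that is not already present.

    lookup_rows:  list of dicts, one per existing lookup row
    all_counties: every county ID the lookup is expected to cover
    other_dims:   dict mapping each other dimension name to its allowed values
    """
    dim_names = list(other_dims)
    existing = {(r["county"], *(r[d] for d in dim_names)) for r in lookup_rows}
    filled = list(lookup_rows)
    for county in all_counties:
        for combo in product(*(other_dims[d] for d in dim_names)):
            if (county, *combo) not in existing:
                # A null id marks "no load data for this combination".
                filled.append({"county": county, **dict(zip(dim_names, combo)), "id": None})
    return filled


# Hypothetical data: county "B" is missing from the lookup.
lookup = [{"county": "A", "scenario": "reference", "id": 1}]
filled = fill_missing_lookup_rows(lookup, ["A", "B"], {"scenario": ["reference"]})
print(filled)
```

Existing rows are left untouched, so rerunning the fill on an already-filled lookup adds nothing further.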