Open iantei opened 3 weeks ago
I am not sure what you mean by "there is no energy emission available". We do in fact compute the energy consumed in the public dashboard.
Yes, we compute the energy consumed in the public dashboard, but we don't display the energy consumption for programs that use custom labels.
We have the "Timeseries of energy" metric available for a study/program like nrel-commute, which uses default labels, i.e. does not have `label_options`. However, for a study/program like usaid-loas-ev-openpath, which uses custom labels, we are not listing the "Timeseries of energy" metric, since we just have a richMode like {"value":"walk", "baseMode":"WALKING", "met_equivalent":"WALKING", "kgCo2PerKm": 0}, which does not have the information needed to calculate energy in kWH.
The current computation of footprint, i.e. CO2 and energy emission, in the public dashboard makes use of the `distance` parameter, while the computation of footprint in `e-mission-common` requires `trip` as a parameter: `calc_footprint_for_trip(trip, mode_label_option)` (source code). I am trying to understand how we can pass `trip` as a parameter instead of `distance`, which is a column in the dataframe.

    def CO2_footprint_default(df, distance, col):
        """ Inputs:
        df = dataframe with data
        distance = distance in miles
        col = Replaced_mode or Mode_confirm
        """
        conversion_lb_to_kilogram = 0.453592 # 1 lb = 0.453592 kg
        conditions_col = [(df[col+'_fuel'] == 'gasoline'),
                          (df[col+'_fuel'] == 'diesel'),
                          (df[col+'_fuel'] == 'electric')]
        gasoline_col = (df[distance]*df['ei_'+col]*0.000001)* df['CO2_'+col]
        diesel_col = (df[distance]*df['ei_'+col]*0.000001)* df['CO2_'+col]
        electric_col = (((df[distance]*df['ei_'+col])+df['ei_trip_'+col])*0.001)*df['CO2_'+col]
        values_col = [gasoline_col,diesel_col,electric_col]
        df[col+'_lb_CO2'] = np.select(conditions_col, values_col)
        df[col+'_kg_CO2'] = df[col+'_lb_CO2'] * conversion_lb_to_kilogram
        return df

For the default label mapping, we are dependent on `energy_intensity.csv` and `mode_labels.csv`, neither of which has the `baseMode` needed for the required second parameter. Since https://github.com/JGreenlee/e-mission-common/blob/master/src/emcommon/resources/label-options.default.json has been added to the e-mission-common repo, would it be a good idea to use this label-option even when no label-option is specified for the program/study in the config file?
We have trip information available in the column of the data frame.
Maybe we can create a dictionary in the required parameter format, and pass it into e-mission-common for footprint calculations. Sample trip format from `test_footprint_calculations`:

    fake_trip = {
        'distance': 10000,
        'start_fmt_time': '2022-01-01',
        'start_loc': {'coordinates': [-74.006, 40.7128]}
    }

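A minimal sketch of that idea, assuming each dataframe row can be read as a plain dict (for example via `df.to_dict("records")`). The input column names `distance_meters`, `start_lon` and `start_lat` are hypothetical, and the unit of `distance` is an assumption; only the output keys come from the sample trip.

```python
from typing import Any, Dict, List


def row_to_trip(row: Dict[str, Any]) -> Dict[str, Any]:
    """Build a trip dict shaped like the sample trip from one dataframe row.

    The input column names are hypothetical; adjust them to the real dataframe.
    """
    return {
        "distance": row["distance_meters"],  # assumed to be in meters
        "start_fmt_time": row["start_fmt_time"],
        # The sample trip appears to list coordinates as [longitude, latitude].
        "start_loc": {"coordinates": [row["start_lon"], row["start_lat"]]},
    }


# Stand-in for df.to_dict("records")
rows: List[Dict[str, Any]] = [
    {"distance_meters": 10000, "start_fmt_time": "2022-01-01",
     "start_lon": -74.006, "start_lat": 40.7128},
]
trips = [row_to_trip(r) for r in rows]
```

Each element of `trips` could then be passed as the first argument of `calc_footprint_for_trip`.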
Trying to integrate `emcommon.metrics.footprint.footprint_calculations` with the following changes in the `environment26.dashboard.additions.yml` file:

    ...
    dependencies:
    - pip:
      ...
      - git+https://github.com/JGreenlee/e-mission-common@master

Got the below issue -
---> 73 async def get_egrid_region(coords: list[float, float], year: int) -> str | None:
74 """
75 Get the eGRID region at the given coordinates in the year.
76 """
77 if year < 2018:
TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'
This is likely because the `str | None` annotation syntax requires Python 3.10, while the dashboard still uses Python 3.9.
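For illustration, the same return type can be written with `typing.Optional`, which works on both Python 3.9 and 3.10. The function below is a hypothetical stand-in with a placeholder body, not the real `get_egrid_region`.

```python
import asyncio
from typing import List, Optional


# On Python 3.9, evaluating `str | None` in an annotation raises
# TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'.
# Optional[str] expresses the same type and works on 3.9 as well.
async def lookup_region_stub(coords: List[float], year: int) -> Optional[str]:
    """Hypothetical stand-in with a placeholder body."""
    if year < 2018:
        return None
    return "EXAMPLE_REGION"  # placeholder value, not a real eGRID region code


# In a plain script; inside a notebook, use `await lookup_region_stub(...)`.
old_result = asyncio.run(lookup_region_stub([-74.006, 40.7128], 2017))
new_result = asyncio.run(lookup_region_stub([-74.006, 40.7128], 2022))
```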
And while trying to use `e-mission-common@0.5.5`, I got the following error -
File ~/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/emcommon/metrics/footprint/footprint_calculations.py:63, in calc_footprint_for_trip(trip, mode_label_option)
61 mode_footprint = rich_mode['footprint']
62 if 'transit' in mode_footprint:
---> 63 mode_footprint = get_mode_footprint_for_transit(trip, mode_footprint['transit'])
64 kwh_total = 0
65 kg_co2_total = 0
NameError: name 'get_mode_footprint_for_transit' is not defined
This is strange because I pinned the previous tag, i.e. 0.5.5, which still has the function defined as `get_mode_footprint_for_transit()`, while master makes use of `get_transit_intensities_for_trip()`.
While this gets fixed, I will explore how to get access to the trip data and `baseMode`, which are the required parameters of the function `calc_footprint_for_trip`.
@JGreenlee
Instead of using JGreenlee's `@master` of e-mission-common, I tried git+https://github.com/louisg1337/e-mission-common@master, which resolved the issue of `TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'`.
I incorporated the following code from the https://github.com/JGreenlee/e-mission-common/blob/master/test/metrics/test_footprint_calculations.py test into my Jupyter notebook:

    import emcommon.metrics.footprint.footprint_calculations as emffc

    fake_trip = {
        'distance': 10000,
        'start_fmt_time': '2022-01-01',
        'start_loc': {'coordinates': [-74.006, 40.7128]}
    }
    fake_mode = {'base_mode': 'BUS'}
    # top-level await works in a Jupyter cell
    footprint_energy, footprint_co2 = await emffc.calc_footprint_for_trip(fake_trip, fake_mode)

I am getting the below issue -
get_transit_intensities_for_uace(year, uace, modes, metadata):
---> 43 actual_year = intensities_data['metadata']['year']
TypeError: 'NoneType' object is not subscriptable
It seems to look up data for years earlier than 2022, and eventually fails after reaching 2018. Is there any issue with my approach, or should there be better error handling on the calculations side?
```
DEBUG:root:Getting footprint for trip: {'distance': 10000, 'start_fmt_time': '2022-01-01', 'start_loc': {'coordinates': [-74.006, 40.7128]}}, with mode option: {'base_mode': 'BUS'}
DEBUG:root:Getting rich mode for label_option: {'base_mode': 'BUS'}
DEBUG:root:Rich mode: {'icon': 'bus-side', 'color': '#9240a4', 'met': {'ALL': {'range': [0, inf]}}, 'footprint': {'transit': ['MB', 'RB', 'CB']}}
DEBUG:root:Getting mode footprint for transit modes ['MB', 'RB', 'CB'] in trip: {'distance': 10000, 'start_fmt_time': '2022-01-01', 'start_loc': {'coordinates': [-74.006, 40.7128]}}
DEBUG:root:Getting mode footprint for transit modes ['MB', 'RB', 'CB'] in year 2022 and coords [-74.006, 40.7128]
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): geocoding.geo.census.gov:443
DEBUG:urllib3.connectionpool:https://geocoding.geo.census.gov:443 "GET /geocoder/geographies/coordinates?x=-74.006&y=40.7128&benchmark=Public_AR_Current&vintage=Census2020_Current&layers=87&format=json HTTP/1.1" 200 4978
DEBUG:root:Getting mode footprint for transit modes ['MB', 'RB', 'CB'] in year 2022 and UACE 63217
WARNING:root:ntd data not available for 2022. Trying 2021.
WARNING:root:ntd data not available for 2021. Trying 2020.
WARNING:root:ntd data not available for 2020. Trying 2019.
WARNING:root:ntd data not available for 2019. Trying 2018.
ERROR:root:eGRID lookup failed for 2018.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 8
      2 fake_trip = {
      3     'distance': 10000,
      4     'start_fmt_time': '2022-01-01',
      5     'start_loc': {'coordinates': [-74.006, 40.7128]}
      6 }
      7 fake_mode = {'base_mode': 'BUS'}
----> 8 footprint_energy, footprint_co2 = await emffc.calc_footprint_for_trip(fake_trip, fake_mode)
      9 print(f"\n {footprint_energy}, {footprint_co2} \n")

File ~/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/emcommon/metrics/footprint/footprint_calculations.py:44, in calc_footprint_for_trip(trip, mode_label_option)
     42 mode_footprint = dict(rich_mode['footprint'])
     43 if 'transit' in mode_footprint:
---> 44     (mode_footprint, transit_metadata) = await emcmft.get_transit_intensities_for_trip(trip, mode_footprint['transit'])
     45     merge_metadatas(metadata, transit_metadata)
     46 kwh_total = 0

File ~/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/emcommon/metrics/footprint/transit.py:22, in get_transit_intensities_for_trip(trip, modes)
     20 year = util.year_of_trip(trip)
     21 coords = trip["start_loc"]["coordinates"]
---> 22 return await get_transit_intensities_for_coords(year, coords, modes)

File ~/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/emcommon/metrics/footprint/transit.py:30, in get_transit_intensities_for_coords(year, coords, modes, metadata)
     28 metadata.update({'requested_coords': coords})
     29 uace_code = await util.get_uace_by_coords(coords, year)
---> 30 return await get_transit_intensities_for_uace(year, uace_code, modes, metadata)

File ~/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/emcommon/metrics/footprint/transit.py:43, in get_transit_intensities_for_uace(year, uace, modes, metadata)
     40 Log.debug(
     41     f"Getting mode footprint for transit modes {modes} in year {year} and UACE {uace}")
     42 intensities_data = await util.get_intensities_data(year, 'ntd')
---> 43 actual_year = intensities_data['metadata']['year']
     44 metadata.update({
     45     "data_sources": [f"ntd{actual_year}"],
     46     "data_source_urls": intensities_data['metadata']['data_source_urls'],
   (...)
     51     "ntd_ids": [],
     52 })
     54 total_upt = 0

TypeError: 'NoneType' object is not subscriptable
```
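On the error-handling question, one option is to fail with a descriptive message when no data can be loaded, instead of letting `None` reach the subscript. The sketch below is hypothetical: `load_with_fallback`, `load_intensities` and `MIN_YEAR` are illustrative names and do not describe how e-mission-common is actually implemented.

```python
from typing import Callable, Optional

MIN_YEAR = 2018  # assumed earliest year; the log above stops at 2018


def load_with_fallback(year: int,
                       load_intensities: Callable[[int], Optional[dict]]) -> dict:
    """Try `year`, then earlier years, and raise a clear error if none load."""
    for candidate in range(year, MIN_YEAR - 1, -1):
        data = load_intensities(candidate)
        if data is not None:
            return data
    raise FileNotFoundError(
        f"No intensities data found for {year} or any year back to {MIN_YEAR}; "
        "check that the data files are installed with the package."
    )


# Usage with a fake loader that only has data for 2020.
fake_store = {2020: {"metadata": {"year": 2020}}}
data = load_with_fallback(2022, fake_store.get)
```

With an empty store the same call raises `FileNotFoundError`, which points at missing data files rather than at a `NoneType` subscript.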
I have located the issue. It is because only *.py files are being included when emcommon is bundled as a package. Therefore, the `resources` folder and all its .json files are missing.
I think I need to adjust the `pyproject.toml`.
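One way to confirm this diagnosis is to list which non-Python files were actually installed with a package. This is a minimal sketch using only the standard library; the helper name is hypothetical, and it assumes the package is installed as a regular directory on disk.

```python
import importlib.util
from pathlib import Path


def find_packaged_files(package: str, pattern: str = "**/*.json") -> list:
    """Return paths (relative to the package root) of installed files matching pattern."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []  # package not installed, or not a package directory
    found = []
    for location in spec.submodule_search_locations:
        root = Path(location)
        found.extend(sorted(str(p.relative_to(root)) for p in root.glob(pattern)))
    return found


# An empty list here would mean no .json resources were bundled
# (or that the package is not installed at all).
json_files = find_packaged_files("emcommon")
```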
Update:
`calc_footprint_for_trip(trip, mode)` is an async function.
Approaches tried to call this async function:
1) Called `await calc_footprint_for_trip(trip, mode)` directly from the Jupyter notebook, which works perfectly fine.

We use the footprint calculation in the `energy_calculations.ipynb` notebook. It uses the function `add_energy_impact()` in `scaffolding.py`, which is a synchronous function. We need to call `calc_footprint_for_trip(trip, mode)` from there.
2) Could not use `await calc_footprint_for_trip(trip, mode)` from within `add_energy_impact()`, because it gives an error that `await` is only allowed within an async function.
3) Could not use `asyncio.run(calc_footprint_for_trip(trip, mode))`, because it gives the error: asyncio.run() cannot be called from a running event loop.
4) Changed `add_energy_impact()` to an async function, and used `await` both to call `add_energy_impact()` from the Jupyter notebook and to call `calc_footprint_for_trip(trip, mode)` from within `add_energy_impact()`. This way we can call the async function `calc_footprint_for_trip(trip, mode)`.
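The arrangement that works, and the `asyncio.run()` failure, can be sketched as follows. `fake_calc_footprint_for_trip` is a hypothetical stand-in for `calc_footprint_for_trip` that returns made-up numbers, and the body of `add_energy_impact` is illustrative only, not the real code in `scaffolding.py`.

```python
import asyncio


async def fake_calc_footprint_for_trip(trip: dict, mode: dict):
    """Hypothetical stand-in; returns made-up (kWh, kg CO2) values."""
    await asyncio.sleep(0)  # yield to the event loop like a real async call
    km = trip["distance"] / 1000
    return km * 0.1, km * 0.05


async def add_energy_impact(trips: list, mode: dict) -> list:
    """Async version: `await` is legal here because the function is async."""
    results = []
    for trip in trips:
        results.append(await fake_calc_footprint_for_trip(trip, mode))
    return results


async def demo_running_loop_error() -> str:
    """Show why asyncio.run() fails when an event loop is already running."""
    coro = fake_calc_footprint_for_trip({"distance": 1000}, {})
    try:
        asyncio.run(coro)
    except RuntimeError as err:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(err)
    return "no error"


# In a Jupyter cell, which already runs an event loop, call it with:
#     results = await add_energy_impact(trips, mode)
# In a plain script there is no running loop, so asyncio.run() is used instead.
results = asyncio.run(add_energy_impact([{"distance": 10000}], {"base_mode": "BUS"}))
loop_error = asyncio.run(demo_running_loop_error())
```

The trade-off is that every caller of `add_energy_impact()` must itself be async or run inside an event loop, which is already true for notebook cells.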
Is there any concern with this approach?
As discussed, changing `add_energy_impact()` to an async function makes it convenient to use `await` when calling it from the `energy_calculations` notebook, and this approach looks good.
Next, I want to explore how to figure out the `baseMode` associated with a particular mode of commute.
We currently have `baseMode` available only for the list of Mode, and not for Replaced Mode. However, when we compute the energy and CO2 footprint, we calculate the energy impact with `df['Energy_Impact(kWH)'] = round((df['Replaced_mode_EI(kWH)'] - df['Mode_confirm_EI(kWH)']),3)`, and likewise for CO2_Impact.
Although the lists of keys in Mode and Replaced Mode are usually identical, that is not always the case.
Therefore, we need `baseMode` to be available for Replaced Mode as well, so that we can compute Energy_Impact and CO2_Impact for Replaced Mode too.
> Even though the list of keys in Mode and Replaced Mode are identical, that's not always the case.

In what instances are there a Replaced Mode that does not have a Mode by the same key?
I thought that Replaced Modes were always a subset of Modes

> In what instances are there a Replaced Mode that does not have a Mode by the same key? I thought that Replaced Modes were always a subset of Modes

You're correct! There is only a Replaced Mode - No_travel which is different from the list of Mode. No_travel does not need computation of footprint. This should be fine.
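A small sketch of the impact calculation with `No_travel` handled explicitly. Treating `No_travel` as a zero footprint is an assumption made here for illustration, and the mode keys and kWh values in `footprint_by_mode` are made up.

```python
def energy_impact(replaced_mode: str, confirmed_mode: str,
                  footprint_by_mode: dict) -> float:
    """Replaced-mode energy minus confirmed-mode energy, rounded to 3 places."""
    # Assumption: a replaced mode of No_travel contributes zero energy.
    if replaced_mode == "No_travel":
        replaced = 0.0
    else:
        replaced = footprint_by_mode[replaced_mode]
    confirmed = footprint_by_mode[confirmed_mode]
    return round(replaced - confirmed, 3)


# Made-up kWh values per trip, for illustration only.
footprint_by_mode = {"drove_alone": 1.5, "e-bike": 0.25}
saved = energy_impact("drove_alone", "e-bike", footprint_by_mode)
induced = energy_impact("No_travel", "e-bike", footprint_by_mode)
```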
One idea to cut down on the wait times to map from `mode_confirm` to `baseMode`: we could extract the mapping once (get the unique `mode_confirm` list and generate a local mapping) and then apply the local mapping to the whole dataframe synchronously. That way we only wait on the call to `emcommon` once for each `mode_confirm` value, not once for every row (could be 1000s).
@Abby-Wheelis I think you'd posted a discussion note here. I am unable to see it.
re-writing from memory since GitHub seems to have eaten what I wrote yesterday, @iantei feel free to add if you remember any additional points
There are two general approaches that we could take here:
1) use the list of trips
AIR
regardless
`asyncio.gather()` to speed up the iteration while applying the async footprint lookup

Both @iantei and I are leaning towards option 2 at this point, but @shankari do you have any additional thoughts?
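For reference, a minimal sketch of how `asyncio.gather()` can run the async lookups for a list of trips concurrently instead of awaiting them one by one. `fake_footprint` is a hypothetical stand-in for the real footprint call.

```python
import asyncio


async def fake_footprint(trip: dict) -> float:
    """Hypothetical stand-in for an async footprint lookup."""
    await asyncio.sleep(0.01)  # simulate waiting on I/O
    return trip["distance"] / 1000


async def footprints_for_trips(trips: list) -> list:
    # gather() schedules all lookups at once and returns results in input order.
    return await asyncio.gather(*(fake_footprint(t) for t in trips))


trips = [{"distance": 1000}, {"distance": 2500}, {"distance": 400}]
# In a notebook: results = await footprints_for_trips(trips)
results = asyncio.run(footprints_for_trips(trips))
```

Because the results come back in input order, they can be assigned straight back onto the trips or onto a dataframe column.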
some pseudocode for my "local copy of base mode mapping" idea:

    mapping = {}
    for mode in expanded_ct.mode_confirm.unique():
        mapping[mode] = await lookup_base_mode(mode)

which can then be used with `.apply()` to add the base_mode to the df quickly, and means we only `await` once per unique mode, and not once per row.
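A runnable version of this idea, with the per-mode lookups also issued concurrently. `lookup_base_mode` is a hypothetical stand-in (the pseudocode does not say which emcommon call it wraps), and a plain list stands in for the `mode_confirm` column.

```python
import asyncio

LOOKUP_CALLS = 0  # counts awaited lookups, for demonstration only


async def lookup_base_mode(mode: str) -> str:
    """Hypothetical stand-in for the async base-mode lookup."""
    global LOOKUP_CALLS
    LOOKUP_CALLS += 1
    await asyncio.sleep(0)
    return {"walk": "WALKING", "bus": "BUS"}.get(mode, "OTHER")


async def build_base_mode_mapping(modes: list) -> dict:
    """Await one lookup per unique mode and return a mode -> base mode dict."""
    unique_modes = sorted(set(modes))
    base_modes = await asyncio.gather(*(lookup_base_mode(m) for m in unique_modes))
    return dict(zip(unique_modes, base_modes))


mode_confirm = ["walk", "bus", "walk", "walk", "bus"]  # stand-in for a column
# In a notebook: mapping = await build_base_mode_mapping(mode_confirm)
mapping = asyncio.run(build_base_mode_mapping(mode_confirm))
# Synchronous step; with pandas this would be df["mode_confirm"].map(mapping).
base_modes = [mapping[m] for m in mode_confirm]
```

Five rows trigger only two lookups here, so the number of awaits grows with the number of distinct labels rather than the number of trips.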
@iantei and @Abby-Wheelis I think we discussed this in an earlier team meeting. I think we should go with (1).
To address your points:
`_to_data_df` to create the dataframe. Please see `emission/storage/timeseries/builtin_timeseries.py` to understand how the interfaces work under the hood. And although `apply` is a dataframe method, it essentially iterates over the rows under the hood; it is not a highly efficient vectorized operation.
What am I missing here?
Either you have to convert trips -> dataframe -> trips or trips -> dataframe. The second seems strictly better since there are fewer conversions
Particularly for this reason and the other points you made I think it does make more sense to use the list, and after @iantei and I had poked through the server code together earlier this week, I think I see a relatively clear path to doing so. I'll move forward with implementing the data gathering piece while @iantei wraps up the other open PRs (#148, #145, #150), and then plan to pass it back off for the visualization piece!
[label_options](https://github.com/e-mission/nrel-openpath-deploy-configs/tree/main/label_options) to extract the CO2 emission calculations while there is no energy emission available.