Closed by grantbuster 1 year ago
Need to read in the daily MERRA file plus the next day's file here: https://github.com/NREL/nsrdb/blob/745d1c1e738f9f6ba442168fdede15c101084e99/nsrdb/data_model/merra.py#L136
Should be able to test this here with one more MERRA source file: https://github.com/NREL/nsrdb/blob/main/tests/test_data_model.py
@rolson2 It looks like the source data is accessed here - https://github.com/NREL/nsrdb/blob/52bae183ebe3c2749990560b7efad6d63d720428/nsrdb/data_model/data_model.py#L1251
and then temporally interpolated here - https://github.com/NREL/nsrdb/blob/52bae183ebe3c2749990560b7efad6d63d720428/nsrdb/data_model/data_model.py#L1275:#L1285.
So if the source_data property can be modified to return data that includes enough timesteps, the interpolation problem will be fixed.
There are definitely some nuances here but I think this is the high-level idea.
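To make the high-level idea concrete, here is a minimal sketch of source data that carries one timestep past the end of the current day. It is not the NSRDB implementation: the helper name `stack_with_next_day`, the column name `aod`, and the synthetic hourly frames are all hypothetical stand-ins for whatever the MERRA handler actually reads.

```python
import numpy as np
import pandas as pd


def stack_with_next_day(current, next_day=None):
    """Append the first timestep of the next day, if it is available.

    Hypothetical helper: `current` and `next_day` are time-indexed frames
    standing in for the contents of two consecutive daily source files.
    """
    if next_day is None or next_day.empty:
        # No next-day file (e.g. last day on disk): fall back to current day only
        return current
    return pd.concat([current, next_day.iloc[:1]])


# Synthetic stand-ins for two consecutive daily files with hourly data
day1 = pd.DataFrame({'aod': np.arange(24.0)},
                    index=pd.date_range('2020-01-01', periods=24, freq='1h'))
day2 = pd.DataFrame({'aod': np.arange(24.0, 48.0)},
                    index=pd.date_range('2020-01-02', periods=24, freq='1h'))

source = stack_with_next_day(day1, day2)
print(source.tail(3))
```

Only the first timestep of the next day is kept here, on the assumption that a single bracketing value is enough for linear interpolation; carrying the whole next day would work too, at the cost of more memory.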
@rolson2 so yeah you're going to have to make the MERRA data handler class pull the current day AND the next day (if available).
The current temporal lin class won't work with this as designed right now. The reindex() method is pretty aggressive and will drop all data from the second day. Here's an example of how to fix that:
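import pandas as pd
import numpy as np

# Native hourly index spans two days; the target 15-min index covers day one only.
# Note: `inclusive` replaces the `closed` argument of date_range (pandas >= 1.4).
ti_native = pd.date_range('20200101', '20200103', freq='1h', inclusive='left')
ti_new = pd.date_range('20200101', '20200102', freq='15min', inclusive='left')
data_native = np.arange(len(ti_native))

# Problem: reindex() alone drops all of day two, so the last timestep
# (2020-01-01 23:45:00) has no later value to interpolate toward
df = pd.DataFrame(data_native, index=ti_native).reindex(ti_new)
print(df)

# Fix: outer-merge onto the target index so the day-two native timestamps survive
df = pd.DataFrame(index=ti_new).merge(pd.DataFrame(data_native, index=ti_native), left_index=True, right_index=True, how='outer')
print(df)
print(df.iloc[90:100])

# Interpolate in time over the merged index, then trim back to the target index
df = df.interpolate('time').ffill().bfill().reindex(ti_new)
arr = df.values
print(df.iloc[90:100])
print(df)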
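The same fix can be wrapped in a small function, which may be easier to drop into an interpolation class. This is a sketch under assumptions, not the NSRDB code: the function name `interp_to_index` and the column name `value` are made up for illustration, and an index union is used in place of the outer merge.

```python
import numpy as np
import pandas as pd


def interp_to_index(native, ti_new):
    """Time-interpolate `native` onto `ti_new`.

    Native timestamps outside `ti_new` are kept until after interpolation,
    so values near the end of `ti_new` are bracketed by real data.
    """
    union = native.index.union(ti_new)
    out = native.reindex(union).interpolate('time').ffill().bfill()
    return out.reindex(ti_new)


# Two days of synthetic hourly data, interpolated to 15 min for day one only
ti_native = pd.date_range('2020-01-01', periods=48, freq='1h')
ti_new = pd.date_range('2020-01-01', periods=96, freq='15min')
native = pd.DataFrame({'value': np.arange(48.0)}, index=ti_native)

result = interp_to_index(native, ti_new)

# For comparison: reindexing first discards day two, so the tail goes flat
naive = native.reindex(ti_new).interpolate('time').ffill()

print(result.tail(4))
print(naive.tail(4))
```

With this synthetic ramp, the final 15-minute value is 23.75 when the next day's data is kept and 23.0 when it is dropped first.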
Bug Description
Several ancillary variables are retrieved from MERRA and interpolated on a daily basis. Data at the end of the day is forward-filled, which leaves some timesteps with constant values. Correct behavior would be to either linearly extrapolate or (ideally) retrieve the MERRA timesteps adjacent to the current day so there is data to interpolate toward.
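For the first option (linear extrapolation), a rough sketch is shown here for the case where no next-day data is available. Everything in it is an assumption for illustration: the function name `interp_with_linear_tail` is hypothetical, the data is synthetic, and the tail simply continues along the slope of the last two native points.

```python
import numpy as np
import pandas as pd


def interp_with_linear_tail(native, ti_new):
    """Interpolate a time-indexed Series onto `ti_new`; past the last native
    timestamp, continue along the slope of the final two native points
    instead of forward-filling."""
    t0 = native.index[0]
    x_nat = (native.index - t0).total_seconds().to_numpy()
    x_new = (ti_new - t0).total_seconds().to_numpy()
    y_nat = native.to_numpy(dtype=float)

    # np.interp holds the edge value constant outside the native range
    y_new = np.interp(x_new, x_nat, y_nat)

    # Overwrite the constant tail with a linear extrapolation
    slope = (y_nat[-1] - y_nat[-2]) / (x_nat[-1] - x_nat[-2])
    tail = x_new > x_nat[-1]
    y_new[tail] = y_nat[-1] + slope * (x_new[tail] - x_nat[-1])
    return pd.Series(y_new, index=ti_new, name=native.name)


# One day of synthetic hourly data, interpolated to 15 min
native = pd.Series(np.arange(24.0), name='value',
                   index=pd.date_range('2020-01-01', periods=24, freq='1h'))
ti_new = pd.date_range('2020-01-01', periods=96, freq='15min')

out = interp_with_linear_tail(native, ti_new)
print(out.tail(4))
```

Extrapolation can overshoot physical limits (for example, produce negative values for a non-negative variable), so interpolating toward a real next-day timestep remains preferable whenever that data exists.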
Charge code SETP 10304 71.01.01