NA value in column produces size mismatch error in calculating rolling mean

stephen-frank commented 1 year ago

Context:

Docker image built from _hpcrun, which is a few commits ahead of develop
Running predict with Janghyun's "v10_small" models FTLB_FTLBCHWMeterCHWEnergyRate_r1, FTLB_FTLBHWMeterHWEnergyRate_r1, and FTLB_FTLBMainRealPowerTotal_r1
Running the date range 2023-05-16 through 2023-05-23, i.e. one week of data
Data contain (2) NA values for snow depth: one on 5/17 and one on 5/18

Issue:

When I run predict for this date range with all (3) models, I get the following error (essentially the same error for each model):

axon::EvalErr: Func failed: pyEval(PySession py,Str stmt); args: (PyMgrSession,Str)
  sys::IOErr: Python failed: operands could not be broadcast together with shapes (697,78) (697,79) 
Traceback (most recent call last):
  File "/usr/src/app/hxpy/hxpy.py", line 67, in run
    self._exec(instr, local_vars)
  File "/usr/src/app/hxpy/hxpy.py", line 95, in _exec
    return exec(code, local_vars, local_vars)
  File "<string>", line 1, in <module>
  File "/wattile/wattile/buildings_processing.py", line 512, in prep_for_rnn
    data = _preprocess_data(configs, data)
  File "/wattile/wattile/buildings_processing.py", line 486, in _preprocess_data
    data = roll_data(data, configs)
  File "/wattile/wattile/buildings_processing.py", line 571, in roll_data
    means.loc[:, :] = sums.values / counts.values
ValueError: operands could not be broadcast together with shapes (697,78) (697,79) 
 [proj_wattile::wattilePythonModelPredict:148]

The most useful part of this error message is the last 4 lines, which show a one-column mismatch in array shape between sums.values (78 columns) and counts.values (79 columns) and also point to these lines as the culprit: https://github.com/NREL/Wattile/blob/hpc_run/wattile/buildings_processing.py#L567-L571
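The ValueError itself can be reproduced in isolation with NumPy, which may help confirm that the failure is purely a shape problem and not specific to Wattile. This is a minimal sketch: the shapes are copied from the traceback, while the array contents are placeholders and the variable names merely echo those in roll_data.

```python
import numpy as np

# Shapes copied from the traceback; the values are placeholders.
sums = np.ones((697, 78))
counts = np.ones((697, 79))

message = None
try:
    means = sums / counts
except ValueError as err:
    # Element-wise division needs matching (or size-1) trailing dimensions.
    message = str(err)
    print(message)

# The row counts agree; only the column counts differ, by exactly one.
print(sums.shape[0] == counts.shape[0], counts.shape[1] - sums.shape[1])
```

A difference of exactly one column suggests that one column is present in one intermediate result and missing from the other.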

I traced this back to the (2) NA values for snow depth. If I modify my SkySpark function to scrub out NA values prior to passing data to Python, I do not get the error.

I think this may have something to do with how NA values are encoded in Pandas when passed from SkySpark to Python using hxpy. They may not be getting properly scrubbed as NA or NaN. I have data dumps from Python in both CSV and Pickle formats immediately prior to calling the models. I will send those via email for troubleshooting.

stephen-frank commented 1 year ago

Emailed data files to @JanghyunJK and @haneslinger. Here is where the data files are exported in my SkySpark code:

  // Temporary: Persist data to files for troubleshooting prior to prep_for_rnn and predict
  session
    .pyExec("debug_data_csv = pathlib.Path(model_dir) / 'debug_data.csv'")
    .pyExec("predictor_data_frame.to_csv(debug_data_csv, encoding='utf-8')")
    .pyExec("debug_data_pickle = pathlib.Path(model_dir) / 'debug_data.pickle'")
    .pyExec("predictor_data_frame.to_pickle(debug_data_pickle)")

  // Prep data and run prediction
  session
    .pyExec("_, val_df = prep_for_rnn(configs, predictor_data_frame)") // Error occurs here
    .pyExec("results = model.predict(val_df)")

stephen-frank commented 1 year ago

@haneslinger and I determined this might stem from Pandas not being able to handle the hxpy encoding of NA, hxpy.haystack.na.NA. This may need to be converted to pandas.NA or some other more friendly "NA" data type. I'm going to take a crack at doing this on the SkySpark side first.
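To illustrate how a foreign NA object could plausibly lead to a one-column discrepancy, the sketch below uses a hypothetical FakeNA class as a stand-in for hxpy.haystack.na.NA (hxpy is not imported here). It only shows that a single such object turns an otherwise numeric column into object dtype, so any step that keeps numeric columns only sees one column fewer. Whether roll_data diverges in exactly this way is an assumption, not something the thread confirms.

```python
import pandas as pd


class FakeNA:
    """Hypothetical stand-in for hxpy.haystack.na.NA (an assumption)."""


NA = FakeNA()

df = pd.DataFrame(
    {
        "temp": [10.0, 11.0, 12.0],
        "snow_depth": [0.0, NA, 0.0],  # one non-numeric sentinel value
    }
)

# The sentinel forces the whole column to the generic object dtype.
print(df.dtypes)

# Any numeric-only view of the frame now has one column fewer.
numeric_only = df.select_dtypes(include="number")
print(df.shape, numeric_only.shape)
```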

stephen-frank commented 1 year ago

@haneslinger I added the following lines to SkySpark. Per testing in a smaller function, I believe they are correctly converting values of type hxpy.haystack.na.NA to pandas.NA.

  // NA type conversion
  session
    .pyExec("notNA = predictor_data_frame.applymap(lambda v: not isinstance(v, hxpy.haystack.na.NA))")
    .pyExec("predictor_data_frame = predictor_data_frame.where(notNA, pandas.NA)")

The syntax of where() is a bit screwy; basically, if the element-wise condition is True it keeps the original value, and if the condition is False it replaces the original value with the alternate value specified.
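A small standalone example of the where() semantics described above, using made-up integer data. DataFrame.mask() is included for contrast, since it is the inverse operation and replaces values where the condition is True.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
is_even = df % 2 == 0

# where(): keep values where the condition is True, replace the rest.
kept_even = df.where(is_even, -1)

# mask(): the inverse; replace values where the condition is True.
replaced_even = df.mask(is_even, -1)

print(kept_even)
print(replaced_even)
```

Read this way, where(notNA, pandas.NA) keeps every value that is not an NA object and substitutes pandas.NA for the rest.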

After this test I still get the error. Running for 2023-05-17 specifically:

axon::EvalErr: Func failed: pyEval(PySession py,Str stmt); args: (PyMgrSession,Str)
  sys::IOErr: Python failed: operands could not be broadcast together with shapes (121,78) (121,79) 
Traceback (most recent call last):
  File "/usr/src/app/hxpy/hxpy.py", line 67, in run
    self._exec(instr, local_vars)
  File "/usr/src/app/hxpy/hxpy.py", line 95, in _exec
    return exec(code, local_vars, local_vars)
  File "<string>", line 1, in <module>
  File "/wattile/wattile/buildings_processing.py", line 512, in prep_for_rnn
    data = _preprocess_data(configs, data)
  File "/wattile/wattile/buildings_processing.py", line 486, in _preprocess_data
    data = roll_data(data, configs)
  File "/wattile/wattile/buildings_processing.py", line 571, in roll_data
    means.loc[:, :] = sums.values / counts.values
ValueError: operands could not be broadcast together with shapes (121,78) (121,79) 
 [proj_wattile::wattilePythonModelPredict:155]

So it still seems the inclusion of the NA value messes up the calculation. Example data and configs sent via email.
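One possible reason the conversion alone does not help, offered as an assumption and not as a confirmed diagnosis: replacing the NA object with pandas.NA leaves the affected column as object dtype, so it may still be treated as non-numeric downstream. The sketch below again uses a hypothetical FakeNA stand-in for hxpy.haystack.na.NA and shows pandas.to_numeric with errors="coerce" as one way to end up with a float column. The thread does not say what the eventual fix in Wattile was, so this is not that fix.

```python
import pandas as pd


class FakeNA:
    """Hypothetical stand-in for hxpy.haystack.na.NA (an assumption)."""


df = pd.DataFrame({"snow_depth": [0.0, FakeNA(), 2.0]})

# Same idea as the SkySpark snippet: swap the sentinel for pandas.NA.
not_na = df.apply(lambda col: col.map(lambda v: not isinstance(v, FakeNA)))
converted = df.where(not_na, pd.NA)
print(converted["snow_depth"].dtype)  # the column is still object dtype

# Coercing to numeric turns anything non-numeric into NaN in a float column.
numeric = df.apply(pd.to_numeric, errors="coerce")
print(numeric["snow_depth"].dtype)
```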

stephen-frank commented 1 year ago

Added NA type conversion to nrelWattileExt in https://github.com/NREL/nrelWattileExt/pull/40. Issue is still present with NA type conversion in place per comment above.

haneslinger commented 1 year ago

@stephen-frank fixed as of FIX/281. Ready to close?

stephen-frank commented 1 year ago

Yes; thanks.
