Closed aufdenkampe closed 3 years ago
Copying from https://github.com/LimnoTech/HSPsquared/issues/9#issuecomment-697012154 for our records.
From May 19 "HSP2 test files from LimnoTech" email from @steveskrip to @rheaphy, and my response, for our records:
I’m still having a hard time getting LimnoTech’s HSPF .uci files parsed through the readUCI function. I did get my python path plumbing in order, so thanks for your help with that.
I think it might be best if I send the files over your way to have a look. I imagine there are just some minor differences in string formatting that aren’t being handled. If you do notice small changes that can be made on the .uci file side, we can do that to get things moving, but I know the goal here is to handle any HSPF .uci file.
The files are in LimnoTech’s tests branch of the GitHub repository (https://github.com/LimnoTech/HSPsquared/tree/develop/tests). The two tests are GRW_Plaster and ZRW_WestIndian. Let me know if you’d like me to send them over to you another way.
My emailed response:
I’ve issued a pull request to Bob for all of Steve’s test files. See https://github.com/respec/HSPsquared/pull/34. Bob, once you review and merge this PR, you’ll be able to work with Steve’s files in your develop branch.
Bob merged PR https://github.com/respec/HSPsquared/pull/34 into their develop on May 22. See the PR conversation for some additional details.
In Bob's June 2 "HSP2 Status" email, he writes:
Last week I finished fixing the known issues with the UCI reader - but looking at the GRW_Plaster UCI file, I found tables that I had not previously found in my other test cases. On trying to make a quick fix, I found that the fix was too complicated for long-term maintenance. I rewrote a section of the code and testing has gone smoothly.
I plan to release the new version in a few hours.
From June 2-13, Bob made three commits that refactored readUCI. For the list, see https://github.com/respec/HSPsquared/pull/41.
@steveskrip, let's confirm that these fixes work for us. I merged all these updates into https://github.com/LimnoTech/HSPsquared.
@rheaphy, it looks like @steveskrip discovered some additional issues when trying to run the standard HSPF tests that @PaulDudaRESPEC suggested in his comment to respec #31: Expand & automate testing system!
@steveskrip provides detailed information in https://github.com/LimnoTech/HSPsquared/issues/16.
You'll see that the issue also includes problems with readHBN.
I'm getting an error with reading in this UCI file: https://github.com/LimnoTech/HSPsquared/blob/develop-WaterQuality-BC/tests/GLWACSO/GLWA_HSPF_June2019_Mon8MileDataFilled_WT_RW_v4.UCI
Here's a copy of the error message:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-7-e0f78d821958> in <module>
----> 1 HSP2tools.readUCI(uciname, HDFname)
~\Documents\GitHub\limno_HSPsquared\HSP2tools\readUCI.py in readUCI(uciname, hdfname)
122 if line[0:3] == 'EXT': ext(info, getlines(f))
123 if line[0:6] == 'PERLND': operation(info, getlines(f),'PERLND')
--> 124 if line[0:6] == 'IMPLND': operation(info, getlines(f),'IMPLND')
125 if line[0:6] == 'RCHRES': operation(info, getlines(f),'RCHRES')
126
~\Documents\GitHub\limno_HSPsquared\HSP2tools\readUCI.py in operation(info, llines, op)
374 history[dpath[op,table],dcat[op,table]].append((table,df))
375
--> 376 (_,df) = history['GENERAL','INFO'][0]
377 valid = set(df.index)
378 for path,cat in history:
IndexError: list index out of range
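For readers following the traceback above: indexing `[0]` on an empty list is what produces `IndexError: list index out of range`. The sketch below assumes, without confirmation from the readUCI source, that `history` behaves like a `defaultdict(list)` keyed by `(path, category)`. The helper `first_table` and the sample entries are hypothetical and only show how a missing GENERAL/INFO table could be reported more clearly.

```python
from collections import defaultdict


def first_table(history, path, cat):
    """Return the first (table, df) pair stored under (path, cat).

    Raises a descriptive ValueError instead of a bare IndexError when
    no table was recorded for that key.
    """
    entries = history[path, cat]
    if not entries:
        raise ValueError(f"no {path}/{cat} table was parsed for this operation block")
    return entries[0]


# Hypothetical parse state: one table was recorded, but not GENERAL/INFO.
history = defaultdict(list)
history["PWATER", "PARAMETERS"].append(("PWAT-PARM1", "placeholder"))

try:
    first_table(history, "GENERAL", "INFO")
except ValueError as err:
    message = str(err)
```

Whether the right response is to raise, skip the block, or fill in defaults depends on how the UCI file is expected to be structured, which this thread does not settle.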
I'm getting errors with reading in this WDM file: https://github.com/LimnoTech/HSPsquared/blob/develop-WaterQuality-BC/tests/GLWACSO/KDTWMet-06272019-KOS_w_Mon17Filled_CHLA_ComDO.wdm
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 147
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 147
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 147
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\construction.py in _list_to_arrays(data, columns, coerce_float, dtype)
563 try:
--> 564 columns = _validate_or_indexify_columns(content, columns)
565 result = _convert_object_array(content, dtype=dtype, coerce_float=coerce_float)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\construction.py in _validate_or_indexify_columns(content, columns)
688 raise AssertionError(
--> 689 f"{len(columns)} columns passed, passed data had "
690 f"{len(content)} columns"
AssertionError: 9 columns passed, passed data had 10 columns
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
<ipython-input-11-a67c96be9d33> in <module>
----> 1 HSP2tools.readWDM('KDTWMet-06272019-KOS_w_Mon17Filled_CHLA_ComDO.wdm', HDFname)
~\Documents\GitHub\limno_HSPsquared\HSP2tools\readWDM.py in readWDM(wdmfile, hdffile)
118
119
--> 120 dfsummary = pd.DataFrame(summary, index=summaryindx, columns=columns)
121 store.put('TIMESERIES/SUMMARY',dfsummary, format='t', data_columns=True)
122 return dfsummary
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
507 if is_named_tuple(data[0]) and columns is None:
508 columns = data[0]._fields
--> 509 arrays, columns = to_arrays(data, columns, dtype=dtype)
510 columns = ensure_index(columns)
511
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\construction.py in to_arrays(data, columns, coerce_float, dtype)
522 return [], [] # columns if columns is not None else []
523 if isinstance(data[0], (list, tuple)):
--> 524 return _list_to_arrays(data, columns, coerce_float=coerce_float, dtype=dtype)
525 elif isinstance(data[0], abc.Mapping):
526 return _list_of_dict_to_arrays(
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\construction.py in _list_to_arrays(data, columns, coerce_float, dtype)
565 result = _convert_object_array(content, dtype=dtype, coerce_float=coerce_float)
566 except AssertionError as e:
--> 567 raise ValueError(e) from e
568 return result, columns
569
ValueError: 9 columns passed, passed data had 10 columns
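The final `ValueError` above is what pandas raises when a list of row tuples is wider than the list of column names passed to `pd.DataFrame`. A minimal reproduction, using made-up column names and values rather than the real readWDM summary fields, plus a hypothetical helper `check_row_widths` for finding the offending rows:

```python
import pandas as pd

# Illustrative names and values only, not the actual readWDM summary layout.
columns = ["DSN", "TSTEP", "TCODE"]
summary = [(39, 1, 3, "extra")]  # one row with four fields, one more than columns

try:
    pd.DataFrame(summary, columns=columns)
    error_message = None
except ValueError as err:
    error_message = str(err)


def check_row_widths(rows, columns):
    """Return the indices of rows whose length differs from len(columns)."""
    return [i for i, row in enumerate(rows) if len(row) != len(columns)]


bad_rows = check_row_widths(summary, columns)
```

A check like this before building the frame would point at the specific data set whose attribute list has an unexpected extra field.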
@aufdenkampe @steveskrip @bcous I just checked in some refinements to the developWaterQuality branch that I believe resolve issues with reading UCI, WDM, and HBN files -- you might want to try them out!
@PaulDudaRESPEC, thank you!
@steveskrip & @bcous, I merged all this into LimnoTech's develop-WaterQuality and develop-WaterQuality-BC branches.
Unfortunately, I had a merge conflict when I tried to cherry-pick the individual commit into our develop branches.
@PaulDudaRESPEC, since we all just decided to focus on Water Quality modules, I'm wondering if it's time we merge all WaterQuality into develop and then delete the develop-WaterQuality branch. That would simplify the git tracking substantially for me (and all of us). What do you think?
@aufdenkampe , I'm on board with having only one development branch during this current effort.
I have tested this with the same files as yesterday, and it appears to be working better. readUCI completed with no problem on the file I linked yesterday. There may still be an issue with the readWDM. It reads in 3 of the files correctly, but appears to hang up on this WDM: https://github.com/LimnoTech/HSPsquared/blob/develop-WaterQuality-BC/tests/GLWACSO/RPO_SWMM48LINKS2017_wCBOD_June2019.wdm
When running in Jupyter notebooks, it never completes. It appears to add timeseries to the .h5 file (file is larger after it starts running), but it never updates the summary table. Let me know if you want to see the .h5 files and I can find a way to transfer them to you.
Thanks @bcous for testing again!
That's great news that readUCI worked on your file!
@steveskrip, could you test those other files? (But do it from the develop-WaterQuality branch.)
It's good to know that readWDM did better. @PaulDudaRESPEC, any ideas?
Tomorrow morning, I'll work with @PaulDudaRESPEC to merge water quality into develop.
Hi @PaulDudaRESPEC --
I was chatting with @aufdenkampe about this issue. He suggested that it might be related to the 15-minute data in the WDM file. I checked and in at least 2 of the other WDMs that were read in there were timeseries with 15-minute flow data included as well. Let me know if you want to chat about specifics further.
Thanks, @bcous , that's good to know. I've asked Jack, the WDM guru, to take a look.
Circling back to this one... Jack took a look and noted that at least one of the problematic data sets, DSN 772, appears to have been compiled at various time steps -- daily, 15min, and annual, all in the same timeseries. Looks like the old WDM Fortran code knows how to deal with that, but not the python code. Until we have a fix, I suggest a work-around might be to build the data set from scratch at a 15min time step throughout.
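One way to spot a data set like the one described above, once it has been loaded into pandas, is to list the distinct spacings between consecutive timestamps. This is a sketch on synthetic data, not the actual contents of DSN 772, and `distinct_steps` is a hypothetical helper name.

```python
import pandas as pd


def distinct_steps(index):
    """Return the sorted distinct spacings between consecutive timestamps."""
    deltas = pd.Series(index).diff().dropna()
    return sorted(pd.Timedelta(d) for d in deltas.unique())


# Synthetic index: three daily values followed by four 15-minute values.
daily = pd.date_range("2017-01-01", periods=3, freq="D")
quarter_hour = pd.date_range("2017-01-04", periods=4, freq="15min")
mixed = daily.append(quarter_hour)

steps = distinct_steps(mixed)
is_regular = len(steps) == 1
```

A series with more than one distinct step is a candidate for the rebuild-at-15-minutes work-around suggested above.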
@PaulDudaRESPEC, that is very helpful to know. Thank you!
@PaulDudaRESPEC, any updates on whether you or Jack might be able to fix readWDM.py to read files with many different time intervals? We're trying to pick up an old HSPF model created by others, so we can't rebuild the files from scratch.
Jack is looking at it. I think he's on the trail, but we haven't solved it yet.
My thought about rebuilding the files from scratch is that you could list the problematic timeseries in something like the SARA Timeseries Utility, save the list to a text file, and then re-import the data from the text file. But I'm not sure if you'd lose anything critical in the process.
@aufdenkampe and @bcous I know it has been a while since we've provided any news on this issue. Jack is continuing to work on it. This morning he committed a change to readWDM.py (in the develop branch) -- this new version definitely helps, but we're not sure it totally solves the issue; more testing is in order. As a general explanation of what's going on, it looks like there's a compression functionality in the Fortran WDM code that wasn't implemented in the python port -- WDM files that use that functionality are much, much larger when converted to HDF5 files -- perhaps underappreciated design elements of that old code!
@PaulDudaRESPEC, thanks for the update, and thanks to you and @jlkittle for your first round of fixes with dddd759681bb28fce611e82eede470ec4945244c and f190fd8c3067a32245e3d363379b3844617a96bf!
That's really interesting to hear that it's connected to different compression routines in the Fortran WDM code. We noticed with @bcous's project that those WDM files created massively bigger HDF5 files. I've been thinking that we might be able to do better with the HDF5 compression. In fact, the last work by @rheaphy included exploring better HSP2 performance by using BLOSC compression with the HDF5 files, as he described here: https://github.com/respec/HSPsquared/issues/36#issuecomment-697107682. It might be useful to pick up where he left off.
I was trying to use readUCI on this file: https://github.com/LimnoTech/HSPsquared/blob/develop-WaterQuality-BC/tests/GLWACSO/model_files/GLWA_HSPF_June2019_Mon8MileDataFilled_WT_RW_v4.UCI
The following error messages came up when I tried to run it.
Thanks,
Brendan
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-b83acb602a13> in <module>
----> 1 get_ipython().run_line_magic('timeit', 'HSP2tools.readUCI(uciname, HDFname)')
~\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
2325 kwargs['local_ns'] = self.get_local_scope(stack_depth)
2326 with self.builtin_trap:
-> 2327 result = fn(*args, **kwargs)
2328 return result
2329
<decorator-gen-54> in timeit(self, line, cell, local_ns)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\magic.py in <lambda>(f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\magics\execution.py in timeit(self, line, cell, local_ns)
1167 for index in range(0, 10):
1168 number = 10 ** index
-> 1169 time_number = timer.timeit(number)
1170 if time_number >= 0.2:
1171 break
~\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\magics\execution.py in timeit(self, number)
167 gc.disable()
168 try:
--> 169 timing = self.inner(it, self.timer)
170 finally:
171 if gcold:
<magic-timeit> in inner(_it, _timer)
~\Documents\GitHub\limno_HSPsquared\HSP2tools\readUCI.py in readUCI(uciname, hdfname)
143 if line[0:6] == 'PERLND': operation(info, getlines(f),'PERLND')
144 if line[0:6] == 'IMPLND': operation(info, getlines(f),'IMPLND')
--> 145 if line[0:6] == 'RCHRES': operation(info, getlines(f),'RCHRES')
146
147 colnames = ('AFACTR', 'MFACTOR', 'MLNO', 'SGRPN', 'SMEMN', 'SMEMSB',
~\Documents\GitHub\limno_HSPsquared\HSP2tools\readUCI.py in operation(info, llines, op)
566 df = concat([temp[1] for temp in history[path, cat]], axis='columns')
567 df = fix_df(df, op, path, ddfaults, valid)
--> 568 df.to_hdf(store, f'{op}/{path}/{cat}{count}', data_columns=True)
569 else:
570 print('UCI TABLE is not understood (yet) by readUCI', op, cat)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in to_hdf(self, path_or_buf, key, mode, complevel, complib, append, format, index, min_itemsize, nan_rep, dropna, data_columns, errors, encoding)
2447 data_columns=data_columns,
2448 errors=errors,
-> 2449 encoding=encoding,
2450 )
2451
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in to_hdf(path_or_buf, key, value, mode, complevel, complib, append, format, index, min_itemsize, nan_rep, dropna, data_columns, errors, encoding)
268 path_or_buf, mode=mode, complevel=complevel, complib=complib
269 ) as store:
--> 270 f(store)
271 else:
272 f(path_or_buf)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in <lambda>(store)
260 data_columns=data_columns,
261 errors=errors,
--> 262 encoding=encoding,
263 )
264
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in put(self, key, value, format, index, append, complib, complevel, min_itemsize, nan_rep, data_columns, encoding, errors, track_times)
1127 encoding=encoding,
1128 errors=errors,
-> 1129 track_times=track_times,
1130 )
1131
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in _write_to_group(self, key, value, format, axes, index, append, complib, complevel, fletcher32, min_itemsize, chunksize, expectedrows, dropna, nan_rep, data_columns, encoding, errors, track_times)
1799 nan_rep=nan_rep,
1800 data_columns=data_columns,
-> 1801 track_times=track_times,
1802 )
1803
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in write(self, obj, axes, append, complib, complevel, fletcher32, min_itemsize, chunksize, expectedrows, dropna, nan_rep, data_columns, track_times)
4236 min_itemsize=min_itemsize,
4237 nan_rep=nan_rep,
-> 4238 data_columns=data_columns,
4239 )
4240
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in _create_axes(self, axes, obj, validate, nan_rep, data_columns, min_itemsize)
3863
3864 blocks, blk_items = self._get_blocks_and_items(
-> 3865 block_obj, table_exists, new_non_index_axes, self.values_axes, data_columns
3866 )
3867
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in _get_blocks_and_items(block_obj, table_exists, new_non_index_axes, values_axes, data_columns)
3986 blk_items = get_blk_items(mgr, blocks)
3987 for c in data_columns:
-> 3988 mgr = block_obj.reindex([c], axis=axis)._mgr
3989 blocks.extend(mgr.blocks)
3990 blk_items.extend(get_blk_items(mgr, mgr.blocks))
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
307 @wraps(func)
308 def wrapper(*args, **kwargs) -> Callable[..., Any]:
--> 309 return func(*args, **kwargs)
310
311 kind = inspect.Parameter.POSITIONAL_OR_KEYWORD
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in reindex(self, *args, **kwargs)
4030 kwargs.pop("axis", None)
4031 kwargs.pop("labels", None)
-> 4032 return super().reindex(**kwargs)
4033
4034 def drop(
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in reindex(self, *args, **kwargs)
4460 # perform the reindex on the axes
4461 return self._reindex_axes(
-> 4462 axes, level, limit, tolerance, method, fill_value, copy
4463 ).__finalize__(self, method="reindex")
4464
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
3871 if columns is not None:
3872 frame = frame._reindex_columns(
-> 3873 columns, method, copy, level, fill_value, limit, tolerance
3874 )
3875
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _reindex_columns(self, new_columns, method, copy, level, fill_value, limit, tolerance)
3919 copy=copy,
3920 fill_value=fill_value,
-> 3921 allow_dups=False,
3922 )
3923
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
4528 fill_value=fill_value,
4529 allow_dups=allow_dups,
-> 4530 copy=copy,
4531 )
4532 # If we've made a copy once, no need to make another one
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\managers.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy, consolidate)
1274 # some axes don't allow reindexing with dups
1275 if not allow_dups:
-> 1276 self.axes[axis]._can_reindex(indexer)
1277
1278 if axis >= self.ndim:
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in _can_reindex(self, indexer)
3287 # trying to reindex on an axis with duplicates
3288 if not self.is_unique and len(indexer):
-> 3289 raise ValueError("cannot reindex from a duplicate axis")
3290
3291 def reindex(self, target, method=None, level=None, limit=None, tolerance=None):
ValueError: cannot reindex from a duplicate axis
@bcous , just posted a fix for UCI's with multiple GQUALs -- fixes the problem you reported yesterday.
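The traceback above ends with pandas refusing to reindex a frame whose column labels are not unique. The sketch below shows how duplicate labels can arise from a column-wise `concat` and how to detect them. It assumes, without confirmation from the thread, that tables repeated for multiple GQUALs share column names. The column names, values, and the `dedupe` helper are illustrative, and this is not necessarily how the posted fix works.

```python
import pandas as pd

# Two illustrative tables with the same column names, as might occur
# when a table is repeated once per quality constituent.
first = pd.DataFrame({"QUALID": ["F.COLIFORM"], "SQO": [0.1]}, index=["P301"])
second = pd.DataFrame({"QUALID": ["TSS"], "SQO": [0.2]}, index=["P301"])

combined = pd.concat([first, second], axis="columns")

# Labels that occur more than once on the column axis.
duplicates = sorted(set(combined.columns[combined.columns.duplicated()]))


def dedupe(columns):
    """Append a running count to the second and later repeats of each label."""
    seen = {}
    unique = []
    for name in columns:
        seen[name] = seen.get(name, 0) + 1
        unique.append(name if seen[name] == 1 else f"{name}{seen[name]}")
    return unique


combined.columns = dedupe(combined.columns)
```

Once the labels are unique, writing with `data_columns=True` no longer hits the duplicate-axis check.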
Doing additional testing and ran into an error in running HSP2.main. Error codes listed below:
2021-03-08 11:16:00.66 Processing started for file GLWA_HSPF_June2019_Mon8MileDataFilled_WT_RW_v4.h5; saveall=True
2021-03-08 11:16:02.67 Simulation Start: 2017-05-01 00:00:00, Stop: 2017-11-01 00:00:00
2021-03-08 11:16:02.67 PERLND P301 DELT(minutes): 15
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-13-63be5facfe88> in <module>
----> 1 HSP2.main(hdfname,saveall=True)
~\Documents\GitHub\limno_HSPsquared\HSP2\main.py in main(hdfname, saveall, jupyterlab)
49
50 # now conditionally execute all activity modules for the op, segment
---> 51 ts = get_timeseries(store,ddext_sources[(operation,segment)],siminfo)
52 flags = uci[(operation, 'GENERAL', segment)]['ACTIVITY']
53 if operation == 'RCHRES':
~\Documents\GitHub\limno_HSPsquared\HSP2\main.py in get_timeseries(store, ext_sourcesdd, siminfo)
204 if row.MFACTOR != 1.0:
205 temp1 *= row.MFACTOR
--> 206 t = transform(temp1, row.TMEMN, row.TRAN, siminfo)
207
208 tname = f'{row.TMEMN}{row.TMEMSB}'
~\Documents\GitHub\limno_HSPsquared\HSP2\utilities.py in transform(ts, name, how, siminfo)
78 pass
79 elif tsfreq == None: # Sparse time base, frequency not defined
---> 80 ts = ts.reindex(siminfo['tbase']).ffill().bfill()
81 elif how == 'SAME':
82 ts = ts.resample(freq).ffill() # tsfreq >= freq assumed, or bad user choice
KeyError: 'tbase'
@PaulDudaRESPEC and @bcous, the tbase error that @bcous shared in the previous comment was introduced by our work in our develop-readWDM branch as described in https://github.com/LimnoTech/HSPsquared/issues/21.
The issue has since been fixed, but we found yet another issue in that branch that we are presently working on fixing.
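For context on the failing line in `transform`: it reindexes a sparse series onto a simulation time base and fills the gaps forward, then backward. The sketch below reproduces that pattern on toy data. The `start`, `stop`, and `delt` keys and the way `tbase` is built here are assumptions for illustration, since the thread does not show how HSP2 populates `siminfo`.

```python
import pandas as pd

# Hypothetical simulation info; only the 'tbase' key appears in the traceback.
siminfo = {"start": "2017-05-01 00:00", "stop": "2017-05-01 01:00", "delt": 15}

# Regular time base at the simulation step (both endpoints included here).
siminfo["tbase"] = pd.date_range(
    siminfo["start"], siminfo["stop"], freq=f"{siminfo['delt']}min"
)

# Sparse input series with no defined frequency.
sparse = pd.Series(
    [1.0, 3.0],
    index=pd.to_datetime(["2017-05-01 00:15", "2017-05-01 00:45"]),
)

# Same pattern as the failing line: align to the time base, then fill gaps.
filled = sparse.reindex(siminfo["tbase"]).ffill().bfill()
```

If `siminfo` has no `tbase` entry, the lookup fails with the `KeyError` shown above before any reindexing happens.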
With the recent successful Rewrite readWDM.py to read by data group & block #21, we can properly read all WDM files that we've tested, including those with irregular time series.
All other readUCI issues have been addressed, to our knowledge.
Getting HSP2 to Handle irregular time series input #51 is a separate issue
Closing this issue as we will merge PR #35 (Merge develop_readWDM into develop to read time series by block & group #35) as soon as we resolve a merge conflict.
Hi, The reason I didn't implement compression originally was that HDFView and other third party tools required "registration" of compression algorithms which was so poorly documented that I thought this would be hard for most hydrologists. I expected that the improvements to HDFView would make this either easy or automatic. I didn't want people frustrated that they couldn't view their HDF5 files with standard tools. I have been tracking the HDF tools created for JupyterLab but their progress has been slow. Compression is easy using Pandas/pytables. Bob
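As a rough illustration of the point that compression is easy with pandas and PyTables, the sketch below writes the same frame with and without Blosc compression and compares file sizes. It requires the `tables` package. The key `TIMESERIES/TS001`, the helper names, and the compression level are assumptions for illustration, not settings used by HSP2.

```python
import os
import tempfile

import pandas as pd


def write_timeseries(df, path, key, compress=True):
    """Write df to an HDF5 file, optionally with Blosc compression."""
    kwargs = {"complib": "blosc", "complevel": 9} if compress else {}
    df.to_hdf(path, key=key, mode="a", format="table", **kwargs)


def compare_sizes(df, key="TIMESERIES/TS001"):
    """Write df both ways to temporary files; return (plain_bytes, packed_bytes)."""
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "plain.h5")
        packed = os.path.join(tmp, "packed.h5")
        write_timeseries(df, plain, key, compress=False)
        write_timeseries(df, packed, key, compress=True)
        return os.path.getsize(plain), os.path.getsize(packed)
```

Calling `compare_sizes` on a representative time series frame gives a quick sense of the savings. As noted above, whether third-party viewers such as HDFView can open Blosc-compressed files is a separate consideration.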
This spring @steveskrip noticed that many UCI files successfully used by LimnoTech with HSPF (and created by LimnoTech's WinModel package) would not import with readUCI.
@rheaphy also noted that there might be time issues in UCI files, because HSPF doesn't really correctly manage time and for HSP2, we're using ISO time standards that track leap seconds and time zones.
Let's use this issue thread to track @rheaphy's work to improve readUCI, and our results with testing it.