modflowpy / flopy

A Python package to create, run, and post-process MODFLOW-based models.
https://flopy.readthedocs.io

bug: pandas read in `mfdataplist` throwing a warning #2190

Closed mnfienen closed 3 months ago

mnfienen commented 4 months ago

Describe the bug: Reading a standard external stress period file for the well package throws the following warning: flopy/mf6/data/mfdataplist.py#line=1141: ParserWarning: Length of header or names does not match length of data. This leads to a loss of data with index_col=False.

This is due to a mismatch between the assumed column names and the number of columns read in by pandas.read_csv. It's not clear to me how self._header_names gets assigned, and that is what is used to assign the names.
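The mismatch can be reproduced outside flopy with a minimal sketch. It assumes the read passes index_col=False (which the warning text suggests) and uses hypothetical in-memory data with five values per row against four column names; the exact arguments flopy passes to pandas.read_csv are not shown in this thread.

```python
import io
import warnings

import pandas as pd

# Hypothetical stand-in for one stress-period file: five values per row.
data = "1 10 17 -150.0 0.0\n1 12 9 -200.0 0.0\n"

# Only four names are supplied, one fewer than the number of data columns.
names = ["layer", "row", "column", "q"]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df = pd.read_csv(
        io.StringIO(data),
        sep=r"\s+",
        header=None,
        names=names,
        index_col=False,  # assumption: mirrors the setting named in the warning
    )

# Keep only the pandas parser warnings that were raised during the read.
parser_warnings = [
    w for w in caught if issubclass(w.category, pd.errors.ParserWarning)
]
for w in parser_warnings:
    print(w.message)

# The fifth data column has no name and is dropped from the result.
print(df.columns.tolist())
```

With these inputs the fifth column is silently dropped from the DataFrame, which is the data loss the warning refers to.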

To Reproduce Steps to reproduce the behavior:

  1. You can download the model from https://github.com/gmdsi/GMDSI_notebooks/tree/main/models/monthly_model_files_1lyr_newstress

  2. Try to load the simulation with flopy.mf6.MFSimulation.load()

  3. Behold - warnings. (confirmed this is occurring when reading the wel package)

Expected behavior: No warning. The data seem to be read correctly, so the load shouldn't warn.

Screenshots One per stress period :)


spaulins-usgs commented 4 months ago

@mnfienen, those parse warnings are occurring when flopy is loading your wel package's stress period data. For simple list data like this, flopy constructs a data header for your data,

  layer row column q 

and then calls pandas.read_csv, passing in that data header and the file containing your data. You are seeing warning messages because your data has more columns than the header, which causes pandas.read_csv to emit ParserWarnings. Here is the first line of your wel period data:

  1 10 17 -150.0 0.0

As you can see above, the data header has 4 columns, but your data has 5. Am I correct that the extra 0.0 on the end is an auxiliary variable? If so, add something like this:

 AUXILIARY MY_AUX_VAR

to the Options block of your freyberg6.wel file. Flopy will then load the last column of data into your auxiliary variable, and pandas.read_csv will not complain about the extra column.
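The effect of declaring the auxiliary variable can be illustrated with pandas alone. This is a sketch, not flopy's code: it assumes that declaring AUXILIARY MY_AUX_VAR amounts to adding a fifth name to the header, and the second data row and the lowercase column name are made up for illustration.

```python
import io
import warnings

import pandas as pd

# First row is taken from the issue; the second row is invented.
data = "1 10 17 -150.0 0.0\n1 12 9 -200.0 0.0\n"

# Header extended with one name for the auxiliary column (hypothetical name).
names = ["layer", "row", "column", "q", "my_aux_var"]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df = pd.read_csv(
        io.StringIO(data),
        sep=r"\s+",
        header=None,
        names=names,
        index_col=False,
    )

# Names and data columns now agree, so no ParserWarning is expected.
parser_warnings = [
    w for w in caught if issubclass(w.category, pd.errors.ParserWarning)
]
print(len(parser_warnings))

# The trailing value of each row lands in the auxiliary column.
print(df["my_aux_var"].tolist())
```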

There may be a way for flopy to turn off these pandas parser warnings. However, since they warn you of a potentially real problem (flopy not loading your auxiliary data), I think they are useful. They would be more useful if the pandas warning messages specified which data file is being loaded and which data columns are expected. Unfortunately these ParserWarnings are all internal to pandas, but I will look into finding a way for flopy to detect the warnings and include the file path at the end of the warnings.

If you still think these warnings should be removed, or have other suggestions on how to make these warnings more useful let me know.

spaulins-usgs commented 3 months ago

@mnfienen, flopy now traps pandas parser warnings and displays contextual information with the warning including what type of data is being written and what file is being written to. Also, setting flopy's verbosity level to 0 will turn off the parser warnings.
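One way to trap a pandas parser warning and attach context is sketched below. This is not flopy's implementation: read_list_data, its label argument, and the file name used in the example are hypothetical, and the message format is made up for illustration.

```python
import io
import warnings

import pandas as pd


def read_list_data(source, names, label):
    """Hypothetical helper: read list data and collect any ParserWarning
    together with the file label and the expected column names."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        df = pd.read_csv(
            source, sep=r"\s+", header=None, names=names, index_col=False
        )
    # Build one contextual message per trapped parser warning.
    messages = [
        f"{w.message} (while reading {label}; "
        f"expected columns: {', '.join(names)})"
        for w in caught
        if issubclass(w.category, pd.errors.ParserWarning)
    ]
    return df, messages


# Example with an invented file label and in-memory data.
data = io.StringIO("1 10 17 -150.0 0.0\n")
df, messages = read_list_data(
    data, ["layer", "row", "column", "q"], "wel_period_1.txt"
)
for msg in messages:
    print(msg)
```

A caller could then print or suppress the collected messages depending on a verbosity setting, similar in spirit to the verbosity level mentioned above.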

mnfienen commented 3 months ago

awesome - thanks for getting after that @spaulins-usgs !