adelq / thermochem

Useful Python modules for Thermodynamics and Thermochemistry
http://thermochem.readthedocs.io

Adjust usecols length to be within bounds, Pandas deprecation #28

Closed djbower closed 7 months ago

djbower commented 8 months ago

A note about this is already in the code (janaf.py, line 80), and indeed the latest versions of Pandas are not compatible with thermochem. In some cases the data can still be parsed (for example, H2O in the gas phase), but H2O in the liquid phase raises a ParserError. A simple example is below. (Also see https://github.com/pandas-dev/pandas/issues/48127)

>>> from thermochem import janaf
>>> db = janaf.Janafdb()
>>> db.getphasedata(formula='H2O', phase='l')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dan/Programs/atmodeller/.venv/lib/python3.10/site-packages/thermochem/janaf.py", line 353, in getphasedata
    return JanafPhase(textdata)
  File "/Users/dan/Programs/atmodeller/.venv/lib/python3.10/site-packages/thermochem/janaf.py", line 80, in __init__
    data = pd.read_csv(
  File "/Users/dan/Programs/atmodeller/.venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 912, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/Users/dan/Programs/atmodeller/.venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 577, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/Users/dan/Programs/atmodeller/.venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1407, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/Users/dan/Programs/atmodeller/.venv/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1679, in _make_engine
    return mapping[engine](f, **self.options)
  File "/Users/dan/Programs/atmodeller/.venv/lib/python3.10/site-packages/pandas/io/parsers/python_parser.py", line 124, in __init__
    ) = self._infer_columns()
  File "/Users/dan/Programs/atmodeller/.venv/lib/python3.10/site-packages/pandas/io/parsers/python_parser.py", line 560, in _infer_columns
    columns = self._handle_usecols([names], names, num_original_columns)
  File "/Users/dan/Programs/atmodeller/.venv/lib/python3.10/site-packages/pandas/io/parsers/python_parser.py", line 610, in _handle_usecols
    raise ParserError(
pandas.errors.ParserError: Defining usecols without of bounds indices is not allowed. [1, 2, 3, 4, 5, 6, 7] are out of bounds.
>>> db.getphasedata(formula='H2O', phase='g')
<thermochem.janaf.JanafPhase object at 0x7fb4c4e6a080>
>>>

djbower commented 8 months ago

Based on the above, Pandas 1.3.5 is the last version supported by thermochem. Perhaps this could be noted in the README dependencies for clarity until a fix is issued? (0.17.0 <= pandas <= 1.3.5)

>>> from thermochem import janaf
>>> db = janaf.Janafdb()
>>> db.getphasedata(formula='H2O', phase='l')
/Users/dan/Programs/atmodeller/.venv/lib/python3.10/site-packages/pandas/util/_decorators.py:311: FutureWarning: Defining usecols with out of bounds indices is deprecated and will raise a ParserError in a future version.
  return func(*args, **kwargs)
<thermochem.janaf.JanafPhase object at 0x7fabc9a33280>
>>>

ZGainsforth commented 8 months ago

Yes. I think it would be good to fix this, though, since most folks will be using a newer Pandas, and with a code modification thermochem should work fine with it. Would you want to submit a pull request for this too?
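For anyone following along, here is a minimal sketch of what such a code modification might look like. It is not the code in janaf.py: the column names, the separator, and the helpers `usecols_within_bounds` and `read_table` are all hypothetical, and it assumes the first row is representative of the table's width. The idea is simply to never pass `usecols` indices beyond the columns that are actually present.

```python
import io
import re

# Illustrative column names only (an assumption, not copied from janaf.py).
WANTED_COLUMNS = ["T", "Cp", "S", "H"]
SEPARATOR = r"\s+"


def usecols_within_bounds(first_row: str, wanted: int) -> list[int]:
    """Return indices 0..wanted-1, clipped to the columns present in first_row."""
    available = len(re.split(SEPARATOR, first_row.strip()))
    return list(range(min(wanted, available)))


def read_table(text: str):
    """Parse whitespace-separated text without requesting missing columns."""
    # Imported here so the helper above can be used without pandas installed.
    import pandas as pd

    # Assumption: the first row has the same width as the rest of the table.
    first_row = text.splitlines()[0]
    cols = usecols_within_bounds(first_row, len(WANTED_COLUMNS))
    return pd.read_csv(
        io.StringIO(text),
        sep=SEPARATOR,
        engine="python",
        header=None,
        names=[WANTED_COLUMNS[i] for i in cols],  # keep names and usecols the same length
        usecols=cols,
    )
```

With this sketch, a table that has only three columns should come back with columns `T`, `Cp` and `S` instead of raising, while wider rows (for example, rows with trailing comments) are still trimmed to the wanted columns.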

djbower commented 8 months ago

Yes, I'm happy to contribute a pull request once I have something working and tested. I've rolled back to an earlier version of Pandas for my current projects due to time constraints, but I will endeavour to provide a fix soon. Thanks.
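In case it helps others who are pinned to an older Pandas in the meantime, the constraint can be made explicit at start-up. This is a rough sketch only: the bounds are the ones suggested earlier in this thread, and `parse_release`, `pandas_supported` and `check_pandas` are hypothetical helper names, not part of thermochem.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Bounds taken from the range suggested in this thread (0.17.0 <= pandas <= 1.3.5).
MIN_PANDAS = (0, 17, 0)
MAX_PANDAS = (1, 3, 5)


def parse_release(text: str) -> tuple[int, ...]:
    """Turn '1.3.5', '1.4' or '2.1.0rc1' into a (major, minor, patch) tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    # A missing patch component is treated as 0.
    return tuple(int(group or 0) for group in match.groups())


def pandas_supported(installed: str) -> bool:
    """True if the given version string falls inside the assumed bounds."""
    return MIN_PANDAS <= parse_release(installed) <= MAX_PANDAS


def check_pandas() -> None:
    """Raise RuntimeError if the installed pandas is outside the assumed bounds."""
    try:
        installed = version("pandas")
    except PackageNotFoundError as exc:
        raise RuntimeError("pandas is not installed") from exc
    if not pandas_supported(installed):
        raise RuntimeError(
            f"pandas {installed} is outside the range known to work here"
        )
```

Calling `check_pandas()` once before using `janaf` turns a confusing parser traceback into a clear message about the version mismatch.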

djbower commented 7 months ago

Fix provided in pull request #30.