KingsburyLab / pyEQL

A Python library for solution chemistry
Other
64 stars 17 forks source link

IndexError due to incorrect handling of empty oxidation state guesses #103

Closed xiaoxiaozhu123 closed 6 months ago

xiaoxiaozhu123 commented 6 months ago

I encountered an IndexError when attempting to calculate conductivity of solutions from the pyEQL library. The error occurs in the Solute.from_formula method.

my script for test:

from pyEQL import Solution
s = Solution({'Na+': '2 mmol/L', 'Br-': '2 mmol/L'})
s.equilibrate()
display({ion: c*1000 for ion, c in s.components.items()})
print('pH:', s.pH)
print('c:', s.conductivity.to('uS/cm'))

An exception occured:

File ~/miniconda3/envs/.../lib/python3.11/site-packages/pyEQL/solute.py:124, in Solute.from_formula(cls, formula)
    122     oxi_states = {els[0]: 0.0} if rform in ["O2(aq)", "O3(aq)", "Cl2(aq)", "F2(aq)"] else {}
    123 else:
--> 124     oxi_states = oxi_states[0]
    126 return cls(
    127     rform,
    128     charge=charge,
   (...)
    139     n_elements=len(els),
    140 )

IndexError: tuple index out of range

Environment:

pyEQL Version: 0.12.0 pymatgen Version: 2024.2.8 Python Version: 3.11 Operating System: MacOS 14.0 (23A344)

It seems that the problem lies with the if-else statement, Line 121-124, solute.py.

        oxi_states = pmg_ion.oxi_state_guesses(all_oxi_states=True)
        # TODO - hack to work around a pymatgen bug in Composition
        # https://github.com/materialsproject/pymatgen/issues/3324
        if oxi_states == []:
            oxi_states = {els[0]: 0.0} if rform in ["O2(aq)", "O3(aq)", "Cl2(aq)", "F2(aq)"] else {}
        else:
            oxi_states = oxi_states[0]

When pmg_ion.oxi_state_guesses fails to get any oxidate state of the element, it returns an empty tuple rather than a empty list. So I guess the if statement should be

if not oxi_states:

I am new to this project and I noticed the the TODO comment above. Are these issue relavant?

rkingsbury commented 6 months ago

Hi @xiaoxiaozhu123 , thank you for reporting! The TODO comment is related, and now that pymatgen has been updated the first part of the if statement can probably be removed. (see https://github.com/materialsproject/pymatgen/issues/3324)

However, I think there may still be a separate issue here (e.g., what happens when we cannot get an oxi state guess?).

I ran your code snippet and am able to equiilbrate the solution; it's only when trying to get the conductivity that the error occurs (as you noted). To troubleshoot further, I called [k for k,v in s.components.items()] to get the names of all the solutes, which are

['H2O(aq)', 'Br[-1]', 'Na[+1]', 'NaBr(aq)', 'H[+1]', 'OH[-1]', 'NaOH(aq)', 'O2(aq)', 'HBrO(aq)', 'BrO[-1]', 'Br(aq)', 'Br[-0.33333333]', 'H2(aq)', 'BrO3[-1]', 'BrO4[-1]']

The next steps would be 1) to figure out which of these solutes fails when you call oxi_state_guesses() on it and 2) develop a fix for pyEQL so that Solute.from_formula returns correctly even if oxi_state_guesses fails

I can work on these but I can't promise a fast turnaround; I'd welcome any assistance you're comfortable providing!