KingsburyLab / pyEQL

A Python library for solution chemistry

Suggestion for improvement on IonDB formula recognition #136

Closed xiaoxiaozhu123 closed 1 week ago

xiaoxiaozhu123 commented 2 months ago

pyEQL.utils.standardize_formula currently uses pymatgen.Ion.reduced_formula to standardize the formulas of ions. Although the package already handles some special cases, such as acetate (CH3COO[-1]), standardize_formula still rewrites some commonly used formulas into forms that are not widely accepted. Here are some examples and suggestions: pyEQL standardize formula - Jupyter Notebook

rkingsbury commented 2 months ago

Excellent suggestions, thank you for reporting @xiaoxiaozhu123. One of the reasons I chose to implement standardize_formula, rather than directly use the underlying pymatgen method, is to give us the ability within pyEQL to customize the handling of special cases (like these) when it doesn't make sense to change the handling upstream.

So for the examples you mention, I think we should modify standardize_formula to simply override the output of Ion.composition.reduced_formula and return the expected formulae. After that, we can also see about adding these rules to pymatgen upstream. If they are merged, we can later remove the rules from pyEQL.

Would you be comfortable opening a PR in pyEQL to implement these changes? If not, could you write a code block and paste it here that I can add to standardize_formula (basically a series of if statements), e.g.

if rform == 'H4N[+1]':
    rform = 'NH4[+1]'
elif rform == ....
...
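
As an alternative to a chain of if statements, the same overrides could be kept in a lookup table. This is only a minimal sketch: the names `FORMULA_OVERRIDES` and `apply_overrides` are hypothetical and not part of pyEQL, and the ammonium entry is the only rule taken from this thread.

```python
# Hypothetical override table (not pyEQL's actual implementation).
# Keys: reduced formulas as returned upstream; values: the conventional spelling.
# Only the ammonium rule comes from this thread; others would be added here.
FORMULA_OVERRIDES = {
    "H4N[+1]": "NH4[+1]",
}


def apply_overrides(rform: str) -> str:
    """Return the conventional formula for rform, or rform unchanged if no rule applies."""
    return FORMULA_OVERRIDES.get(rform, rform)


print(apply_overrides("H4N[+1]"))  # NH4[+1]
print(apply_overrides("Na[+1]"))   # Na[+1]
```

With a table, adding or later removing a rule (for example, once it is merged upstream) is a one-line change.
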
rkingsbury commented 2 months ago

Changing the behavior of query will be more difficult, but perhaps I could take a similar approach as with standardize_formula, where I implement a custom query method within pyEQL that wraps the underlying query from maggma, but first standardizes any formulae.
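
A rough sketch of that wrapping idea, under stated assumptions: `InMemoryStore` is a toy stand-in and not the maggma API, `query_standardized` and `toy_standardize` are hypothetical names, and the `"formula"` key and the sample documents are made up for illustration.

```python
from typing import Callable, Iterable, Iterator


class InMemoryStore:
    """Toy stand-in for a document store (NOT the maggma Store API)."""

    def __init__(self, docs: Iterable[dict]) -> None:
        self._docs = list(docs)

    def query(self, criteria: dict) -> Iterator[dict]:
        # Yield every document whose fields equal all of the criteria values.
        for doc in self._docs:
            if all(doc.get(key) == value for key, value in criteria.items()):
                yield doc


def toy_standardize(formula: str) -> str:
    """Placeholder for a real formula standardizer; handles one case only."""
    return {"H4N[+1]": "NH4[+1]"}.get(formula, formula)


def query_standardized(
    store: InMemoryStore,
    criteria: dict,
    standardize: Callable[[str], str],
    formula_key: str = "formula",  # assumed field name, for illustration
) -> Iterator[dict]:
    """Standardize the formula in the criteria, then delegate to store.query."""
    criteria = dict(criteria)  # copy so the caller's dict is not mutated
    if formula_key in criteria:
        criteria[formula_key] = standardize(criteria[formula_key])
    return store.query(criteria)


store = InMemoryStore(
    [
        {"formula": "NH4[+1]", "name": "ammonium"},
        {"formula": "Na[+1]", "name": "sodium"},
    ]
)
results = list(query_standardized(store, {"formula": "H4N[+1]"}, toy_standardize))
print(results)  # [{'formula': 'NH4[+1]', 'name': 'ammonium'}]
```

The point of the wrapper is that callers can pass either spelling of a formula and still match the stored documents, as long as the stored formulas were written with the same standardizer.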

rkingsbury commented 1 week ago

@xiaoxiaozhu123 I was just wondering if you had any additional suggested fixes (besides H4N)? I will try to implement a fix for this soon and I'd like to incorporate any other odd formulas at the same time.

rkingsbury commented 1 week ago

Fixed in v1.0.2, now available via pip. Thanks again for reporting!