bjodah / chempy

⚗ A package useful for chemistry written in Python
BSD 2-Clause "Simplified" License

Balancing reactions with non-integer stoichiometry #211

Open jcumby opened 2 years ago

jcumby commented 2 years ago

Issue

Balancing a reaction using non-integer stoichiometry results in a KeyError: '.'

Example:

chempy.balance_stoichiometry(['NbO2F','FeF3'], ['Nb1.5Fe0.5O3F3'])

results in:

c:\users\jcumby\local documents\custom_python_libs\chempy\chempy\chempy\util\parsing.py in <genexpr>(.0)
    648         infixes = _unicode_infix_mapping
    649     return _formula_to_format(
--> 650         lambda x: "".join(_unicode_sub[str(_)] for _ in x),
    651         lambda x: "".join(_unicode_sup[str(_)] for _ in x),
    652         formula,

KeyError: '.'

Cause

The dict _unicode_sub defined in parsing.py maps every digit to its Unicode subscript but has no entry for a dot, so any formula with a decimal stoichiometric coefficient raises the KeyError.
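The failure can be reproduced without chempy. The sketch below is a stand-in rather than the library's code: DIGIT_SUB and to_subscript are hypothetical names that only mimic a digits-only subscript table being indexed character by character.

```python
# Hypothetical stand-in for a digits-only subscript table (not chempy's actual dict).
# U+2080..U+2089 are the Unicode subscript digits 0..9.
DIGIT_SUB = {str(d): chr(0x2080 + d) for d in range(10)}


def to_subscript(count):
    """Map every character of a stoichiometric count to its subscript form."""
    return "".join(DIGIT_SUB[ch] for ch in str(count))


print(to_subscript(3))  # integer counts only contain digits, so this works

try:
    to_subscript(1.5)  # str(1.5) contains '.', which the table lacks
except KeyError as exc:
    print("KeyError:", exc)
```

Any lookup table built only from digits will fail the same way as soon as a count such as 1.5 or 0.5 is converted to a string.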

Solution

Unicode does not (I think?) have a subscript dot, so I can't find an obvious solution. For now I have added a plain '.' entry to the _unicode_sub dict, which avoids the KeyError but does not give a correct Unicode representation. Hopefully someone more familiar with Unicode has a better suggestion!
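A minimal sketch of that stopgap, assuming a table shaped like the one described above. SUB and format_count are hypothetical names used for illustration, not chempy's API, and the dot is deliberately left as a plain ASCII character.

```python
# Hypothetical illustration of the workaround; names are not chempy's.
SUB = {str(d): chr(0x2080 + d) for d in range(10)}  # '0'..'9' -> subscript digits
SUB["."] = "."  # assumption: no subscript dot is available, so keep the ASCII dot


def format_count(count):
    """Render a stoichiometric count with subscript digits and a plain dot."""
    return "".join(SUB[ch] for ch in str(count))


# Build the product formula from the example, element by element.
parts = [("Nb", 1.5), ("Fe", 0.5), ("O", 3), ("F", 3)]
print("".join(symbol + format_count(n) for symbol, n in parts))
```

The digits are lowered but the dot sits on the baseline, so this only avoids the exception and does not produce a typographically correct subscript.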

jeremyagray commented 2 years ago

This is a known issue with the newly updated parser (https://github.com/bjodah/chempy/issues/207#issuecomment-1073339171), and that comment links to a partial solution. I blame Unicode, because it has symbols for everything else. The only resolution at this time is to settle on a “good enough” symbol to stand in for the subscript and superscript decimal point, but localization is still a problem because not everyone uses the same decimal separator. I haven’t had time to compare the alternatives, and the ones I’ve found were not great. There is also the not-so-small matter of documenting the choice and teaching users to type some special dot as a sub/superscript decimal.

The ASCII, HTML, and LaTeX parsers should all handle this, and the issue is not present in the last release.


