bjodah / chempy

⚗ A package useful for chemistry written in Python
BSD 2-Clause "Simplified" License
556 stars 80 forks source link

Expected type 'bool', got 'None' instead. #184

Open matecsaj opened 4 years ago

matecsaj commented 4 years ago

Code such as the following triggers a warning in PyCharm: Expected type 'bool', got 'None' instead.

b = balance_stoichiometry({'Fe', 'O2'}, {'FeO', 'Fe2O3'}, underdetermined=None)

I don't know if this is truly an issue, mentioning this just in case it is helpful. Closing this issue would not hurt my feelings.

bjodah commented 4 years ago

Interesting, I don't have PyCharm installed myself, but perhaps someone else has a clue. We can leave it open. balance_stoichiometry should perhaps be split up into a couple of different functions (one for each case of underdetermined maybe), it has seen some feature creep over time due to various requests to support different behaviors.

matecsaj commented 4 years ago

Quoting https://docs.python.org/3.8/reference/datamodel.html#objects-values-and-types.

None This type has a single value. There is a single object with this value. This object is accessed through the built-in name None. It is used to signify the absence of a value in many situations, e.g., it is returned from functions that don’t explicitly return anything. Its truth value is false.

Booleans (bool) These represent the truth values False and True. The two objects representing the values False and True are the only Boolean objects. The Boolean type is a subtype of the integer type, and Boolean values behave like the values 0 and 1, respectively, in almost all contexts, the exception being that when converted to a string, the strings "False" or "True" are returned, respectively.

====

Does the above imply that it would be more pythonic for "underdetermined" to expect True and False instead of True and None? Furthermore, in addition to being pythonic, for backwards compatibility, should None be treated as an alias for False?

bjodah commented 4 years ago

Yes, the API is unconventional, initially (if I remember correctly) it only accepted True/False: returning a parametrized symbolic solution for underdeteremined system (True) or just raise an exception (False). Then there was a request to return a "minimal integer solution", and that's where the None case comes from. Going forward, perhaps 3 different functions are a better choice: balance_stoichiometry_unique balance_stoichiometry_symbolic balance_stoichiometry_canonical_integer Or change the expected type of "underdetermined" keyword argument to be an enum, e.g.:

class UnderdeterminedChoice(enum.Enum):
    UNIQUE = enum.auto()
    SYMBOLIC = enum.auto()
    CANONICAL_INTEGER = enum.auto()

or do all 3 at the same time?

Also I find naming to be hard here, it need to be descriptive, yet concise... I'm all ears.

bjodah commented 4 years ago

Also unique is also not quite true, since on may always scale s solution with an arbitrary constant...

matecsaj commented 4 years ago

I like the idea of using enums. It provides room for expansion and helps keep the list of public functions small.

Underdetermined is not a compound word found in an English or Computer dictionary, so PyCharm flags it as a spelling error. I have a foggy 30-year memory from maths classes that underdetermined was the showstopper of having more unknowns than sets of equations, which prevented the unknowns from being calculated. Does underdetermined have special meaning to a chemist? Is there a more intuitive name for the parameter? Perhaps strategy or constraint?

Also, might a user want to, for example, use "unique" and "canonical_integer" at the same time? When I combine the half-reactions for an electrolytic cell to determine the overall reaction, I want to delete common substances on both sides and not have any fractional coefficients. Perhaps the function should accept a set containing zero or more enums that direct details of its behaviour.

While learning more about Stoichiometry today, I discovered different aspects of balancing gases, ionic half-reactions, overall reactions, etc. Are there more options that you anticipate adding and enums that should be reserved for future use?

Since refactoring appears to be on your mind -- per Wikipedia, Stoichiometry is the calculation of reactants and products in chemical reactions in chemistry. Is balance_stoichiometry kind of like saying balance-balance? Would balance_reaction or do_stoichiometry be more apt names for the function?

A project vision and roadmap would provide context to this discussion guide decisions. Do they exist, and if so, where can I find them?

A caveat, I don't know what would be most intuitive for a real chemist; I'm a programmer.

bjodah commented 4 years ago

https://en.wikipedia.org/wiki/Underdetermined_system I'm not a native speaker, but it looks like "underdetermined" is a word?

Yes, too many unknowns, and the keyword dictates how to handle that case (and no, no special meaning to a chemist).

Regarding half reactions for chemical reactions, sure, in standard form they contain 1 electron (and hence one often gets fractional stoichiometric coefficients for other species), you can always recalculate the potential to match any number of electrons, but that's breaking convention.

Regarding naming of balance_stoichiometry: yes, you're right, calc_stoichiometric_coefficients or balance_reaction are probably better names. Perhaps (I can't remember) I wanted something short, and thought the tautology would give a hint on what kind of balancing the function was meant for, but ended up with something poorly named.

As for a roadmap: no, there's none, mostly because this has mainly been a one man show, and I have had limited time (and even more so nowadays) to devote to the project (open source and all...). So the things I need myself, have usually been the things I have given priority. But I have put this on github, not only with the hope of more people benefiting from it, but also to offer a project where effort can be concerted, with the scope being: "Python tools useful for chemistry".

Renaming things are not too bad in a Python library, since it's easy to keep old functions/function names around for backward compatibility. Naming is hard, and I think my command of the English language suffer some from me not being a native speaker. So a consensus here is all I need to be persuaded that a name needs to be changed.

matecsaj commented 4 years ago

Underdetermined is not found in a standard English dictionary, for example, https://www.dictionary.com/misspelling?term=underdetermined&s=t. The link you posted describes it as a compound word used in mathematics. I am a native English speaker of 56-years and assure you that most people don't know it. Since it also means nothing to a chemist, I recommend it change.

I prefer balance_reaction because it is composed of whole words, it is easy to spell correctly, it is very descriptive, and even someone with basic knowledge of chemistry is apt to understand the meaning.

Thank you for generously making the library available and supporting it so well.

Should you wish to have a conversation about encouraging contributors, let's do a FaceTime, Zoom, or Skype call. Your welcome to contact me via matecsaj DOT Gmail DOT com to arrange.

hanpari commented 4 years ago

@bjodah

I do not know which IDE you use, but the same warning would be probably triggered if you try run mypy over your code.

bjodah commented 4 years ago

@matecsaj I hear you, that sounds very reasonable. I've sent you an email.

@hanpari :+1: yes, I should add mypy to my development setup. Last time I gave it a chance (several years ago) it wasn't a very mature project, but things have definitely been moving forward in this area. I have to admit that I'm a bit out of the loop when it comes to latest developments on static checkers for python. If there's an interest in making progress here, I'm open for setting up a project where we list what static checkers should be enforced, perhaps couple this to a milestone (next release of chempy).