opencobra / cobrapy

COBRApy is a package for constraint-based modeling of metabolic networks.
http://opencobra.github.io/cobrapy/
GNU General Public License v2.0
461 stars 216 forks source link

Upgrade code base to pure Python 3.6+ #972

Open Midnighter opened 4 years ago

Midnighter commented 4 years ago

We decided some time ago to drop Python 2.7 and only support 3.6 and later versions. Our legacy codebase offers many opportunities for cleaning up old syntax.


Check List for Upgrading Modules

matthiaskoenig commented 4 years ago

Can we please just say from now on all new code contributed only has to support py3.6 and later! This would make contribution so much easier instead of having always to add legacy things.

synchon commented 4 years ago

I can work on this one. Is it okay if I assign myself?

Midnighter commented 4 years ago

Absolutely, but please don't feel the need to convert everything at the same time. I think one PR per module would be a good strategy. As the @opencobra/cobrapy-core team, we should also strive to update code that we are anyway working on in order to fix a bug or implement a new feature. So I see this as a kind of continuous process.

synchon commented 4 years ago

Yeah iterative upgrade sounds good. I'll start with the modules that are not being actively worked on right now.

Midnighter commented 4 years ago

A check list of things to do as commented by @matthiaskoenig in https://github.com/opencobra/cobrapy/pull/975#issuecomment-663840911:

I would do all the python3.6 changes at once. I.e.

  • [ ] fix imports
  • [ ] remove object from class definition
  • [ ] remove six usage
  • [ ] remove future imports
  • [ ] remove encodings
  • [ ] use fstrings when possible
  • [ ] type annotations

I would like to add here:

Midnighter commented 4 years ago

One point where we should agree on a common style is logging strings. We can either also use f-strings or C-style string formatting. So either

greeting = "Hello"
whom = "World"
logger.debug(f"{greeting} {whom}!")

or

greeting = "Hello"
whom = "World"
logger.debug("%s %s!", greeting, whom)

The advantage of the f-string is that it's easier to understand. The advantage of the old-style is that string interpolation only occurs if the log level is currently visible. So if you log, for example, an LP formulation to debug, the string interpolation will never happen if the log level is info or higher.

cdiener commented 4 years ago

I would also add using contexts where possible, in particular with multiprocessing.Pool. Right now code like FVA or sampling will leave dead processes on errors.

synchon commented 4 years ago

@Midnighter I find f-strings to be useful (for consistency) if we just have a simple interpolation, like so:

needs_logging = "I need to be logged."

logger.debug(f"Here is a log for you: {needs_logging}")

The performance trade-off is not huge and negligible. On the other hand, if we have something like:

needs_logging = func_to_log()
# log at INFO
logger.debug(f"Here is a log for you: {needs_logging}")

The performance hit would not be negligible if func_to_log() requires some decent computation time and log is only logged at level INFO.

Looking at the code base, I don't find any place where old format style would be beneficial and thus I vote for f-strings.

synchon commented 4 years ago

@cdiener I have improvements for multiprocessing.Pool lying around in my local for quite some time now. Now, that we are moving to Python 3.6+, I'll push those.

matthiaskoenig commented 4 years ago

I clearly prefer f-strings.

Could you give a bit more details about the multiprocessing.Pool issue. Not clear to me what the suggested solution/patterns are here.

On Mon, Jul 27, 2020 at 9:43 AM Synchon Mandal notifications@github.com wrote:

@cdiener https://github.com/cdiener I have improvements for multiprocessing.Pool lying around in my local for quite some time now. Now, that we are moving to Python 3.6+, I'll push those.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/opencobra/cobrapy/issues/972#issuecomment-664177335, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAG33OTWXJC3NRD4D4WHBFLR5UVZPANCNFSM4O67WODQ .

-- Matthias König, PhD. Junior Group Leader LiSyM - Systems Medicine of the Liver Humboldt Universität zu Berlin, Institute of Biology, Institute for Theoretical Biology https://livermetabolism.com konigmatt@googlemail.com https://twitter.com/konigmatt https://github.com/matthiaskoenig Tel: +49 30 2093 98435

cdiener commented 4 years ago

The pool can be used in a with block in Python 3 which will cleanly remove processes even on errors or Keyboard interrupt. This was not possible in Python 2 and could lead to dangling processes (will not free memory sometimes and may raise other issues). Also we should set maxtasksperchild to avoid the memory leak in the optlang glpk solver.

Midnighter commented 4 years ago

I agree, f-strings it is for logging, too. I will make an updated list with all the tasks at the top of this issue so it is clear what needs to be done without reading the whole thread.