switch-model / switch

A Modern Platform for Planning High-Renewable Power Systems
http://switch-model.org/

Version 2.0.4 (Python 3 compatibility) #114

Closed mfripp closed 5 years ago

mfripp commented 5 years ago

The code doesn't run in Python 3 yet, but this gives a place to discuss updates being done to make Switch compatible with Python 3 while retaining compatibility with Python 2.

(Update: now it does run in Python 3; test away!)

mfripp commented 5 years ago

OK, this now works with Python 3.7 (and probably earlier). I still have a short list of things to do before releasing it, but this should be ready for testing, so I encourage you to give it a good workout before we release it.

For what it's worth, here's what else is on the to-do list before releasing it:

rodrigomha commented 5 years ago

OK, this now works with Python 3.7 (and probably earlier). I still have a short list of things to do before releasing it, but this should be ready for testing, so I encourage you to give it a good workout before we release it.

For what it's worth, here's what else is on the to-do list before releasing it:

  • compare results between Switch 2.0.1 on Python 2 and 2.0.4 on Python 3 with a big model
  • possibly rename hydro variable end instead of next
  • update CHANGELOG.txt
  • figure out how to require rpy2<2.9.0 on Python 2 and any version of rpy2 on Python 3
  • write upgrade script

    • check whether fuel_cost.tab and gen_inc_heat_rates.tab used to require LF line endings on Windows, but now require CRLF; if so, fix these in an upgrade script (see commit 48217a)
    • report changes to local_td cost calculations
    • report changes in line endings for input and output from many modules
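One possible way to handle the rpy2 item above is with PEP 508 environment markers in the requirement strings. This is only a sketch: the list name is hypothetical, it is not taken from Switch's actual setup.py, and it assumes reasonably recent versions of setuptools and pip that understand markers.

```python
# Hypothetical requirement strings for setup.py (e.g., inside extras_require).
# The text after ";" is a PEP 508 environment marker that pip evaluates
# against the interpreter doing the install.
RPY2_REQUIREMENTS = [
    'rpy2<2.9.0; python_version < "3"',  # Python 2: cap the rpy2 version
    'rpy2; python_version >= "3"',       # Python 3: accept any rpy2 version
]

# Split each string into (version specifier, marker) just to show the structure.
parsed = [tuple(part.strip() for part in req.split(";")) for req in RPY2_REQUIREMENTS]
for spec, marker in parsed:
    print("{:<12} applies when {}".format(spec, marker))
```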

I'm planning to run a WECC model in both Python 2 and Python 3, using Gurobi for both. The examples are giving the same results, right?

mfripp commented 5 years ago

I know the 2.0.4 branch gives the same total cost on the examples under Python 2.7 and 3.7, because run_tests.py succeeds on both, and it compares the total costs to reference values. However, I haven't checked deeper to see if there are other differences.

The costs under 2.0.4 will be different from 2.0.3 or earlier if you are using the local_td module, because the carrying cost of Legacy local T&D was not included in the objective function in 2.0.3 and earlier. However, decisions should be the same in both versions, because the Legacy capacity was included in the available-capacity calculations. I haven't compared decisions across versions though.

mfripp commented 5 years ago

I found some surprisingly big differences in the investment plan between two big models that I ran in Switch 2.0.1 under Python 2 and Switch 2.0.4 under Python 3. But they were both set to stop at 1% optimality gap and their objectives were within 0.1% of each other. So maybe the solution space is just more flat-bottomed than I expected. I'll try again with a smaller model, solving to very tight optimality.
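As a rough check on the arithmetic, the sketch below bounds how far apart two minimization objectives can be when both runs stop at the same relative gap. It assumes the gap is defined as (objective - bound) / objective and that the true optimum is positive; the objective values are made up for illustration.

```python
def relative_difference(a, b):
    """Difference between two objective values as a fraction of the smaller one."""
    return abs(a - b) / min(abs(a), abs(b))


def max_spread_for_gap(gap):
    """Largest relative difference possible between two minimization
    objectives that each satisfy (objective - bound) / objective <= gap.

    Each objective then lies between the true optimum z and z / (1 - gap),
    so the spread relative to the smaller one is at most gap / (1 - gap).
    """
    return gap / (1.0 - gap)


# Made-up objective values, for illustration only.
objective_run_1 = 1000.0
objective_run_2 = 1000.9

observed = relative_difference(objective_run_1, objective_run_2)
allowed = max_spread_for_gap(0.01)
print("observed spread: {:.4%}, allowed by a 1% gap: {:.4%}".format(observed, allowed))
```

Under these assumptions, objectives within 0.1% of each other are well inside what a 1% gap permits, so differing investment plans do not by themselves indicate a bug.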

I'm also trying to pin down some dependency issues, which I don't think are Python 3 specific. I've found that pip install pyomo (which is effectively run when you use pip to install Switch) also installs nose and nose tries to put man pages in the data location. Even under Anaconda (on a Mac), this ends up being /usr/local, so the install fails, sort of. I'll see if pip install --user pyomo fixes that. It would be annoying to have to specify that users should do that (or maybe conda install pyomo) before pip install --editable . but may be necessary. Or maybe pip install --editable --user . is possible.

mfripp commented 5 years ago

I figured out a little more about the nose installation issue. For what it's worth, the message you get when you use pip to install switch-model, pyomo or nose (on a Mac, even with Anaconda) is this:

ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/man/man1/nosetests.1' Consider using the --user option or check the permissions.

This arises because setup.py for nose uses a data_files location to copy a man page into the system data directory: distutils.core.setup(..., data_files = [('man/man1', ['nosetests.1'])], ...).

In the Mac system Python installation, the data_files location (set by the operating system) points to an unwritable system directory, which is why pip doesn't work with the system Python (as I noted here). But even on Anaconda, the data directory is in a location that is not normally user-writeable, i.e., /usr/local. Running pip install --user nose doesn't fix this either: the data location is still /usr/local.
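To see which data directory a given interpreter is configured with, a quick check like the following may help. It is a sketch using the standard-library sysconfig module (Python 3 syntax); it reports the interpreter's default install scheme, which may not match the location pip actually used in the failing install described above.

```python
import os
import sys
import sysconfig

# "data" is the base directory for data files in the default install scheme.
data_dir = sysconfig.get_path("data")

# Where a relative data_files entry like ('man/man1', ['nosetests.1']) would land.
man_page = os.path.join(data_dir, "man", "man1", "nosetests.1")

# True if the current user can write to the data directory.
writable = os.access(data_dir, os.W_OK)

print("interpreter prefix: {}".format(sys.prefix))
print("data directory:     {}".format(data_dir))
print("man page target:    {}".format(man_page))
print("writable by user:   {}".format(writable))
```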

You can fix this by any of these approaches:

  1. making /usr/local world-writeable (the approach taken by homebrew, so maybe not too bad), or
  2. running this: pip install --install-option "--install-data=$CONDA_PREFIX/share" nose before installing Switch, or
  3. running this: pip install --install-option "--install-data=$CONDA_PREFIX/share" . (in the Switch directory), or
  4. running conda install -c conda-forge pyomo before installing Switch

None of these are ideal, but option 4 is probably the best.

For one, they all require an extra step (even option 3, since you de facto have to install pandas before you run it). Further, options 1-3 are platform-specific. Option 1 also requires changes that some may not be willing or able to make. Option 2 works pretty well: man nosetests is even able to show the man page, and pip uninstall finds and removes it. However, that approach deactivates wheel-based installation, and it's messy. Option 3 looks like a one-step install, but it disables wheel-based installation for all dependencies, so installation then fails for numpy (required by pandas, required by Switch) because there's not a complete C toolchain. You then have to run conda install pandas before option 3, so it's not really simpler. Option 4 also needs an extra step (although users with a working pip can use pip instead of conda, or just skip the extra step), but at least it is more general. (Users don't get the man page for nosetests that way, but I'm sure we could live with that!)

I'm surprised Anaconda doesn't report the right install-data location to pip-based installers, but I can't see an easy way to get them to do that. Ugh.

mfripp commented 5 years ago

Actually now I'm thinking we should head toward 3 installation options:

  1. For people who just want a simple installation:

    a. conda install -c conda-forge switch-model, or
    b. pip install [--user] switch-model (for users who prefer pip and have it working).

  2. For people who want access to the source code:

    c. use conda or pip to install dependencies, then clone Switch into a local folder and run pip install --editable . or maybe python setup.py <something>.

The key points are that this avoids using pip to install nose (indirectly) in an anaconda environment (since that's semi-broken), it keeps things simple for users who don't want to see the source code, and it is a logical sequence for users who do want to see the source code (or examples or tests).

This fits pretty well with the official Anaconda advice about using pip in a conda environment too.

mfripp commented 5 years ago

I just ran find switch_model -name '*.py' -exec futurize --stage2 {} + and it looks like we still have a ways to go to get good Python 2/3 compatibility (e.g., I didn't correct for the assumption that all dict.items() calls return lists). I'm actually surprised it's running as well as it is, given how many things this turns up.

So you may need to hold off on detailed testing for a little longer.

mfripp commented 5 years ago

OK, now I've addressed all the issues identified by futurize --stage2 (see commit de8124d). Most of them turned out to be pretty minor, and I'm not sure if any of the changes actually prevent any unexpected behavior.

Note: we should keep in mind from now on that dict.iteritems(), dict.iterkeys() and dict.itervalues() are not available in Python 3 (although the equivalent methods on Pyomo components may be), and that dict.items(), dict.keys(), dict.values(), map(), zip(), etc., may return either lists (Python 2) or lazy views and iterators (Python 3).
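A minimal sketch of patterns that behave the same under Python 2.7 and 3.x, following the note above. The dictionary and its values are made up for illustration and are not taken from the Switch code base.

```python
# Made-up data for illustration.
costs = {"wind": 30.0, "solar": 25.0, "gas": 60.0}

# Plain iteration over .items() works in both versions
# (a list in Python 2, a lazy view in Python 3).
total = 0.0
for tech, cost in costs.items():
    total += cost

# Wrap in list() whenever the result is indexed, measured, or reused.
keys = list(costs.keys())
first = sorted(keys)[0]

# map() and zip() are lazy in Python 3, so materialize them before reuse.
doubled = list(zip(keys, map(lambda k: costs[k] * 2, keys)))

# Take a snapshot of the keys before deleting entries while looping;
# mutating a dict during iteration over a live view raises an error in Python 3.
for tech in list(costs.keys()):
    if costs[tech] > 50:
        del costs[tech]

print(total, first, len(doubled), sorted(costs))
```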

mfripp commented 5 years ago

I tried doing comparisons with my "big" model between Switch 2.0.3 on Python 2.7 and Switch 2.0.4 on Python 3.7 (both using Pyomo 5.6.2). However, I'm finding these take too long to solve to a very small mipgap, in order to make a fair comparison. So I took a different approach instead, and carefully compared those environments using a smaller model with the same components.

The biggest problem with comparing across versions is that expressions like '{}'.format(1234567890.12345678901234568) end up with different numbers of digits in Python 2 and 3 (a few more digits in Python 3). That makes it pretty impossible to compare the output from m.pprint() across the versions. So instead I compared the .nl file sent to the solver and the decisions and objective returned by the solver (in .tab and .txt files). The numerical precision in these files doesn't seem to vary across Python versions.
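A small sketch of the formatting issue: the default '{}' conversion depends on the Python version, while an explicit precision should give the same text in both. The choice of 17 significant digits is an assumption made here (it is enough to round-trip a double), not something Switch or Pyomo is known to use.

```python
x = 1234567890.12345678901234568

# Default formatting: the number of digits shown depends on the Python version.
default_text = "{}".format(x)

# Explicit precision: 17 significant digits, independent of the version's default.
fixed_text = "{:.17g}".format(x)

# 17 significant digits are enough to recover the same double exactly.
round_trips = float(fixed_text) == x

print("default: ", default_text)
print("explicit:", fixed_text)
print("round-trips:", round_trips)
```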

I found that these were all identical, except for total_cost.txt (more digits in Python 3) and the index keys for fuel heat rate curves (these use floats, which were output differently in the two versions).

So I'm willing to go ahead and release this as 2.0.4 whenever you all are.

mfripp commented 5 years ago

A few months ago, I added "module messages" to the upgrade scripts, to report changes in the behavior of any modules when updating from one version to another. Normally it would be a good idea to use those to report that we now use CRLF line endings for output files on Windows and that the objective function now includes the cost of legacy local T&D (so users shouldn't worry if costs go up vs. 2.0.3). However, other than showing these messages, there is no need for a data upgrade script for 2.0.4. So it seems weird to invoke the whole data upgrade apparatus just to show those messages and change the switch_inputs_version.txt to "2.0.4".

I'm inclined just to mention these changes in CHANGELOG.txt and leave it at that. But maybe it's a better principle to always report these changes and always upgrade the version number of the data dirs, even if that's the only thing that changes?

What do you all think?

(OK, after further consideration I added the "upgrade" script after all, just to be thorough. This way we know users will hear about changes in model behavior, and their switch_inputs_version.txt will generally reflect the latest version of Switch they used with it.)
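For illustration only, a version check against switch_inputs_version.txt could look something like the sketch below. The function names and the tuple-based comparison are hypothetical and are not Switch's actual upgrade code; only the file name comes from the discussion above.

```python
import os
import tempfile


def parse_version(text):
    """Turn a string like '2.0.3' into a tuple of ints, e.g. (2, 0, 3)."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(inputs_dir, target_version="2.0.4"):
    """Return True if the inputs directory is marked with an older version.

    Hypothetical helper: reads switch_inputs_version.txt from inputs_dir.
    """
    path = os.path.join(inputs_dir, "switch_inputs_version.txt")
    with open(path) as f:
        current = f.read()
    return parse_version(current) < parse_version(target_version)


# Demo with a throwaway directory.
demo_dir = tempfile.mkdtemp()
version_file = os.path.join(demo_dir, "switch_inputs_version.txt")

with open(version_file, "w") as f:
    f.write("2.0.3\n")
before = needs_upgrade(demo_dir)

with open(version_file, "w") as f:
    f.write("2.0.4\n")
after = needs_upgrade(demo_dir)

print(before, after)
```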

rodrigomha commented 5 years ago

Hi, I finished running some WECC cases (linear programs with no local_td) using 2.0.3 in Python 2 and 2.0.4 in Python 3. As Matthias mentioned, there is a minor difference in the last digit of the total cost, most likely due to the difference in the number of digits printed. The difference is 1 dollar out of a total cost of 839 billion dollars.

I also did a quick random comparison of the dispatch, and there seem to be minor differences at some hours (e.g., differences of 30 MW in a dispatch of 6000 MW), but most of the generation dispatch coincides. I suspect these minor differences arise either because there are multiple solutions with the same total cost, or because of the tolerances used in the interior point method and the subsequent crossover performed by Gurobi.
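A comparison like this can be made systematic with a tolerance-based check. The sketch below is one possible approach, not the method used in this thread; the function name, the tolerances, and the dispatch values (keyed by generator and timepoint) are all made up for illustration.

```python
def dispatch_differences(run_a, run_b, rel_tol=0.001, abs_tol=1e-6):
    """Return entries whose dispatch differs by more than the tolerances.

    run_a and run_b map (generator, timepoint) keys to dispatch in MW.
    Keys missing from one run are treated as zero dispatch.
    """
    diffs = {}
    for key in sorted(set(run_a) | set(run_b)):
        a = run_a.get(key, 0.0)
        b = run_b.get(key, 0.0)
        scale = max(abs(a), abs(b))
        if abs(a - b) > max(rel_tol * scale, abs_tol):
            # Store both values and the relative difference.
            diffs[key] = (a, b, abs(a - b) / scale)
    return diffs


# Made-up dispatch results from two runs.
run_py2 = {("gen_A", 1): 6000.0, ("gen_A", 2): 5500.0, ("gen_B", 1): 1200.0}
run_py3 = {("gen_A", 1): 5970.0, ("gen_A", 2): 5500.0, ("gen_B", 1): 1230.0}

diffs = dispatch_differences(run_py2, run_py3)
for key, (a, b, rel) in sorted(diffs.items()):
    print("{}: {:.1f} MW vs {:.1f} MW ({:.2%})".format(key, a, b, rel))
```

In this made-up data the 30 MW that gen_A gives up at timepoint 1 is picked up by gen_B, which is the kind of offsetting shift that leaves total cost unchanged when the two generators have the same marginal cost.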

Given that the total cost is essentially the same, and the solution given is mostly the same, I'm also willing to go ahead and release this 2.0.4 version.

Amazing work Matthias!

mfripp commented 5 years ago

Thanks! I'll go ahead and release this soon then. I'm currently working on getting conda-forge integration working for switch_model too (had to create a 2.0.3.1 version with different distribution settings for that). I'll release this on pypi and conda-forge as soon as I have that working.

josiahjohnston commented 5 years ago

Looks like I'm late to the party. Your work & review looks good to me.

Most of the datasets I've worked on have pretty flat-bottomed solution spaces. The explanations of optimality gap and minor differences in decision variables leading to identical (or nearly identical) objective values make sense to me.

The nose install issue sounds annoying. Your recommended approaches of conda or pip install --user ... seem reasonable for casual users. Doing a sudo pip install... also seems reasonable for installing system-wide in write-restricted directories.

For Macs, I strongly recommend avoiding the built-in python if at all possible, and installing a sane version of python with either Conda or Homebrew. The Mac default python has a long track record of major issues (including obsolete libraries with security vulnerabilities installed in magic system directories that can't be upgraded, even with sudo), and I gave up on it completely after a few years of trying to come up with workarounds.

I'd recommend that advanced users and professional developers use either conda environments (for GUI-inclined folks) or virtual environments.

Moving forward, I'll test under python 3.7 before posting pull requests.

Do we want to maintain support for python 2 till the end of the year when it stops being supported?

mfripp commented 5 years ago

I don't know if I mentioned this before, but the Switch 2.0 paper is out, at https://doi.org/10.1016/j.softx.2019.100251. That's why I'm rushing out these improvements in Switch, so I can announce it and give people a good experience when they come to download it. Speaking of which, Switch 2.0.4 (the Python 2/3 version) is now available via pip and conda. It's now possible to install this version and all its dependencies just with conda install -c conda-forge switch_model. I think you can also get everything except glpk via pip install switch_model.

The problem with nose is pretty bad -- it's not actually possible to install nose in a conda environment using pip, for nearly the same reason that you can't install it (or ipython) on a Mac's system Python using pip. For some reason, the Mac native Python and anaconda both fail to set a writeable location for data_files referenced in setup.py (typically man pages). I once developed an elaborate and complete workaround to allow everyday work in the Mac system Python, described at https://apple.stackexchange.com/a/223163/143849 . But that was overly complex and I just recommend Anaconda to everyone now. For our problem with nose under conda, the best solution was to install dependencies first using conda, then install Switch using pip. But that's a little convoluted, which is why I wanted to get conda to install Switch itself (which it now does).

About Python 2 support: I think we should continue at least to the end of the year, and maybe till the middle of next year. It would be poor form to tell our users "no Python 3" for so long and then quickly switch to "only Python 3".

josiahjohnston commented 5 years ago

A quick push for python3 support makes sense given the paper being out. I'm loving this rapid & regular release process.

Supporting python2 till at least the end of the year seems good. Past that point, I'm not personally motivated to keep up python2 support if it involves much extra work or having to refrain from using useful python3 syntax. But a concrete use case could probably persuade me otherwise.

Once we officially drop python 2 support, we should do a minor version bump.