Packaging - Githubissues

ocefpaf commented 8 years ago

This PR supersedes #1

@jamesorr This PR does not merge all the Fortran sources into one file. It does treat them separately to build the extension. However, I had to use an older and modified version of gasx.f90 because of some f2py does not support some features used. (See https://github.com/numpy/numpy/issues/4645)

I don't recommend merging with this workaround, but my Fortran is rusty to fix this properly.

Ping @emiliom.

PS0: I separated the commits to make it easy to review, but I can squash it before merging. PS1: Can you enable Travis-CI for this repository?

emiliom commented 8 years ago

James: @ocefpaf (Filipe) and I are restarting the effort to improve the packaging of mocsy in Python, after a few months.

I'll be at the GOA-ON Science workshop at Hobart in two weeks. I saw that in the draft agenda you're scheduled to give a presentation on carbonate system calculations, just before my presentation on the GOA-ON data portal we're developing. It'll be great to have some time to talk then. And hopefully we'll have made more progress by then with @ocefpaf's new pull request! (Thanks, Filipe!)

For reference, see our previous discussions from last November at #2 and issues linked there.

jamesorr commented 8 years ago

Emilio,

It would be great to discuss the changes that Felipe and you are proposing in this pull request at the GOA-ON meeting in Hobart. I like most of them. For others, we need to talk. For instance, I would prefer to make all the routines in the gasx module compliant with f2py rather than commenting them out.

There are also new features (e.g., error propagation, derivatives, and new routines for model comparison) that have yet to be added to the main branch. They should be included though in May.

Unfortunately, I won't be able to do anything about this pull request until mid-May, when I return to Paris. I leave for Hobart in 2 days to attend the preceding meeting (4th symposium on the Ocean in a High-CO2 World),

See you in Tasmania!

Jim

ocefpaf commented 8 years ago

For instance, I would prefer to make all the routines in the gasx module compliant with f2py rather than commenting them out.

@jamesorr By no means I want those workarounds to be merged! I just added them to "make things work." My knowledge of Fortran is limited, but I am trying to fix it.

BTW I already got o2sato back by adding this line. (Sorry if it is hard to follow the changes due to the noise in the white space removal. I will set my text editor to leave them alone next.)

jamesorr commented 8 years ago

No worries about the Fortran. I can make the necessary changes as long as it is very clear to me what you are after. Thanks

ocefpaf commented 8 years ago

I can make the necessary changes as long as it is very clear to me what you are after.

I want to leave the Fortran code as pristine as possible. I even thought about creating another project for the python wrapper that would use this repository as a submodule to avoid making changes here. However, I gave up that idea because I thought that you might want to keep the python wrapper and the Fortran code in the same place.

I believe that the problem we are facing is this one. That is an old limitation in f2py and I am not sure there will be a fix for it anytime soon. Some other functions/routines are commented out just because they depended on the offending one. (If my debugging is correct the offending one is pistonvel.)

jamesorr commented 8 years ago

@ocefpaf: It's a good idea to keep things in one place. I will check into the Fortran issues as soon as I have a chance but that may have to wait for 2 weeks because I'll be away from the lab at 3 big meetings. Thanks for your efforts on this.

ocefpaf commented 8 years ago

but that may have to wait for 2 weeks because I'll be away from the lab at 3 big meetings.

No worries.

Thanks for your efforts on this.

Thanks @emiliom for pushing this.

emiliom commented 8 years ago

Thanks, James. Let's talk at Hobart. I arrive at noon on the 6th and leave very early on the 11th.

I completely agree that the goal is to keep everything in one place, in the mocsy source repository (James').

@ocefpaf: in the meantime, is it possible to create a conda package in a "test channel" off of your mocsy fork's "packaging" branch? If that's doable (I have no idea if you have some kind of test anaconda channel available), that would let me start easily testing general features of what you've done, without impacting anything else. It would also allow me to show that conda package and your wrapping "in action" when I meet with @jamesorr in two weeks. So: no rush, but if it's something you can do w/o too much effort by mid-late next week, that'd be great. If you don't have a such test environment, no worries.

ocefpaf commented 8 years ago

I completely agree that the goal is to keep everything in one place, in the mocsy source repository (James').

Cool. We are all in the same page here.

@ocefpaf: in the meantime, is it possible to create a conda package in a "test channel" off of your mocsy fork's "packaging" branch?

We already have a version in the IOOS channel based on #1. I will update that to this PR early tomorrow.

emiliom commented 8 years ago

We already have a version in the IOOS channel based on #1. I will update that to this PR early tomorrow.

That sounds good. Thanks.

ocefpaf commented 8 years ago

@emiliom the binaries for this PR are available at: http://anaconda.org/IOOS/mocsy/files

PS: Let me know the results if you test them.

emiliom commented 8 years ago

Thanks, @ocefpaf! I'll examine it on friday or the following couple of days.

emiliom commented 8 years ago

I can report back that Filipe's new approach works, and it accomplishes one of the main goals I had: to bring the FORTRAN subroutine documentation strings into Python function docstrings! This is fantastic, as it makes the Python package a lot easier to use by removes the need to be consulting the doc strings on the FORTRAN code.

Of course, another major goal of Filipe's approach was to leave the fortran files as-is (previously he was combining them). That's taken care of.

I re-ran the Jupyter notebook I wrote last November using the new mocsy package. I checked results only quickly and visually, but everything ran, and there didn't seem to be any changes.

So, this is very promising, and a great starting point for James and I when we meet in a week and discuss remaining problems (eg, gasx) and future directions. Thanks @ocefpaf!

ocefpaf commented 8 years ago

@jamesorr indeed the failures here were not related to https://github.com/numpy/numpy/issues/4645, it seems that just a few adjusts to the Fortran code did the trick. However, I do not trust my fixes here! My Fortran is ~20 years old and I never really used it.

I will make a few comments on the code to make it easier to follow.

emiliom commented 8 years ago

@ocefpaf, thanks for this progress!

Just a reminder not to expect any feedback from @jamesorr, or me, until he and I meet in about a week.

ocefpaf commented 8 years ago

Just a reminder not to expect any feedback from @jamesorr, or me, until he and I meet in about a week.

I know. I was just frustrate with my Fortran and decided to take a second crack at it. Hopefully @jamesorr will find something useful in there. (At least it compiles :wink:)

jamesorr commented 8 years ago

Most of the fixes to the code have been implemented in the new intermediate version of mocsy on github (v2.2). A mocsy branch on error propagation and buffer factors has been developed and will be put up on github next week.

Because that branch has many changes, I am closing this PR for now. But I hope we can get back to discussing better packaging for mocsy and to optimizing its use in python.

ocefpaf commented 8 years ago

The boilerplate to package was sent to mocsy twice now and both times the code base changed and the changes were not merged. I probably won't have time to sent it a third but the basic code is in this PR and, if someone wants to pick it up and try it again, I am available to help.

jamesorr commented 8 years ago

@ocefpaf Thanks so much for all your efforts to advance this issue. Unfortunately, I had to close this PR because my local code has evolved much since the PR was first submitted. It was just much easier for me to do so. I don't know how to selectively merge (not keep all changes).

Anyway, just after closing the PR, I reorganized the package structure exactly along the lines you were requesting. All your same boilerplate additions are now included and are identical to what you provided. Please take a look!

Over the newt few days, I'll be making other additions to the code to do error propagation, but it will then reach a stable state. I really hope we can continue pursue this effort together to make it a conda package on the IOOS channel!

ocefpaf commented 8 years ago

Unfortunately, I had to close this PR because my local code has evolved much since the PR was first submitted.

No problem. I understand that. However, I recommend you to commit more often to keep the local dev version in sync with the remote dev version to improve collaborations.

I don't know how to selectively merge (not keep all changes).

I split all the commits to try to make it easy for an eventual cherry-pick. But do what is easiest as I understand that git is a necessary evil. Very complicated and cumbersome sometimes to do simple operations.

Please take a look!

Will do soon.

I really hope we can continue pursue this effort together to make it a conda package on the IOOS channel!

Sure. But note that the IOOS channels is deprecated and all the packages moved to conda-forge. You can read more about conda-forge in this blog post, but we are not keeping up with its growth and some info might be outdated.

jamesorr commented 8 years ago

@ocefpaf That's excellent advice. The remote dev now includes all recent local developments, namely error propagation and partial derivative calculations (buffer factors). Sorry it took so long to bring this up from the local dev, but it represents months of collaborative development. From now on, I'll do better job at keeping the remote current. Another thing the remote dev now includes are all the boilerplate files that you provided. And the overall structure is exactly what you recommended.

I'd be been keen on moving forward to make mocsy installable via conda-forge as soon as possible. It seems all is now in place but I've never done this before and your help would be critical.

One kludge that I had to do to be able to do a 'import mocsy' was to move the mocsy/init.py file to a 'mocsy/tmp' directory. Otherwise it was trying to read that file and it didn't seem to like line 3:

from ._mocsy import (

Do you know why this might be?

ocefpaf commented 8 years ago

I'd be been keen on moving forward to make mocsy installable via conda-forge as soon as possible.

As soon as we get a successful build we can do that. The packaging in conda-forge is fairly easy.

One kludge that I had to do to be able to do a 'import mocsy' was to move the mocsy/init.py file to a 'mocsy/tmp' directory.

Not sure what is going on but if I had to guess you are testing the import by just building the extension, correct? The __init__.py should be in the root of the package otherwise the installation will not work. I just tested (from this PR):

python setup.py sdist  # Creates a source distribution.
cd dist && pip install mocsy-2.0.1.tar.gz

and everything worked. Tests, example, and the notebook.

However, the build is failing b/c of the changes. There are new Fortran sources that must be added here in the proper build order.

In addition to that some of the new (and old) Fortran source is hitting https://github.com/numpy/numpy/issues/4645 again. Because the error mensagem is not informative debugging that is quite painful and time consuming: (1) commenting the source to figure out which one is breaking, (2) edit the source to fix it for f2py, repeat (1) and (2) until all are found. Someone with more Fortran knowledge than me can probably do this quicker. Another alternative is to no use f2py, but there is no "standard" besides f2py out there. (Or maybe a more permissive compiler than gcc, but then packaging on conda-forge will not be possible.)

PS: You need to renamed the file travis.yml back to .travis.yml if you want to activate continuous integration testing. Travis-CI expects a file named .travis.yml. You also need to enable testing in https://travis-ci.org/. Here is an example in my account where, as you can see, the new changes are not building:

https://travis-ci.org/ocefpaf/mocsy/builds

emiliom commented 8 years ago

@jamesorr and @ocefpaf, great to see all this progress on mocsy and its conda packaging!

jamesorr commented 8 years ago

I've produced a new version v2.3.2, now also on github. As suggested by @ocefpaf, the travis.yml has been renamed to .travis.yml and expanded to include the 5 missing source modules. I also expanded setup.py in the same way.

After moving './mocsy/init.py' to '.', I was indeed able to run python and then do an 'import mocsy' without any problem. But the same file needs to be put back into the ./mocsy subdirectory for

python setup.py sdist

to start up correctly. But even with that, the build fails as you mentioned previously.

When I have a little more time I will see if I can get that build to work without the gasx module, and if so gradually bring in the routines in that module to see where the problems lie. It seems strange to me to think that the fortran code is the problem because the mocsy.so package does build with my Makefile (which invokes f2py), and the routines work after importing that into python.

jamesorr commented 8 years ago

Even after removing gasx.f90 from setup.py and __init__.py and running

python setup.py sdist

the build fails. The final output from that command says:

KeyError: 'void'

I had the same result when I tried to greatly simplify the gasx module to include only 1 routine (SUBROUTINE pCO2atm2xCO2) while keeping gasx.f90 in the 2 routines mentioned above. This seems to suggest there is another problem elsewhere.

Any ideas?

ocefpaf commented 8 years ago

It seems strange to me to think that the fortran code is the problem because the mocsy.so package does build with my Makefile (which invokes f2py), and the routines work after importing that into python.

The problem is how f2py, when called to build the extension, enforces some rules that the makefile does not. I am not sure if it those flags you use in the makefile can be injected in the setup.py call. The answer is probably yes, but I never did it that way. I always ended up "fixing" the Fortran code instead.

This seems to suggest there is another problem elsewhere.

When I tried with the new code the code choke on DNAD.f90. Since everything downstream uses that I gave up.

BTW, for a conda package, we can try to use the makefile b/c conda allows us some breathing room for hacking the creation of the package. But that won't be pip installable though.

jamesorr / mocsy

Packaging #3