xsd-fu: Upgrade Genshi from 0.7.3 to 0.7.5

rleigh-codelibre commented 3 years ago

With https://github.com/edgewall/genshi/pull/35 having been merged and version 0.7.5 having been released as a followup (https://github.com/edgewall/genshi/releases/tag/0.7.5) we now have a public Genshi release which can support Python 3.9.

This PR updates the embedded Genshi copy from 0.7.3 to 0.7.5.

I have attached (gen.tar.gz), a copy of the generated sources for the existing master branch with Python 3.8, and this PR with Python 3.8 and Python 3.9. The existing master branch with Python 3.9 is omitted since it is broken (which is what this PR is intended to rectify). As you will be able to see for yourselves, there are no code generation differences between the different Genshi and Python versions.

Closes: #126

An alternative to this PR would be to drop the embedded Genshi entirely, and rely on pip to install the latest version. However, I would caution that we have had repeated breakage in the past, which is why we decided to embed it, so that we could fix things locally as needed (and we have needed and used this ability several times). The Python 3.9 breakage is a case in point.

sbesson commented 3 years ago

@rleigh-codelibre we just migrated from Travis CI to GitHub actions. This also means Python 3.9 can be added to the test. Could you merge origin/master into this branch or rebase and update https://github.com/ome/ome-model/blob/master/.github/workflows/maven.yml#L16 to also include Python 3.9?

Looking briefly at the breakages, the driver for all the recent Genshi changes has been to add support for new minor versions of Python 3.x. I think we agree the expectation is that the ome-model component builds only on a range of supported versions and that newer versions will usually require a library upgrade. This might be something worth adding to the README and/or the documentation.

I am still unclear about the advantage of embedding a copy of the upstream source code as opposed to managing Genshi as a Python dependency, pinning/capping as necessary. Is the main advantage the testing of fixes to be submitted upstream in the context of this downstream library?

rleigh-codelibre commented 3 years ago

@sbesson Rebased and updated to include Python 3.9. Do you want to drop 3.6 and/or 3.7 to keep the job count manageable? 28 is a little high.

Regarding the history of Genshi, its embedding predates Python 3 support by three years. If I recall correctly, we had problems with that as well when we moved to Python 2.7. Genshi seemed to lack active maintenance for a long period and we included it out of strict necessity. @joshmoore originally introduced Genshi 0.7 in 262b7c7b4e004ebb5e0865eca270db405f7704eb back in 2015. The Python 3 fixes were pulled in by me starting with 262b7c7b4e004ebb5e0865eca270db405f7704eb from 2018 onward. Genshi maintenance has picked up a bit more in the last year or so with the switch to GitHub, so it may be that the considerations which necessitated the original embedding and later updates are no longer necessary.

I would certainly be happy to drop it if 0.7.5 works upon all the supported platforms using the PyPI/pip-installed build. Could we investigate that as a followup? If I drop it and add it to the requirements, we can assess the impact upon the CI (which will now include Python 3.9).

rleigh-codelibre commented 3 years ago

@sbesson I have rebased all of my other open pull requests to retest with GitHub actions. All are green. The switch is a good move; Travis CI has been suboptimal for a good while at this point.

rleigh-codelibre commented 3 years ago

@sbesson Regarding CI, I think the current approach is testing a lot of different combinations, but is not as effective as it could be in testing what is important. For example, the important part for the code generation is ensuring that the generated code doesn't differ between python versions (for consistency) and compared with the previous commit (to highlight generated source changes for review). But neither are tested. If you check both of these separately, and fail if there are changes between Python versions, and warn if there is a change compared with the previous commit, then you can build with a single Python version since you know that the code you are building is identical, and you'll be notified there is something additional to review if there are changes against the previous commit.

The above would reduce the job count to n Python jobs and m Java jobs which will test effectively, rather than n × m combinations which aren't giving the same coverage. (Not including the Python module, which is already tested separately.)

You could copy the approach taken here with the script_unix_java_gensrc block, and the diffsrc block below, and do that for each supported Python version. Here it's done with Gradle, but the Maven approach is the same. Now you're on GitHub Actions this probably becomes easier than it was with Travis!

I'll get you a PR with Genshi stripped out soon.

sbesson commented 3 years ago

Thanks Roger. The considerations above about the testing of the generated sources make complete sense as a next step and are likely easier to implement now we migrated to GitHub Actions. I have captured it as an issue so that we can come back to it. I do not expect there will be any serious capacity to drive this on the OME side to work in the short-term but I added a few pointers on ways to get this implemented based on our experience with GitHub actions. We can certainly continue the discussion there.

ome / ome-model

xsd-fu: Upgrade Genshi from 0.7.3 to 0.7.5 #127