plone / Products.CMFPlone

The core of the Plone content management system
https://plone.org
GNU General Public License v2.0

Provide Migration-Story for ZODB with Plone from Python 2 to 3 #2525

Closed pbauer closed 3 years ago

pbauer commented 6 years ago

ZODB itself is compatible with Python 3, but a database created with Python 2.7 cannot be used with Python 3 unless it is migrated first.
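To illustrate why a Python 2 database cannot simply be opened, here is a minimal sketch using only the standard `pickle` module under Python 3. The hard-coded bytes are an assumption: they should correspond to what Python 2.7 produces for `pickle.dumps('\xc3\xa9', 2)`, i.e. a native `str` holding UTF-8 bytes. ZODB records are pickles too, so the same ambiguity between text and bytes applies to them.

```python
import pickle

# Protocol-2 pickle of the Python 2 str '\xc3\xa9' (UTF-8 bytes for "é").
# Hard-coded (assumption) so the sketch runs without a Python 2 interpreter.
PY2_PICKLE = b'\x80\x02U\x02\xc3\xa9q\x00.'


def load_variants(data):
    """Load one Python 2 pickle under Python 3 with different settings."""
    results = {}
    try:
        # Default: a Python 2 str is decoded as ASCII, which fails here.
        results['default'] = pickle.loads(data)
    except UnicodeDecodeError as exc:
        results['default'] = type(exc).__name__
    # Explicit encoding: treat the Python 2 str as UTF-8 text.
    results['utf-8'] = pickle.loads(data, encoding='utf-8')
    # encoding='bytes': keep the Python 2 str as bytes.
    results['bytes'] = pickle.loads(data, encoding='bytes')
    return results


if __name__ == '__main__':
    for name, value in load_variants(PY2_PICKLE).items():
        print(name, repr(value))
```

Whether a given attribute should end up as text or as bytes cannot be derived from the pickle alone, which is why the migration needs per-attribute hints.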

After evaluating different approaches (see https://blog.gocept.com/2018/06/07/migrate-a-zope-zodb-data-fs-to-python-3), the zodbupdate-based migration described at https://github.com/zopefoundation/zodbupdate#migration-to-python-3 seems to be a good approach.

If you want to contribute to the documentation or implementation of the ZODB Python 3 migration for Plone, this README provides some introduction and background information to help you get started.

We need to:

Improvements for zodbupdate that will make migrators' lives easier:

pbauer commented 6 years ago

See https://github.com/zopefoundation/Zope/pull/285 for the zodbupdate_decode_dict in Zope.
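For orientation, a decode mapping of this kind might look like the sketch below. The key format (`"<module> <class> <attribute>"`), the values (`binary` or an encoding name) and the `zodbupdate.decode` entry point group are stated here as described in the zodbupdate README and should be verified there. The package `my.addon`, its class and its attributes are hypothetical, as is the small `split_rule` helper.

```python
# Hypothetical add-on module, e.g. my/addon/__init__.py.
zodbupdate_decode_dict = {
    # Raw file payloads must stay bytes under Python 3.
    'my.addon.content Attachment data': 'binary',
    # Text stored as a Python 2 str is decoded with the given encoding.
    'my.addon.content Attachment title': 'utf-8',
}

# Assumed registration in the add-on's setup.py (check the zodbupdate README):
#   entry_points={
#       'zodbupdate.decode': [
#           'decodes = my.addon:zodbupdate_decode_dict',
#       ],
#   }


def split_rule(key):
    """Split a mapping key into (module, class name, attribute)."""
    module, class_name, attribute = key.split(' ')
    return module, class_name, attribute


if __name__ == '__main__':
    for key, target in sorted(zodbupdate_decode_dict.items()):
        print(split_rule(key), '->', target)
```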

davisagli commented 6 years ago

I played with zodbupdate a bit a week or so ago and it seems promising, but will probably take some work to get great results.

Here's the basic path:

  1. Set up a buildout that includes Plone 5.2 in Python 2, plus zodbupdate and the following mr.developer checkouts:
  2. Run bin/zodbupdate --pack --convert-py3 -f path/to/Data.fs. This will do an in-place migration of the filestorage so make sure you do it on a copy if you want to keep using it in Python 2.
  3. Copy the filestorage over to a buildout with the py3.cfg build of Plone 5.2, on Python 3.
  4. Start the site with bin/wsgi and look for what works and what doesn't. If you find objects with decode errors, figure out what attributes are the problem (i.e. which ones should not be converted from bytes to str -- a pdb in ZODB.serialize is helpful) and add a zodbupdate mapping for those attributes, rinse, and repeat.
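Because step 2 converts the filestorage in place, it can help to script the copy and the command together. The following is a minimal sketch, not part of zodbupdate: the helper names and the example paths are hypothetical, and the flags are exactly the ones quoted in step 2.

```python
import shutil
import tempfile
from pathlib import Path


def copy_filestorage(source, workdir):
    """Copy Data.fs so the in-place conversion never touches the original."""
    target = Path(workdir) / Path(source).name
    shutil.copy2(str(source), str(target))
    return target


def zodbupdate_command(data_fs, zodbupdate='bin/zodbupdate'):
    """Build the command line from step 2 for the given filestorage."""
    return [zodbupdate, '--pack', '--convert-py3', '-f', str(data_fs)]


if __name__ == '__main__':
    # Demonstrate with a throwaway file instead of a real database.
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / 'Data.fs'
        original.write_bytes(b'placeholder, not a real filestorage')
        workdir = Path(tmp) / 'py3-migration'
        workdir.mkdir()
        copy = copy_filestorage(original, workdir)
        print(' '.join(zodbupdate_command(copy)))
```

The returned list can be passed to `subprocess.check_call` from within the Python 2 buildout directory; the converted copy is then the file to move to the Python 3 buildout in step 3.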

So far I only tried this with a fresh Plone site, so no real data. It would be interesting to try with a real site to get a sense of how long the migration takes in practice.

Some remaining issues I found:

I'm sure there are more issues. Some things that come to mind to pay attention to:

thefunny42 commented 6 years ago

I added the options to zodbupdate in order to migrate my application to Python 3. That worked fine, and we have been running Python 3 in production since March. There's some stuff to know:

I doubt migrating a Plone site will be easy. It will depend a lot on the extensions that have been installed and the custom code that has been written, since both should be checked for strings/bytes problems.

icemac commented 6 years ago

Although zodb.py3migrate cannot be used to do the actual migration (see my blog post), it has an analysis step which shows the objects that might need a conversion. Maybe this is easier than trying things out and seeing what breaks. See https://zodbpy3migrate.readthedocs.io/doc.html#upgrade-workflow for the documentation of the analysis step.
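The idea behind such an analysis can be sketched with the standard library alone. This is not how zodb.py3migrate is implemented; it is a simplified illustration that scans a single pickle for Python 2 native strings with non-ASCII payloads, which are the values that need a decision (text or bytes). The sample bytes are an assumption and should correspond to Python 2.7's `pickle.dumps(['hello', '\xc3\xa9'], 2)`. Only the binary string opcodes are handled.

```python
import pickletools

# Opcodes that carry Python 2 native strings in the binary pickle protocols.
PY2_STR_OPCODES = {'SHORT_BINSTRING', 'BINSTRING'}

# Assumed Python 2.7 output of pickle.dumps(['hello', '\xc3\xa9'], 2).
SAMPLE = b'\x80\x02]q\x00(U\x05helloq\x01U\x02\xc3\xa9q\x02e.'


def find_non_ascii_strings(data):
    """Return the non-ASCII Python 2 str payloads found in one pickle."""
    hits = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in PY2_STR_OPCODES:
            # pickletools returns these payloads decoded as latin-1,
            # so encoding them again recovers the original bytes.
            raw = arg.encode('latin-1')
            if any(byte > 127 for byte in raw):
                hits.append(raw)
    return hits


if __name__ == '__main__':
    for raw in find_non_ascii_strings(SAMPLE):
        print('needs a decision:', repr(raw))
```

Pure ASCII strings such as `'hello'` are skipped because they load identically as text under Python 3.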

frisi commented 6 years ago

We did have some issues with zope.index.text, which has an optimisation that stores non-Unicode text as Python 2 strings, but uses strings in Python 3 (instead of bytes, which would have been the proper thing to do). We basically made a helper script that goes over the indexes and decodes them as "raw-unicode-escape" in the database before running zodbupdate. This would be the strategy to convert strings in BTrees, for instance, or anything else you cannot target with zodbupdate.
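What the "raw-unicode-escape" decoding does to a single value can be shown in isolation. This is a small Python 3 illustration with made-up sample words; the function name is hypothetical.

```python
def decode_py2_word(raw):
    """Decode a byte string with the codec named above."""
    return raw.decode('raw-unicode-escape')


if __name__ == '__main__':
    # Bytes above 127 are read as Latin-1 code points.
    print(decode_py2_word(b'caf\xe9'))
    # Literal \uXXXX escape sequences are expanded to the code point.
    print(decode_py2_word(b'\\u20ac'))
    # Plain ASCII passes through unchanged.
    print(decode_py2_word(b'plone'))
```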

Thanks for sharing your experience @thefunny42! could you please email me this script or post the relevant parts here/as gist so i can use it for documenting the migration of plone sites? Thanks a lot!

frisi commented 6 years ago

i prepared a buildout and documented the process of creating a sample Plone site running Python 2 and migrating it to Python 3.

you can find everything under https://github.com/frisi/coredev52multipy/tree/zodbupdate. this should help users new to the topic (e.g. pickles, string handling in Python 2 vs. Python 3) understand the problem and learn how to debug and fix problems during migration.

i also started to document the Plone-specific problems and possible solutions there. it is pretty much a summary of the writeups by @davisagli, @thefunny42 and @icemac, including some information on where to hook in to fix them. i'd like to discuss these with you in the hangout today

thefunny42 commented 6 years ago

Some additional information:

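# Python 2 only: run before zodbupdate, while the words are still byte strings.
import zope.catalog.text


def fix_text_index(index):
    """Decode byte-string words of a text index; truthy result means changes."""
    # Skip anything that is not a text index.
    if not zope.catalog.text.ITextIndex.providedBy(index):
        return
    words = index.index._docwords
    count = 0
    # Iterate over a copy because the mapping is modified inside the loop.
    for k, v in list(words.items()):
        if isinstance(v, str):  # Python 2 str, i.e. a byte string
            count += 1
            words[k] = v.decode('raw-unicode-escape')
    if count:
        print('Updated {} words.'.format(count))
    return count != 0

davisagli commented 6 years ago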

@frisi I've only skimmed your writeup so far, but it looks really great! The same results I was discovering, but much more clearly written.

frisi commented 6 years ago

i removed myself as an assignee as i won't be able to carry on with the zodb-py3 migration in the near future. hope my current findings and documentation will help other contributors to get started.

@thefunny42 thanks for your comments and fixes on the zodbupdate tickets/PR.

@davisagli could you please have a look at the updated ticket description? i tried to add an overview of the currently known migration tasks and created/linked tickets where i summarized the current state. there is also a PR with a rough draft of the database migration in https://github.com/plone/documentation/pull/1022. if you feel that important information is missing, please add it to the docs or to the list in this ticket description so we do not forget anything

thank you all for your help on this topic and happy migrating ;-)

jensens commented 5 years ago

FYI: I added a [zodbupdate] section to buildout.coredev using @davisagli's branch (added to sources and auto-checkout).

I used the script to convert an almost vanilla Plone 5.2 py2 DB (p.a.multilingual installed) to py3 and it worked. I did no in-depth testing on the DB, but the content is shown, login works, edit works.

pbauer commented 5 years ago

The updated docs are merged: https://github.com/plone/documentation/blob/5.2/manage/upgrading/version_specific_migration/upgrade_to_python3.rst

jensens commented 5 years ago

At Saltlabs Sprint @dwt and I worked on the migration story for Plone and ZMS.

I would say we now have a well-documented, working migration story. We could be better at explaining what to do in case of failures and how to fix them, but that's a nice-to-have and may evolve over time as projects are migrated.

jensens commented 5 years ago

Note: The catalog problem is probably solved. It needs one more check.

jensens commented 5 years ago

ad "write the required zodbupdate_decode_dict for all the packages in Plone that need it":

I would say all is done here. But we may need more real life migrations to verify.

pbauer commented 3 years ago

I consider this done. Additional docs are in https://community.plone.org/t/best-practice-documentation-on-zodb-debugging/12778