springload / wagtaildraftail

🐦📝🍸 Draft.js editor for Wagtail, built upon Draftail and draftjs_exporter
https://pypi.python.org/pypi/wagtaildraftail
MIT License
24 stars 3 forks source link

Provide upgrade path to Wagtail 2.0 #64

Open loicteixeira opened 6 years ago

loicteixeira commented 6 years ago

Now that it is merged into Wagtail 2.0, there's little point in updating this package compatibility. Instead, provide an upgrade path since since are handled a bit differently.

loicteixeira commented 5 years ago

@tacitus @dgkohn, could you paste below the notes I had written (it's in the client's repo) in regard to upgrading from WagtailDraftail to Wagtail 2.0 so everybody can benefit from it?

dgkohn commented 5 years ago

Upgrade considerations

While Draftail was merged into Wagtail 2.0, it isn't a like for like integration.

It might get a bit awkward as some steps need to be completed together (e.g. the data needs to be migrated in order to test the new configuration changes but the configuration cannot be tested/tweaked until the data is migrated) which might require some trial and error.

It would probably be a good idea to first upgrade WagtailDraftail to use Draftail 0.17 (the same as Wagtail 2.0) before upgrading Wagtail in order to keep Draftail and Wagtail upgrade issue separate. Be aware though that the Draftail API changed a fair bit, especially around how sources (i.e. modals) and decorators (i.e. entities) are created in Draftail (around v0.11). You can have a look at WagtailModelChoosers updates to use React 16 to match Wagtail, hook with Wagtail 1.12 rich text features (although the commit here is in relation to Wagtail 2.0) and fix compatibility with Draftail 0.17 (although it's not fully working, see corresponding discussion in case a fix is found).

loicteixeira commented 5 years ago

Hey @dgkohn, thanks for that. There should be some sub sections as well (data needs to be migrated, etc). Would you mind sharing this as well?

dgkohn commented 5 years ago

Glad to help @loicteixeira ! Is this the one you're talking?

Upgrade considerations

A complete rework of the StreamField UI is underway (see wagtail/wagtail#4473) and while the API should remain backward compatibile, this will be something to watch closely. Customisations from /wagtailaddons/wagtailfacelift/ will most certainly need to be updated (it contains more than just StreamField style fixes so it might not be possible to remove it altogether) as the main purpose was to allow deep block nesting to work well but the new UI should handle this much better.

loicteixeira commented 5 years ago

Unfortunately no. It is related to draftail and should be right under the upgrade consideration that you pasted in your first comment. It would explain the different steps (e.g. "convert plugins to rich text features" or "migrate data from JSON format to new format") in details (IIRC there should be about 4).

Maybe it's on an unmerged PR? Otherwise you might want to have a look at my latest commits to dig this up before it's too late because you'll definitely need it when upgrading the project.

Thanks for the help though. I should have copied this here before leaving.

dgkohn commented 5 years ago

Nice @loicteixeira ! I'll take a look and then I'll update my last comment 👍

mojeto commented 5 years ago

Configuring rich text features

Rich text features are now made available via the register_rich_text_features hook and enabled/disabled on a per field basis with the features kwargs or via the OPTIONS key of the WAGTAILADMIN_RICH_TEXT_EDITORS config (although it now expects a list a features, not a dictionary anymore).

  1. Get to know the official documentation on Extending the Draftail editor.
  2. Register our custom features blocks, inline styles and entities via the hook.
    • Make sure to use names which don't clash with built-in features.
    • In addition to registering the plugin (with features.register_editor_plugin('draftail', the_feature_name, the_draftail_features)), make sure to register the converter rules (with features.register_converter_rule('contentstate', the_feature_name, db_conversion) where db_conversion is a dictionary defining the from_database_format and to_database_format).
    • Note that as of Wagtail 2.1, there are some limitation in registering complex entities, see registering custom entities below.
  3. For each editor set WIDGET to wagtail.admin.rich_text.DraftailRichTextArea
  4. (Maybe? It might not be necessary) Remove all default features (the one that get loaded automatically). There isn't an API for it so you might have to set features.default_features to an empty list from within the hook. Note that the loading order of the django apps will matter here (e.g. some other app might add some default features after you reset it).
  5. For each editor update the OPTIONS key to a list of features equivalent to what was defined in the old option.

Migrating the data

Data isn't saved as JSON anymore and will need to be migrated.

When using Draftail, it consistently converts between the database format and the content state (JSON) format. Therefore the migration has to recreate what the editor does upon save:

  1. Create a wagtail.admin.rich_text.converters.contentstate.ContentstateConverter instance, passing the field's defined features as first argument. Note that each field (or rather each editor type) should have its own converter.
  2. Call to_database_format on the converter, passing the field's diserialized value as first argument.

This will have to be executed within a migration. While it shouldn't be too much trouble to build a list of all RichTextFields on all the models and convert the value (either in place or with a temp field, but in any case it will probably need to be saved via direct SQL to avoid the old or new field interfering), it will be a much greater challenge to convert StreamField's

Control rendering of front-end components

Because the content isn't saved as JSON anymore but in some pseudo XML/HTML, it isn't possible to control the rendering of front-end components with decorators.

However it should be possible to convert the raw data back to JSON and then render it as we use to do. This would normally have huge performance implications but because the site is statically rendered, this shouldn't be an issue.

There's two way to go about it, either create a custom richtext filter and use it in lieu of the built-in one in the templates, or subclass RichTextField/RichTextBlock and override the conversion methods (to_python, get_prep_value, from_db_value and value_to_string) to return a custom node inheriting from the built-in RichText node (like wagtaildraftail is doing).

In any case, the same conversion process will have to happen (either in the filter function or in the __html__ method of the custom node) which will involve converting the data back to JSON and then rendering it the way we use to.

Note that such request has come up and a built-in solution might exist by then.

1. Converting the data back to JSON

Important: It assumes that the various fields and blocks have been converted to use the new features functionality. See configuring rich text features.

If the field widget is available and initialised at this point in time, it might simply be a matter of calling translate_value on it with the field value as first argument. Otherwise it can be done manually:

  1. Create a wagtail.admin.rich_text.converters.contentstate.ContentstateConverter instance, passing the field's defined features as first argument.
  2. Call from_database_format on the converter, passing the field's value as first argument.

2. Render the JSON to HTML with our custom config

Recreate what WagtailDraftail's DraftText node does:

  1. Create a draftjs_exporter.html.HTML instance, passing our own config as first argument. For reference, this is how WagtailDraftail loads the config but it will need to be extracted to this project and potentially updated to avoid it clashing with Wagtail's configuration.
  2. Call render on the exporter, passing the field's deserialized value.
mojeto commented 5 years ago

sorry for delay @loicteixeira

loicteixeira commented 5 years ago

No worries. Thank you for sharing it. Hopefuly that might help someone out there :)

dgkohn commented 5 years ago

Thanks @mojeto and sorry @loicteixeira I've missed this one.

mojeto commented 5 years ago

@loicteixeira is there any tool or example how to migrate wagtaildraftail json db representation of rich text content into wagtail 2 html db representation?

thibaudcolas commented 5 years ago

What you want is most likely Wagtail's own ContentstateConverter#to_database_format, which does just that.

Within the normal lifecycle of rich text in Wagtail, it would be configured based on the enabled features of the rich text editor for which content is saved. For a migration, you will need to decide which features to enable (and how to configure them) based on the content to migrate. Wagtail's built-in features are probably a good start.

loicteixeira commented 5 years ago

Exactly.

WagtailDraftail used to save the ContentState JSON to the database, would use it directly for the editor and only convert it to HTML when needed for the front-end.

With Wagtail 2.0 on the other hand, the ContentState only exists for the editor. It will be converted to the "DB Format" upon save then used with limited modifications for the front-end (and converted back to ContentState whenever you use the editor again).

With that in mind, by using ContentstateConverter#to_database_format once on the existing data, it should then be compatible with the Wagtail 2.0 way.

There are 2 things to consider though:

  1. The way you configure a field features changed drastically as well. It used to be all in the Python config file with a structure closely miroring the JS/Draftail config but now it's all as RichText features (enabled per field and configured with hooks). Unfortunately there will be a lot of trial and error to get this right :(
  2. Converting all the RichTextFields will be easy because you can easilly find all the RichText fields on all the models. However converting all the RichTextBlocks will be much harder as you'll need to migrate the StreamField data which has always been tricky.
mojeto commented 5 years ago

thanks @thibaudcolas and @loicteixeira I'm in process of this migration for one of our project. Migrate RichTextField seems to be easy so far. RichTextBlocks in StreamField are hard to find. I would expect Wagtail has a "StreamField block iterator" helper function, but it seems I'll have to create it as well.

loicteixeira commented 5 years ago

Streamfield migrations have always been a pain point 😞

There is an example in the documentation, there are some more in Springload internal Wiki (under Wagtail Tips) although I think I never got to fully write the recursive one (but the idea on how to pack/unpack StreamField safely is there).

You might be able to find some hints in the PR for listing objects uses as it goes through StreamFields, although I suspect it does it in the database.

Lastly there might be a example in the migrations of the project I think you are migrating as I had to replace a block once. You might need to play with the history as the migrations would have been squashed since then.

mojeto commented 5 years ago

Thanks @loicteixeira , https://github.com/BertrandBordage/wagtail/blob/93e5e32762f9eae08473fae666e692a87a6e01d6/wagtail/core/collectors.py helped. I'm working on tools to help with this migration process https://gist.github.com/mojeto/eaad9f10236044bd9df9fb3bef5e1fda After it's all finished and tested I may create a python package for it to help others with migration. The reason I'm thinking about separate package is so I don't have to make wagtaildraftail code Wagtail 2 compatible.

mojeto commented 5 years ago

I struggle a bit with converting extra draftail features into new configuration. For example previous configuration

TERMS_BLOCK_ID = 'TERMS_AND_CONDITIONS_TEXT'
DRAFT_BLOCK_TYPE_TERMS = {'label': 'T&Cs', 'type': TERMS_BLOCK_ID, 'element': 'div', 'className': 'legals'}
DRAFT_EXPORTER_BLOCK_MAP = dict(BLOCK_MAP, **{
    TERMS_BLOCK_ID: {
        'element': 'p',
        'props': {'className': 'legals'},
    },
})

It creates a simple <p class="legals">Some legal text here</p> html output. With help of https://docs.wagtail.io/en/v2.3/advanced_topics/customisation/extending_draftail.html I converted the configuration into

@hooks.register('register_rich_text_features')
def register_terms_feature(features):
    """
    TERMS_AND_CONDITIONS_TEXT
    Registering the `terms` feature, which uses the `div` Draft.js block type,
    and is stored as HTML with a `<p>` tag.
    """
    feature_name = 'terms'
    type_ = 'TERMS_AND_CONDITIONS_TEXT'
    tag = 'p'

    control = {
        'type': type_,
        'label': 'T&Cs',
        'element': 'div',
        'className': 'legals',
    }

    features.register_editor_plugin(
        'draftail', feature_name, draftail_features.BlockFeature(control)
    )

    features.register_converter_rule('contentstate', feature_name, {
        'from_database_format': {tag: BlockElementHandler(type_)},
        'to_database_format': {'block_map': {type_: tag}},
    })

    # Add the feature to the default features list to make it available
    # on rich text fields that do not specify an explicit 'features' list
    features.default_features.append(feature_name)

it produces <p>Some legal text here</p> db output. But I don't know how to add class="legals" html attribute into.

Do I have to create a completely new entity just for a paragraph tag with class attribute? What Am I missing here?

loicteixeira commented 5 years ago

The className attribute that you set within control will only affect the editor since it's only used when you it the register_editor_plugin. If you want it to appear in the DB HTML, you'll need to pass it to the to_database_format as props for the block. Here is an example from DraftJSExporter's documentation BLOCK_TYPES.HEADER_THREE: {'element': 'h3', 'props': {'class': 'u-text-center'}},.

This would be the easiest solution, but before I though of it, I had written about differents solutions which I'll leave further down just in case the above doesn't work or if you need even more control.

It is no longer possible to control the output on the front-end side with Wagtail 2.0 the same way we did with WagtailDraftail (see wagtail/wagtail#4223), mostly because the data isn't saved as JSON anymore but is already in HTML form. With Wagtail 2.0, the front-end HTML isn't much different from the DB HTML. It goes through a few hoops (e.g. to replace links IDs with their actual URLs) but that's pretty much it and it's mostly Regexp based instead of using DraftJSExporter.

Here are a few ideas on how to achieve full control over the front-end presentation:

Still use DraftJSExporter: Create a custom rich text tag/node which will take the DB HTML and turn it back to JSON ContentState (basically running the from_database_format from Wagtail), then turn it back into front-end HTML with your own config of DraftJSExporter (probably the existing one). There is a more detailed description in the Control rendering of front-end components section above. You would have full control but this obviously has some performance drawbacks. It might be acceptable for projects baked to static HTML.

Work with Wagtail's Rewriters: Create a custom rich text tag/node which will use a overriden expand_db_html method to inject your own rewriters (see LinkRewriter example). The main drawback is that it works with regexp so you need to save your T&Cs block to the DB Format with a tag recognisable enough so the Rewriter can pick it up (and of course write the from_database_format accordingly so it still works in the editor).

mojeto commented 5 years ago

Thanks for all the options and explanation @loicteixeira I play with it for some time now to find the best solution to our draftail components I'm trying to migrate. I found expand_db_html method, but It isn't defined in a way it can be extended easily. I can always monkey patch it, but that doesn't seem right. For now I'm going with

features.register_converter_rule('contentstate', feature_name, {
        'from_database_format': {'p[class="legals"]': BlockElementHandler(type_)},
        'to_database_format': {'block_map': {type_: {'element': 'p', 'props': {'class': 'legals'}}}},
    })

but it requires https://github.com/wagtail/wagtail/pull/4527 to be resolved.

mojeto commented 5 years ago

This migration is becoming really painful. I'll need to find a way to remap wagtaildraftail IMAGE feature data to Wagtail embed[embedtype="image"] feature that uses different keys for altText and alignment values before it can be converted from json to db html presentation.

I understand the draftail content state json is converted to db html string for backward compatibility reason. As you @loicteixeira mentioned earlier with wagtaildraftail rich text we have possibilities to control wagtail admin part of feature and final front-end user html part of feature. The inner content state in json contains all the state data feature needs and it isn't in developer's main interest. Now we need to convert all our features data to db html string and back to json - and that is a lot of work. The db html can't be fron-end user html as it's main responsibility is to keep all the data feature needs (including the data for feature internal use and not to be displayed to end user). The feature db html code has to be unique enough so it can be easily parsed back to proper draftail feature also.

All that means we have to create json -> db html converter and back to replicate work that was done for us by content state json. In addition we have to create converters for the db html -> front end user html for advance features and to remove internal feature data. All of that has to be done to keep the same functionality we already have. It doesn't help that expand_db_html action in Wagtail isn't implemented in a way to be easily expandable.

We at Springload seriously consider repurposing this package to provide us DraftailOnlyRichTextField and DraftailOnlyRichTextBlock that is going to utilize standard Draftail features from Wagtail 2, but keep the rich text data in database represented by draftails content state json object. So feature main control it's final end user html without overhead of converting into db html string and back. The same way how it works by using this package prior Wagtail 2.