springload / draftjs_exporter

Convert Draft.js ContentState to HTML
https://www.draftail.org/blog/2018/03/13/rethinking-rich-text-pipelines-with-draft-js
MIT License

Draft.js exporter

Library to convert rich text from Draft.js raw ContentState to HTML.

It is developed alongside the Draftail rich text editor, for Wagtail. Check out the online demo, and our introductory blog post.

Why

Draft.js is a rich text editor framework for React. Its approach is different from most rich text editors because it does not store data as HTML, but rather in its own representation called ContentState. This exporter is useful when the ContentState to HTML conversion has to be done in a Python ecosystem.

The initial use case was to gain more control over the content managed by rich text editors in a Wagtail/Django site. If you want to read the full story, have a look at our blog post: Rethinking rich text pipelines with Draft.js.

Features

This project adheres to Semantic Versioning, and measures performance and code coverage. Code is checked with mypy.

Usage

Draft.js stores data in a JSON representation based on blocks, representing lines of content in the editor, annotated with entities and styles to represent rich text. For more information, this article covers the concepts further.
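To make the block / range / entity relationship concrete, here is a small standalone sketch that does not use the exporter at all. The sample data is hand-written in the shape Draft.js produces for raw ContentState; the describe_ranges helper is hypothetical and only illustrates how the offset and length of each range select a slice of the block's text, and how an entity range's key points into entityMap (assumed here to be keyed by strings, as in serialised JSON).

```python
# Hand-written raw ContentState with one style range and one entity range.
content_state = {
    'entityMap': {
        '0': {'type': 'LINK', 'mutability': 'MUTABLE', 'data': {'url': 'https://example.com'}},
    },
    'blocks': [{
        'key': 'a1b2c',
        'text': 'Bold text and a link',
        'type': 'unstyled',
        'depth': 0,
        'inlineStyleRanges': [{'offset': 0, 'length': 4, 'style': 'BOLD'}],
        'entityRanges': [{'offset': 16, 'length': 4, 'key': 0}],
    }],
}


def describe_ranges(state):
    """Return (kind, name, covered text) tuples for every range in every block."""
    found = []
    for block in state['blocks']:
        text = block['text']
        for r in block['inlineStyleRanges']:
            covered = text[r['offset']:r['offset'] + r['length']]
            found.append(('style', r['style'], covered))
        for r in block['entityRanges']:
            # Range keys are integers; the entity map is looked up by string key.
            entity = state['entityMap'][str(r['key'])]
            covered = text[r['offset']:r['offset'] + r['length']]
            found.append(('entity', entity['type'], covered))
    return found


ranges = describe_ranges(content_state)
print(ranges)
```

The exporter's job is to turn exactly this kind of annotation into nested HTML elements, according to the configuration described in the following sections.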

Getting started

This exporter takes the Draft.js ContentState data as input, and outputs HTML based on its configuration. To get started, install the package:

pip install draftjs_exporter

We support the following Python versions: 3.8, 3.9, 3.10, 3.11, 3.12, 3.13. For legacy Python versions, find compatible releases in the CHANGELOG.

In your code, create an exporter and use the render method to create HTML:

from draftjs_exporter.dom import DOM
from draftjs_exporter.html import HTML

# Configuration options are detailed below.
config = {}

# Initialise the exporter.
exporter = HTML(config)

# Render a Draft.js `contentState`
html = exporter.render({
    'entityMap': {},
    'blocks': [{
        'key': '6mgfh',
        'text': 'Hello, world!',
        'type': 'unstyled',
        'depth': 0,
        'inlineStyleRanges': [],
        'entityRanges': []
    }]
})

print(html)

You can also run an example by downloading this repository and then using python example.py, or by using our online Draft.js demo.

Configuration

The exporter output is extensively configurable to cater for varied rich text requirements.

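import re

# draftjs_exporter provides default configurations and predefined constants for reuse.
from draftjs_exporter.constants import BLOCK_TYPES, ENTITY_TYPES, INLINE_STYLES
from draftjs_exporter.defaults import BLOCK_MAP, STYLE_MAP
from draftjs_exporter.dom import DOM

# Components such as `blockquote`, `image`, or `hashtag`, and the `LINKIFY_RE` pattern,
# are assumed to be defined elsewhere in your code (see "Custom components" below).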

config = {
    # `block_map` is a mapping from Draft.js block types to a definition of their HTML representation.
    # Extend BLOCK_MAP to start with sane defaults, or make your own from scratch.
    'block_map': dict(BLOCK_MAP, **{
        # The most basic mapping format, block type to tag name.
        BLOCK_TYPES.HEADER_TWO: 'h2',
        # Use a dict to define props on the block.
        BLOCK_TYPES.HEADER_THREE: {'element': 'h3', 'props': {'class': 'u-text-center'}},
        # Add a wrapper (and wrapper_props) to wrap adjacent blocks.
        BLOCK_TYPES.UNORDERED_LIST_ITEM: {
            'element': 'li',
            'wrapper': 'ul',
            'wrapper_props': {'class': 'bullet-list'},
        },
        # Use a custom component for more flexibility (reading block data or depth).
        BLOCK_TYPES.BLOCKQUOTE: blockquote,
        BLOCK_TYPES.ORDERED_LIST_ITEM: {
            'element': list_item,
            'wrapper': ordered_list,
        },
        # Provide a fallback component (advanced).
        BLOCK_TYPES.FALLBACK: block_fallback
    }),
    # `style_map` defines the HTML representation of inline elements.
    # Extend STYLE_MAP to start with sane defaults, or make your own from scratch.
    'style_map': dict(STYLE_MAP, **{
        # Use the same mapping format as in the `block_map`.
        'KBD': 'kbd',
        # The `style` prop can be defined as a dict, that will automatically be converted to a string.
        'HIGHLIGHT': {'element': 'strong', 'props': {'style': {'textDecoration': 'underline'}}},
        # Provide a fallback component (advanced).
        INLINE_STYLES.FALLBACK: style_fallback,
    }),
    'entity_decorators': {
        # Map entities to components so they can be rendered with their data.
        ENTITY_TYPES.IMAGE: image,
        ENTITY_TYPES.LINK: link,
        # Lambdas work too.
        ENTITY_TYPES.HORIZONTAL_RULE: lambda props: DOM.create_element('hr'),
        # Discard those entities.
        ENTITY_TYPES.EMBED: None,
        # Provide a fallback component (advanced).
        ENTITY_TYPES.FALLBACK: entity_fallback,
    },
    'composite_decorators': [
        # Use composite decorators to replace text based on a regular expression.
        {
            'strategy': re.compile(r'\n'),
            'component': br,
        },
        {
            'strategy': re.compile(r'#\w+'),
            'component': hashtag,
        },
        {
            'strategy': LINKIFY_RE,
            'component': linkify,
        },
    ],
}

See example.py in the repository for more details.
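As a rough illustration of what a composite decorator strategy selects, the standalone sketch below runs the same hashtag pattern as the config above over a plain string, using only the standard library. The find_matches helper is hypothetical; it shows which spans of text would be candidates for replacement by a component, not how the exporter applies decorators internally.

```python
import re

# Same pattern as the hashtag strategy in the configuration example.
HASHTAG_RE = re.compile(r'#\w+')


def find_matches(strategy, text):
    """Return (start, end, matched text) for each span the strategy matches."""
    return [(m.start(), m.end(), m.group(0)) for m in strategy.finditer(text)]


matches = find_matches(HASHTAG_RE, 'Reading about #python and #draftjs today')
print(matches)
```

Testing a strategy this way before adding it to composite_decorators is a cheap way to check that the regular expression matches only the text you intend to replace.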

Advanced usage

Custom components

To generate arbitrary markup with dynamic data, the exporter comes with an API to create rendering components. This API mirrors React’s createElement API (what JSX compiles to).

# All of the API is available from a single `DOM` namespace
from draftjs_exporter.dom import DOM

# Components are simple functions that take `props` as parameter and return DOM elements.
def image(props):
    # This component creates an image element, with the relevant attributes.
    return DOM.create_element('img', {
        'src': props.get('src'),
        'width': props.get('width'),
        'height': props.get('height'),
        'alt': props.get('alt'),
    })

def blockquote(props):
    # This component uses block data to render a blockquote.
    block_data = props['block']['data']

    # Here, we want to display the block's content so we pass the `children` prop as the last parameter.
    return DOM.create_element('blockquote', {
        'cite': block_data.get('cite')
    }, props['children'])

def button(props):
    href = props.get('href', '#')
    icon_name = props.get('icon', None)
    text = props.get('text', '')

    return DOM.create_element('a', {
            'class': 'icon-text' if icon_name else None,
            'href': href,
        },
        # There can be as many `children` as required.
        # It is also possible to reuse other components and render them instead of HTML tags.
        DOM.create_element(icon, {'name': icon_name}) if icon_name else None,
        DOM.create_element('span', {'class': 'icon-text'}, text) if icon_name else text
    )

Apart from create_element, a parse_html method is also available. Use it to interface with other HTML generators, like template engines.

See example.py in the repository for more details.
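For readers unfamiliar with the createElement calling convention, here is a toy, standard-library-only sketch of the idea: an element is created from a type, a props dict, and any number of children, and the type can itself be a component function. This is not the exporter's implementation. The create_element, render, and badge functions below are hypothetical, and the choice to drop None props and None children is an assumption modelled on how the button example above uses conditional expressions.

```python
from html import escape


def create_element(type_, props=None, *children):
    """Build a (tag, attributes, children) tuple, or delegate to a component function."""
    props = props or {}
    # Drop None children so conditional expressions can be used inline.
    kids = [c for c in children if c is not None]
    if callable(type_):
        # Components receive their children through the `children` prop.
        return type_(dict(props, children=kids))
    # Drop None props so optional attributes are simply omitted.
    attrs = {k: v for k, v in props.items() if v is not None}
    return (type_, attrs, kids)


def render(node):
    """Serialise a node tree to an HTML string (no void-element handling)."""
    if isinstance(node, str):
        return escape(node)
    tag, attrs, kids = node
    attr_str = ''.join(f' {k}="{escape(str(v))}"' for k, v in attrs.items())
    inner = ''.join(render(k) for k in kids)
    return f'<{tag}{attr_str}>{inner}</{tag}>'


def badge(props):
    # A component: a plain function from props to an element.
    return create_element(
        'span',
        {'class': 'badge', 'title': props.get('title')},
        *props['children'],
    )


html = render(create_element('p', {}, 'Status: ', create_element(badge, {}, 'new')))
print(html)
```

Because components are ordinary functions of props, they compose by nesting calls and can be tested in isolation.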

Fallback components

When dealing with changes in the content schema, as part of ongoing development or migrations, some content can go stale. To solve this, the exporter allows the definition of fallback components for blocks, styles, and entities. This feature is only used for development at the moment. If you have a use case for it in production, we would love to hear from you. Please get in touch!

Add the following to the exporter config:

config = {
    'block_map': dict(BLOCK_MAP, **{
        # Provide a fallback for block types.
        BLOCK_TYPES.FALLBACK: block_fallback
    }),
}

This fallback component can now control the exporter behavior when normal components are not found. Here is an example:

import logging

from draftjs_exporter.dom import DOM

def block_fallback(props):
    type_ = props['block']['type']

    if type_ == 'example-discard':
        logging.warning(f'Missing config for "{type_}". Discarding block, keeping content.')
        # Directly return the block's children to keep its content.
        return props['children']
    elif type_ == 'example-delete':
        logging.error(f'Missing config for "{type_}". Deleting block.')
        # Return None to not render anything, removing the whole block.
        return None
    else:
        logging.warning(f'Missing config for "{type_}". Using div instead.')
        # Provide a fallback.
        return DOM.create_element('div', {}, props['children'])

See example.py in the repository for more details.
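The lookup that a fallback takes part in can be pictured with the standalone sketch below. It is a simplified model and not the exporter's code: the FALLBACK key, the resolve helper, and the tuple returned by the fallback are all hypothetical stand-ins used to show the order of precedence (an exact match for the block type first, then the fallback entry).

```python
FALLBACK = 'fallback'  # hypothetical key standing in for BLOCK_TYPES.FALLBACK


def resolve(block_map, block_type):
    """Return the configured renderer for a block type, else the fallback, else None."""
    if block_type in block_map:
        return block_map[block_type]
    return block_map.get(FALLBACK)


block_map = {
    'unstyled': 'p',
    'header-two': 'h2',
    # The fallback is a component: it receives props and decides what to render.
    FALLBACK: lambda props: ('div', props['children']),
}

known = resolve(block_map, 'header-two')
unknown = resolve(block_map, 'example-stale-type')
rendered = unknown({'children': ['stale content']})
print(known, rendered)
```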

Alternative backing engines

By default, the exporter uses a dependency-free engine called string to build the DOM tree. The alternatives are html5lib, lxml, and string_compat, a variant of string aimed at maximum output stability.

The string engine is the fastest, and does not have any dependencies. Its only drawback is that the parse_html method does not escape/sanitise HTML like that of other engines.

To switch engines, install any extra dependencies the chosen engine needs, then set the engine attribute of the exporter config:

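from draftjs_exporter.dom import DOM

config = {
    # Specify which DOM backing engine to use. Only one `engine` key should be set:
    # with duplicate keys in a dict literal, the last one silently wins.
    'engine': DOM.HTML5LIB,
    # Or for lxml:
    # 'engine': DOM.LXML,
    # Or to use the "maximum output stability" string_compat engine:
    # 'engine': DOM.STRING_COMPAT,
}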

Custom backing engines

The exporter supports using custom engines to generate its output via the DOM API. This can be useful to implement custom export formats, e.g. to Markdown (experimental).

Here is an example implementation:

from draftjs_exporter.engines.base import DOMEngine
from draftjs_exporter.html import HTML

class DOMListTree(DOMEngine):
    """
    Element tree using nested lists.
    """

    @staticmethod
    def create_tag(t, attr=None):
        return [t, attr, []]

    @staticmethod
    def append_child(elt, child):
        elt[2].append(child)

    @staticmethod
    def render(elt):
        return elt

exporter = HTML({
    # Use the dotted module syntax to point to the DOMEngine implementation.
    'engine': 'myproject.example.DOMListTree'
})
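To see what kind of structure such an engine produces, the sketch below copies the three methods into a standalone class (named ListTree here, a hypothetical name, and deliberately not subclassing DOMEngine so it runs without the library) and builds a small tree by hand. In real use the exporter would make these calls itself while rendering.

```python
class ListTree:
    """Standalone copy of the three engine methods, without the DOMEngine base class."""

    @staticmethod
    def create_tag(t, attr=None):
        # Each element is a list of [tag name, attributes, children].
        return [t, attr, []]

    @staticmethod
    def append_child(elt, child):
        elt[2].append(child)

    @staticmethod
    def render(elt):
        # "Rendering" returns the nested list itself rather than a string.
        return elt


paragraph = ListTree.create_tag('p', {'class': 'intro'})
ListTree.append_child(paragraph, 'Hello, ')
strong = ListTree.create_tag('strong')
ListTree.append_child(strong, 'world')
ListTree.append_child(paragraph, strong)

tree = ListTree.render(paragraph)
print(tree)
```

A Markdown or other custom export would follow the same pattern, with render walking this structure and emitting the target format instead of returning the list.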

Type annotations

The exporter’s codebase uses static type annotations, checked with mypy. Reusable types are made available:

from draftjs_exporter.dom import DOM
from draftjs_exporter.types import Element, Props

# Components are simple functions that take `props` as parameter and return DOM elements.
def image(props: Props) -> Element:
    # This component creates an image element, with the relevant attributes.
    return DOM.create_element('img', {
        'src': props.get('src'),
        'width': props.get('width'),
        'height': props.get('height'),
        'alt': props.get('alt'),
    })

Contributing

See anything you like in here? Anything missing? We welcome all support, whether on bug reports, feature requests, code, design, reviews, tests, documentation, and more. Please have a look at our contribution guidelines.

If you just want to set up the project on your own computer, the contribution guidelines also contain all of the setup commands.

Credits

This project is made possible by the work of Springload, a New Zealand digital agency. The beautiful demo site is the work of @thibaudcolas.

View the full list of contributors. MIT licensed.