apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

CSS styling doesn't work in Handlebars chart #30381

Open goldjee opened 10 hours ago

goldjee commented 10 hours ago

Bug description

CSS code defined in the "CSS styles" section of the Handlebars chart settings isn't applied to the HTML template. It is shown as plain text on the chart instead.

How to reproduce the bug

  1. Disable ESCAPE_MARKDOWN_HTML, enable HTML_SANITIZATION, allow CSS with HTML_SANITIZATION_SCHEMA_EXTENSIONS in superset_config.py as follows:

    FEATURE_FLAGS = {
        "ESCAPE_MARKDOWN_HTML": False,
        "HTML_SANITIZATION": True,
    }

    HTML_SANITIZATION_SCHEMA_EXTENSIONS = {
        "attributes": {
            "*": ["style", "className"],
        },
        "tagNames": ["style"],
    }

  2. Create a Handlebars chart with one column ("descr" in this example) added from any dataset.
  3. Set up HTML template as follows:

```HTML
<ul class="data-list">
  {{#each data}}
    <li>{{descr}}</li>
  {{/each}}
</ul>
```

  4. Set up CSS for the chart:

    .data-list {
        background-color: yellow;
    }

  5. Observe your CSS code in the preview as a plain text.

Screenshots/recordings

image

Superset version

master / latest-dev

Python version

3.11

Node version

18 or greater

Browser

Chrome

Additional context

While investigating this issue, I dived into Handlebars code. There is a section in superset-frontend/plugins/plugin-chart-handlebars/src/components/Handlebars/HandlebarsViewer.tsx at lines 35-43 that seemingly retrieves HTML sanitization and schema overrides. I went to the browser console and printed these variables with this snippet:

const appContainer = document.getElementById('app');
const { common } = JSON.parse(
    appContainer?.getAttribute('data-bootstrap') || '{}',
);
const htmlSanitization = common?.conf?.HTML_SANITIZATION;
const htmlSchemaOverrides =
    common?.conf?.HTML_SANITIZATION_SCHEMA_EXTENSIONS;

console.log(htmlSanitization);
console.log(htmlSchemaOverrides);

The output is true and an empty object, respectively, even though the overrides are configured (see the "How to reproduce the bug" section). It appears that either I'm doing something wrong, or the overrides are not read from superset_config.py.
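
The same check can be run outside the browser against a saved copy of the page. The following is a minimal sketch, assuming the page carries the bootstrap JSON in a `data-bootstrap` attribute on the element with id `app`, as the console snippet above implies. The class and function names are hypothetical and are not part of Superset.

```python
import json
from html.parser import HTMLParser


class BootstrapExtractor(HTMLParser):
    """Collects the data-bootstrap attribute of the element with id="app"."""

    def __init__(self) -> None:
        super().__init__()
        self.bootstrap: dict = {}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if attr_map.get("id") == "app" and attr_map.get("data-bootstrap"):
            # HTMLParser has already unescaped entities such as &quot;.
            self.bootstrap = json.loads(attr_map["data-bootstrap"])


def sanitization_conf(page_html: str) -> tuple:
    """Return (HTML_SANITIZATION, HTML_SANITIZATION_SCHEMA_EXTENSIONS)."""
    parser = BootstrapExtractor()
    parser.feed(page_html)
    conf = parser.bootstrap.get("common", {}).get("conf", {})
    return (
        conf.get("HTML_SANITIZATION"),
        conf.get("HTML_SANITIZATION_SCHEMA_EXTENSIONS"),
    )
```

A result of `(True, {})` would match what the browser console printed here: sanitization is on, but no schema extensions reached the page.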

dosubot[bot] commented 10 hours ago

I found a similar unsolved discussion that might be relevant to your issue: Use HANDLEBARS Superset Chart, which was last updated on 2024-08-12 [1].

The CSS code defined in the "CSS styles" section of Handlebars chart settings is not being applied to the HTML template and is instead shown as plain text on the chart because HTML sanitization needs to be configured to use CSS. This is indicated by the tooltip message in the StyleControl component:

<InfoTooltipWithTrigger
  iconsStyle={{ marginLeft: theme.gridUnit }}
  tooltip={t('You need to configure HTML sanitization to use CSS')}
/>

This suggests that without proper HTML sanitization configuration, the CSS will not be applied and will be rendered as plain text [2].

To ensure CSS is applied correctly, you may need to adjust the HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration. For example, to allow the style attribute for div elements, you can configure it as follows:

HTML_SANITIZATION_SCHEMA_EXTENSIONS = {
    "attributes": {
        "div": ["style"],
    },
}
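
To make "extending" the schema concrete, here is a rough sketch of one way an extension dictionary could be merged into a default schema, with lists concatenated rather than replaced. This thread does not show how Superset's frontend actually performs the merge, so the merge rule, the `extend_schema` helper, and the trimmed `DEFAULT_SCHEMA` are all assumptions made for illustration.

```python
from copy import deepcopy
from typing import Any


def extend_schema(default: dict[str, Any], extensions: dict[str, Any]) -> dict[str, Any]:
    """Merge extensions into a copy of default, concatenating lists."""
    merged = deepcopy(default)
    for key, value in extensions.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            # Recurse into nested mappings such as "attributes".
            merged[key] = extend_schema(current, value)
        elif isinstance(current, list) and isinstance(value, list):
            # Append only entries that are not already allowed.
            merged[key] = current + [v for v in value if v not in current]
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical, heavily trimmed stand-in for a default schema.
DEFAULT_SCHEMA = {"tagNames": ["div", "ul", "li"], "attributes": {"div": ["title"]}}
EXTENSIONS = {"attributes": {"div": ["style"]}}

print(extend_schema(DEFAULT_SCHEMA, EXTENSIONS))
```

Under this assumed rule, an extension can only widen what is allowed; it never removes an entry from the default schema.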

Additionally, the superset-frontend package uses the xss library to handle HTML sanitization. The default whitelist in the xss filter includes attributes like style, class, and title for elements such as span, div, a, and img. However, CSS is disabled by default in the xss filter configuration:

const xssFilter = new FilterXSS({
  whiteList: {
    ...getDefaultWhiteList(),
    span: ['style', 'class', 'title'],
    div: ['style', 'class'],
    a: ['style', 'class', 'href', 'title', 'target'],
    img: ['style', 'class', 'src', 'alt', 'title', 'width', 'height'],
    video: [
      'autoplay',
      'controls',
      'loop',
      'preload',
      'src',
      'height',
      'width',
      'muted',
    ],
  },
  stripIgnoreTag: true,
  css: false,
});

To ensure CSS is applied correctly, you may need to adjust the css property in the xss filter configuration or extend the sanitization schema as mentioned above [3][4].

To continue talking to Dosu, mention @dosu.

goldjee commented 10 hours ago

@dosu Yep, I've seen that similar issue and tried the steps described there. HTML is parsed just fine in my case, it's just CSS malfunctioning.

I accidentally posted the issue too early and afterwards added some information to the "Additional context" section. It may be useful in this case. Although HTML_SANITIZATION_SCHEMA_EXTENSIONS is configured, it seems that this parameter is not taken into account.

If I disable HTML sanitization completely, the CSS is not applied either.

dosubot[bot] commented 10 hours ago

The issue seems to be that the HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration is not being correctly applied. Here are a few steps to troubleshoot and potentially resolve the issue:

  1. Verify Configuration Loading: Ensure that the superset_config.py file is correctly loaded and the configurations are being applied. You can add a print statement or log the configuration to verify this.

  2. Check Frontend Configuration: The frontend components need to receive the HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration. Ensure that the configuration is passed correctly from the backend to the frontend. This can be done by checking the bootstrap_data payload in the render_app_template method in BaseSupersetView:

    class BaseSupersetView(BaseView):
        def render_app_template(
            self, extra_bootstrap_data: dict[str, Any] | None = None
        ) -> FlaskResponse:
            payload = {
                "user": bootstrap_user_data(g.user, include_perms=True),
                "common": common_bootstrap_payload(),
                **(extra_bootstrap_data or {}),
            }
            return self.render_template(
                "superset/spa.html",
                entry="spa",
                bootstrap_data=json.dumps(
                    payload, default=json.pessimistic_json_iso_dttm_ser
                ),
            )

  3. Extend Sanitization Schema: Ensure that the HTML_SANITIZATION_SCHEMA_EXTENSIONS is correctly extending the default schema. The configuration should look something like this:

    HTML_SANITIZATION_SCHEMA_EXTENSIONS = {
        "attributes": {
            "*": ["style", "class"],
        },
        "tagNames": ["style"],
    }

  4. Check xss Library Configuration: The xss library configuration should allow the style attribute. Ensure that the css property is set to true if you want to allow inline CSS:

    const xssFilter = new FilterXSS({
      whiteList: {
        ...getDefaultWhiteList(),
        span: ['style', 'class', 'title'],
        div: ['style', 'class'],
        a: ['style', 'class', 'href', 'title', 'target'],
        img: ['style', 'class', 'src', 'alt', 'title', 'width', 'height'],
        video: [
          'autoplay',
          'controls',
          'loop',
          'preload',
          'src',
          'height',
          'width',
          'muted',
        ],
      },
      stripIgnoreTag: true,
      css: true,
    });

  5. Debugging: Use the browser console to check if the HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration is being correctly passed to the frontend. You can use the snippet you provided to log the configuration:

    const appContainer = document.getElementById('app');
    const { common } = JSON.parse(
      appContainer?.getAttribute('data-bootstrap') || '{}',
    );
    const htmlSanitization = common?.conf?.HTML_SANITIZATION;
    const htmlSchemaOverrides =
      common?.conf?.HTML_SANITIZATION_SCHEMA_EXTENSIONS;

    console.log(htmlSanitization);
    console.log(htmlSchemaOverrides);

By following these steps, you should be able to identify where the configuration is not being applied correctly and ensure that the CSS is properly rendered in the Handlebars chart.
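
For the first step, a quick way to see which values a given config file actually defines is to execute it in isolation and print the relevant names. This is only a sketch: `read_config_values` is a hypothetical helper, it runs the file directly with the standard library's `runpy` rather than through Superset's own loader, and it will execute any imports or side effects the file contains.

```python
import runpy

KEYS_TO_CHECK = ("HTML_SANITIZATION", "HTML_SANITIZATION_SCHEMA_EXTENSIONS")


def read_config_values(config_path: str, keys=KEYS_TO_CHECK) -> dict:
    """Execute a Python config file and return the selected top-level names."""
    namespace = runpy.run_path(config_path)
    # Names the file does not define are reported explicitly instead of raising.
    return {key: namespace.get(key, "<not set>") for key in keys}
```

Calling it with the path of the file you believe Superset is loading shows whether the settings are defined at the top level of that file. It cannot tell you whether Superset is actually picking up that file.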

goldjee commented 9 hours ago

@dosu Can you provide an example of full superset_config.py with sanitization configured?

dosubot[bot] commented 9 hours ago

Certainly! Here is an example of a full superset_config.py with HTML sanitization configured:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
"""The main config file for Superset

All configuration in this file can be overridden by providing a superset_config
in your PYTHONPATH as there is a ``from superset_config import *``
at the end of this file.
"""

# mypy: ignore-errors
# pylint: disable=too-many-lines
from __future__ import annotations

import imp  # pylint: disable=deprecated-module
import importlib.util
import json
import logging
import os
import re
import sys
from collections import OrderedDict
from contextlib import contextmanager
from datetime import timedelta
from email.mime.multipart import MIMEMultipart
from importlib.resources import files
from typing import Any, Callable, Iterator, Literal, TYPE_CHECKING, TypedDict

import click
import pkg_resources
from celery.schedules import crontab
from flask import Blueprint
from flask_appbuilder.security.manager import AUTH_DB
from flask_caching.backends.base import BaseCache
from pandas import Series
from pandas._libs.parsers import STR_NA_VALUES
from sqlalchemy.engine.url import URL
from sqlalchemy.orm.query import Query

from superset.advanced_data_type.plugins.internet_address import internet_address
from superset.advanced_data_type.plugins.internet_port import internet_port
from superset.advanced_data_type.types import AdvancedDataType
from superset.constants import CHANGE_ME_SECRET_KEY
from superset.jinja_context import BaseTemplateProcessor
from superset.key_value.types import JsonKeyValueCodec
from superset.stats_logger import DummyStatsLogger
from superset.superset_typing import CacheConfig
from superset.tasks.types import ExecutorType
from superset.utils import core as utils
from superset.utils.core import is_test, NO_TIME_RANGE, parse_boolean_string
from superset.utils.encrypt import SQLAlchemyUtilsAdapter
from superset.utils.log import DBEventLogger
from superset.utils.logging_configurator import DefaultLoggingConfigurator

logger = logging.getLogger(__name__)

if TYPE_CHECKING:
    from flask_appbuilder.security.sqla import models

    from superset.connectors.sqla.models import SqlaTable
    from superset.models.core import Database
    from superset.models.dashboard import Dashboard
    from superset.models.slice import Slice

# Realtime stats logger, a StatsD implementation exists
STATS_LOGGER = DummyStatsLogger()

# By default will log events to the metadata database with `DBEventLogger`
# Note that you can use `StdOutEventLogger` for debugging
# Note that you can write your own event logger by extending `AbstractEventLogger`
# https://github.com/apache/superset/blob/master/superset/utils/log.py
EVENT_LOGGER = DBEventLogger()

SUPERSET_LOG_VIEW = True

BASE_DIR = pkg_resources.resource_filename("superset", "")
if "SUPERSET_HOME" in os.environ:
    DATA_DIR = os.environ["SUPERSET_HOME"]
else:
    DATA_DIR = os.path.expanduser("~/.superset")

# ---------------------------------------------------------
# Superset specific config
# ---------------------------------------------------------
VERSION_INFO_FILE = str(files("superset") / "static/version_info.json")
PACKAGE_JSON_FILE = str(files("superset") / "static/assets/package.json")
# ---------------------------------------------------
# Image and file configuration
# ---------------------------------------------------
# The file upload folder, when using models with files
UPLOAD_FOLDER = BASE_DIR + "/app/static/uploads/"
UPLOAD_CHUNK_SIZE = 4096

# The image upload folder, when using models with images
IMG_UPLOAD_FOLDER = BASE_DIR + "/app/static/uploads/"

# The image upload url, when using models with images
IMG_UPLOAD_URL = "/static/uploads/"
# Setup image size default is (300, 200, True)
# IMG_SIZE = (300, 200, True)

# Default cache timeout, applies to all cache backends unless specifically overridden in
# each cache config.
CACHE_DEFAULT_TIMEOUT = int(timedelta(days=1).total_seconds())

# Default cache for Superset objects
CACHE_CONFIG: CacheConfig = {"CACHE_TYPE": "NullCache"}

# Cache for datasource metadata and query results
DATA_CACHE_CONFIG: CacheConfig = {"CACHE_TYPE": "NullCache"}

# Cache for dashboard filter state. `CACHE_TYPE` defaults to `SupersetMetastoreCache`
# that stores the values in the key-value table in the Superset metastore, as it's
# required for Superset to operate correctly, but can be replaced by any
# `Flask-Caching` backend.
FILTER_STATE_CACHE_CONFIG: CacheConfig = {
    "CACHE_TYPE": "SupersetMetastoreCache",
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=90).total_seconds()),
    # Should the timeout be reset when retrieving a cached value?
    "REFRESH_TIMEOUT_ON_RETRIEVAL": True,
    # The following parameter only applies to `MetastoreCache`:
    # How should entries be serialized/deserialized?
    "CODEC": JsonKeyValueCodec(),
}

# Cache for explore form data state. `CACHE_TYPE` defaults to `SupersetMetastoreCache`
# that stores the values in the key-value table in the Superset metastore, as it's
# required for Superset to operate correctly, but can be replaced by any
# `Flask-Caching` backend.
EXPLORE_FORM_DATA_CACHE_CONFIG: CacheConfig = {
    "CACHE_TYPE": "SupersetMetastoreCache",
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=7).total_seconds()),
    # Should the timeout be reset when retrieving a cached value?
    "REFRESH_TIMEOUT_ON_RETRIEVAL": True,
    # The following parameter only applies to `MetastoreCache`:
    # How should entries be serialized/deserialized?
    "CODEC": JsonKeyValueCodec(),
}

# store cache keys by datasource UID (via CacheKey) for custom processing/invalidation
STORE_CACHE_KEYS_IN_METADATA_DB = False

# CORS Options
ENABLE_CORS = False
CORS_OPTIONS: dict[Any, Any] = {}

# Sanitizes the HTML content used in markdowns to allow its rendering in a safe manner.
# Disabling this option is not recommended for security reasons. If you wish to allow
# valid safe elements that are not included in the default sanitization schema, use the
# HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration.
HTML_SANITIZATION = True

# Use this configuration to extend the HTML sanitization schema.
# By default we use the GitHub schema defined in
# https://github.com/syntax-tree/hast-util-sanitize/blob/main/lib/schema.js
# For example, the following configuration would allow the rendering of the
# style attribute for div elements and the ftp protocol in hrefs:
# HTML_SANITIZATION_SCHEMA_EXTENSIONS = {
#   "attributes": {
#     "div": ["style"],
#   },
#   "protocols": {
#     "href": ["ftp"],
#   }
# }
# Be careful when extending the default schema to avoid XSS attacks.
HTML_SANITIZATION_SCHEMA_EXTENSIONS: dict[str, Any] = {
    "attributes": {
        "*": ["style", "className"],
    },
    "tagNames": ["style"],
}

# Chrome allows up to 6 open connections per domain at a time. When there are more
# than 6 slices in dashboard, a lot of time fetch requests are queued up and wait for
# next available socket. PR #5039 is trying to allow domain sharding for Superset,
# and this feature will be enabled by configuration only (by default Superset
# doesn't allow cross-domain request).
SUPERSET_WEBSERVER_DOMAINS = None


def _try_json_readversion(filepath: str) -> str | None:
    try:
        with open(filepath) as f:
            return json.load(f).get("version")
    except Exception:  # pylint: disable=broad-except
        return None


def _try_json_readsha(filepath: str, length: int) -> str | None:
    try:
        with open(filepath) as f:
            return json.load(f).get("GIT_SHA")[:length]
    except Exception:  # pylint: disable=broad-except
        return None

#
# If True, we will skip the call to load the logger config found in alembic.init
#
ALEMBIC_SKIP_LOG_CONFIG = False

# Depending on the context in which this config is loaded, the
# version_info.json file may or may not be available, as it is
# generated on install via setup.py. In the event that we're
# actually running Superset, we will have already installed,
# therefore it WILL exist. When unit tests are running, however,
# it WILL NOT exist, so we fall back to reading package.json
VERSION_STRING = _try_json_readversion(VERSION_INFO_FILE) or _try_json_readversion(
    PACKAGE_JSON_FILE
)

VERSION_SHA_LENGTH = 8
VERSION_SHA = _try_json_readsha(VERSION_INFO_FILE, VERSION_SHA_LENGTH)

# Build number is shown in the About section if available. This
# can be replaced at build time to expose build information.
BUILD_NUMBER = None

# default viz used in chart explorer & SQL Lab explore
DEFAULT_VIZ_TYPE = "table"

# default row limit when requesting chart data
ROW_LIMIT = 50000
# default row limit when requesting samples from datasource in explore view
SAMPLES_ROW_LIMIT = 1000
# default row limit for native filters
NATIVE_FILTER_DEFAULT_ROW_LIMIT = 1000
# max rows retrieved by filter select auto complete
FILTER_SELECT_ROW_LIMIT = 10000
# default time filter in explore
# values may be "Last day", "Last week", "<ISO date> : now", etc.
DEFAULT_TIME_FILTER = NO_TIME_RANGE

# This is an important setting, and should be lower than your
# [load balancer / proxy / envoy / kong / ...] timeout settings.
# You should also make sure to configure your WSGI server
# (gunicorn, nginx, apache, ...) timeout setting to be <= to this setting
SUPERSET_WEBSERVER_TIMEOUT = int(timedelta(minutes=1).total_seconds())

This configuration includes the HTML_SANITIZATION and HTML_SANITIZATION_SCHEMA_EXTENSIONS settings to allow CSS styles and class names in the HTML content [1].

goldjee commented 9 hours ago

@dosu Oh I see, HTML_SANITIZATION is not a feature flag, but a top-level variable. I've fixed that in my config and double-checked HTML_SANITIZATION_SCHEMA_EXTENSIONS, but the issue is still there.
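
For anyone hitting the same mistake, a small check like the one below can flag settings that were placed inside FEATURE_FLAGS by accident. It is a sketch only: `TOP_LEVEL_ONLY` lists just the two settings discussed in this thread, and `misplaced_feature_flags` is a hypothetical helper, not something Superset provides.

```python
# Names assumed to be top-level settings rather than feature flags.
# Only the two from this thread are listed; the set is illustrative.
TOP_LEVEL_ONLY = {"HTML_SANITIZATION", "HTML_SANITIZATION_SCHEMA_EXTENSIONS"}


def misplaced_feature_flags(feature_flags: dict) -> list[str]:
    """Return names in FEATURE_FLAGS that belong at the top level of the config."""
    return sorted(name for name in feature_flags if name in TOP_LEVEL_ONLY)


# The FEATURE_FLAGS block from the original reproduction steps.
FEATURE_FLAGS = {"ESCAPE_MARKDOWN_HTML": False, "HTML_SANITIZATION": True}
print(misplaced_feature_flags(FEATURE_FLAGS))  # ['HTML_SANITIZATION']
```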

Where in the project codebase can I find the bridge that passes HTML_SANITIZATION_SCHEMA_EXTENSIONS from the Python backend to the frontend? I'd like to try to check if the parameter is actually passed.

dosubot[bot] commented 9 hours ago

The HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration is defined in the superset/config.py file and is used to extend the HTML sanitization schema for rendering HTML content safely [1].

To pass this configuration from the Python backend to the frontend, it needs to be included in the bootstrap_data payload. Here are the key steps and locations in the codebase where this happens:

  1. Backend Configuration: Ensure that HTML_SANITIZATION_SCHEMA_EXTENSIONS is correctly defined in superset/config.py:

    HTML_SANITIZATION_SCHEMA_EXTENSIONS: dict[str, Any] = {
        "attributes": {
            "*": ["style", "className"],
        },
        "tagNames": ["style"],
    }

  2. Common Bootstrap Payload: The cached_common_bootstrap_data function prepares the common data sent to the client. This function should include the HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration:

    def cached_common_bootstrap_data(
        user_id: int | None, locale: Locale | None
    ) -> dict[str, Any]:
        frontend_config = {
            k: (list(conf.get(k)) if isinstance(conf.get(k), set) else conf.get(k))
            for k in FRONTEND_CONF_KEYS
        }
        # Add HTML_SANITIZATION_SCHEMA_EXTENSIONS to the frontend config
        frontend_config["HTML_SANITIZATION_SCHEMA_EXTENSIONS"] = conf.get(
            "HTML_SANITIZATION_SCHEMA_EXTENSIONS", {}
        )

        bootstrap_data = {
            "conf": frontend_config,
            # other configurations...
        }
        return bootstrap_data

  3. Rendering the Template: The render_app_template method in the BaseSupersetView class constructs the bootstrap_data payload and passes it to the frontend:

    class BaseSupersetView(BaseView):
        def render_app_template(
            self, extra_bootstrap_data: dict[str, Any] | None = None
        ) -> FlaskResponse:
            payload = {
                "user": bootstrap_user_data(g.user, include_perms=True),
                "common": common_bootstrap_payload(),
                **(extra_bootstrap_data or {}),
            }
            return self.render_template(
                "superset/spa.html",
                entry="spa",
                bootstrap_data=json.dumps(
                    payload, default=json.pessimistic_json_iso_dttm_ser
                ),
            )

  4. Frontend Usage: The HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration is utilized in the frontend components, such as the Markdown component in superset-frontend/src/dashboard/components/gridComponents/Markdown.jsx:

    function mapStateToProps(state) {
      return {
        htmlSchemaOverrides: state.common.conf.HTML_SANITIZATION_SCHEMA_EXTENSIONS,
      };
    }

    And in the HandlebarsViewer component in superset-frontend/plugins/plugin-chart-handlebars/src/components/Handlebars/HandlebarsViewer.tsx:

    const appContainer = document.getElementById('app');
    const { common } = JSON.parse(
      appContainer?.getAttribute('data-bootstrap') || '{}',
    );
    const htmlSchemaOverrides = common?.conf?.HTML_SANITIZATION_SCHEMA_EXTENSIONS || {};

By ensuring that the HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration is included in the cached_common_bootstrap_data function and correctly passed through the bootstrap_data payload, you can verify if the parameter is actually passed to the frontend [2][3][4].
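
The filtering step in the middle of this chain can be illustrated in isolation. The sketch below mirrors the dictionary comprehension quoted above: only allow-listed keys are copied into the frontend payload, and sets are turned into lists so they can be serialized as JSON. The two-entry `FRONTEND_CONF_KEYS` here is a hypothetical stand-in for Superset's longer list, and `build_frontend_config` is an illustrative name.

```python
from typing import Any

# Hypothetical allow-list; the real FRONTEND_CONF_KEYS in Superset is longer.
FRONTEND_CONF_KEYS = ("HTML_SANITIZATION", "HTML_SANITIZATION_SCHEMA_EXTENSIONS")


def build_frontend_config(conf: dict[str, Any]) -> dict[str, Any]:
    """Copy only allow-listed keys, converting sets to lists for JSON output."""
    return {
        key: (list(conf[key]) if isinstance(conf.get(key), set) else conf.get(key))
        for key in FRONTEND_CONF_KEYS
    }
```

If a key is absent from the allow-list, it never reaches the browser regardless of what the config file says, so the allow-list is a natural place to look when a setting appears to be ignored.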

goldjee commented 9 hours ago

@dosu Which file in the Superset codebase defines Common Bootstrap Payload? Could you provide the path to it?

dosubot[bot] commented 9 hours ago

The file in the Superset codebase that defines the Common Bootstrap Payload is superset/views/base.py.

The path to it is:

superset/views/base.py

Additionally, the common_bootstrap_payload function is also referenced in superset/embedded/view.py [1].

goldjee commented 8 hours ago

@dosu There was a mistake on my side: I was passing the wrong config file when starting Superset. Fixed that. Now I can see that HTML_SANITIZATION_SCHEMA_EXTENSIONS is sent to the frontend correctly.

Anyway, when I pass the config file with

HTML_SANITIZATION = True
HTML_SANITIZATION_SCHEMA_EXTENSIONS = {
    "attributes": {
        "*": ["style", "className"],
    },
    "tagNames": ["style"],
}

the class attribute is blank in the processed HTML template.

My chart config is as follows. HTML:

<ul class="data-list">
  {{#each data}}
    <li>{{descr}}</li>
  {{/each}}
</ul>

CSS:

.data-list {
    background-color: yellow;
    list-style: none;
}

Output HTML in the chart is:

<ul class="">
    <li>Quisque vehicula pharetra velit, vel pharetra erat pellentesque a. Maecenas molestie maximus sem, ac dapibus.</li>
    <li>Aliquam posuere malesuada pharetra. Pellentesque id vulputate elit. Proin tristique fringilla odio vel vehicula. Cras.</li>
</ul>

If I disable sanitization, the CSS is applied properly. This suggests that my HTML_SANITIZATION_SCHEMA_EXTENSIONS is probably missing something.

dosubot[bot] commented 8 hours ago

The issue you're encountering with the class attribute being blank is likely due to the fact that the xss library, which is used for HTML sanitization in Superset, does not recognize the className attribute. Instead, it uses the class attribute.

To resolve this, you should update your HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration to use the class attribute instead of className. Here is the corrected configuration:

HTML_SANITIZATION = True
HTML_SANITIZATION_SCHEMA_EXTENSIONS = {
    "attributes": {
        "*": ["style", "class"],
    },
    "tagNames": ["style"],
}

With this configuration, the class attribute should be preserved in the processed HTML template, allowing your CSS to be applied correctly.

Make sure to restart Superset after making these changes to ensure the new configuration is loaded.

goldjee commented 8 hours ago

@dosu I've added class to the allowed attribute list and restarted Superset. The updated config looks like this:

HTML_SANITIZATION = True
HTML_SANITIZATION_SCHEMA_EXTENSIONS = {
    "attributes": {
        "*": ["style", "className", "class"],
    },
    "tagNames": ["style"],
}

The class attribute in the output HTML is still empty.

dosubot[bot] commented 8 hours ago

The class attribute is still empty in the output HTML because the HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration is not set correctly. The className attribute is not necessary and might be causing issues. You only need to include the class attribute.

Here is the corrected configuration:

HTML_SANITIZATION = True
HTML_SANITIZATION_SCHEMA_EXTENSIONS = {
    "attributes": {
        "*": ["style", "class"],
    },
    "tagNames": ["style"],
}

Make sure this configuration is included in your superset_config.py file and restart Superset to apply the changes [1].

goldjee commented 8 hours ago

@dosu className was there because it appears in other issues and in the official article on Handlebars security in Superset. Anyway, I've tried removing className, and that didn't help.

dosubot[bot] commented 8 hours ago

The class attribute should be preserved if the HTML_SANITIZATION_SCHEMA_EXTENSIONS configuration is set correctly. Given that the class attribute is still empty, it might be due to how the xss library is configured and used in the frontend.

Ensure that the xss library configuration in the frontend allows the class attribute. Here is the relevant code snippet from superset-frontend/packages/superset-ui-core/src/utils/html.tsx:

import { FilterXSS, getDefaultWhiteList } from 'xss';

const xssFilter = new FilterXSS({
  whiteList: {
    ...getDefaultWhiteList(),
    span: ['style', 'class', 'title'],
    div: ['style', 'class'],
    a: ['style', 'class', 'href', 'title', 'target'],
    img: ['style', 'class', 'src', 'alt', 'title', 'width', 'height'],
    video: [
      'autoplay',
      'controls',
      'loop',
      'preload',
      'src',
      'height',
      'width',
      'muted',
    ],
  },
  stripIgnoreTag: true,
  css: false,
});

export function sanitizeHtml(htmlString: string) {
  return xssFilter.process(htmlString);
}

Ensure that the class attribute is included in the whitelist for the relevant HTML tags. If the xss library configuration is correct, the class attribute should be preserved in the sanitized HTML.

Additionally, verify that the sanitizeHtml function is being used correctly in the frontend components where the Handlebars chart is rendered. If the configuration and usage are correct, the class attribute should be preserved.

If the issue persists, you might want to check if there are any other sanitization steps or overrides in the frontend code that might be affecting the class attribute.

goldjee commented 8 hours ago

@dosu Should the xss library be configured somewhere in the config files? I've checked superset-frontend/packages/superset-ui-core/src/utils/html.tsx and its contents match the snippet you provided. The class attribute is there in the whitelist.

dosubot[bot] commented 8 hours ago

The xss library configuration is not set in the config files but directly in the frontend code. Since the class attribute is already included in the whitelist in superset-frontend/packages/superset-ui-core/src/utils/html.tsx, the issue might be elsewhere.

Here are a few additional steps to troubleshoot:

  1. Verify sanitizeHtml Usage: Ensure that the sanitizeHtml function is being used correctly in the frontend components where the Handlebars chart is rendered. Specifically, check if the SafeMarkdown component is used with the correct sanitization options.

  2. Check SafeMarkdown Component: The SafeMarkdown component is responsible for rendering sanitized HTML. Verify that it is configured correctly in the Handlebars chart component:

    if (renderedTemplate) {
      return (
        <SafeMarkdown
          source={renderedTemplate}
          htmlSanitization={htmlSanitization}
          htmlSchemaOverrides={htmlSchemaOverrides}
        />
      );
    }

  3. Inspect Sanitization Overrides: Ensure that the htmlSanitization and htmlSchemaOverrides props are correctly passed to the SafeMarkdown component. These props should be derived from the Redux state:

    function mapStateToProps(state) {
      return {
        htmlSanitization: state.common.conf.HTML_SANITIZATION,
        htmlSchemaOverrides: state.common.conf.HTML_SANITIZATION_SCHEMA_EXTENSIONS,
      };
    }

  4. Debugging in Browser: Use the browser console to inspect the htmlSanitization and htmlSchemaOverrides values being passed to the SafeMarkdown component. Ensure they match the expected configuration.

  5. Check for Other Sanitization Steps: Verify if there are any other sanitization steps or overrides in the frontend code that might be affecting the class attribute. For example, check if the sanitizeHtml function is called multiple times or if there are additional sanitization layers.

By following these steps, you should be able to identify where the class attribute is being stripped and ensure that the configuration is applied correctly [1][2][3].

goldjee commented 8 hours ago

It seems that I've hit a wall with the bot at this point. I would really appreciate it if the community could help me proceed with this issue.