apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

Dashboard export does not export `datasetUuid` in native filter targets but `datasetId` in 4.1 #30424

Open fmannhardt opened 10 hours ago

fmannhardt commented 10 hours ago

Bug description

When exporting any dashboard in 4.0.x (tested with 4.0.2), the native filter configuration is exported as:

[..]
    name: Column desc
    filterType: filter_select
    targets:
    - column:
        name: column_name
      datasetUuid: 7ac464dc-315a-4c9e-b9ec-048d7bbcf382
[..]

When exporting any dashboard in Superset 4.1.0RC2, it is exported as:

    name: Column desc
    filterType: filter_select
    targets:
    - datasetId: 17
      column:
        name: column_name

This works fine when exporting from and re-importing into the same instance, or into an empty Superset instance, but it corrupts the filter settings when datasets already exist, since the wrong dataset (resolved by datasetId) will be used for the filter.
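To illustrate the corruption described above, here is a minimal sketch (the instance maps and dataset names are hypothetical, not Superset's actual data model): a numeric dataset id is only unique within one instance, so importing by id can silently bind the filter to an unrelated dataset.

```python
# Hypothetical illustration: the same numeric id can point at different
# datasets on two instances, while a UUID is globally unique.

source_instance = {17: "sales_orders"}       # dataset id 17 on the exporting instance
target_instance = {17: "marketing_events"}   # id 17 already taken on the importing instance

exported_target = {"datasetId": 17, "column": {"name": "column_name"}}

# Importing by numeric id silently binds the filter to the wrong dataset:
resolved = target_instance[exported_target["datasetId"]]
assert resolved == "marketing_events"  # not "sales_orders" -- the filter is corrupted
```

An export keyed by `datasetUuid` instead would either match the correct dataset or fail loudly, rather than silently picking whichever dataset happens to own that id.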

I did some research, and the main (and only relevant) change is https://github.com/apache/superset/pull/26765. It moved the code responsible for rewriting datasetId to datasetUuid away from the code that actually generates the content and into a path that is only responsible for finding connected objects: https://github.com/apache/superset/blame/2e4f6d3f38404b70f8d0324743c229a4917acaed/superset/commands/dashboard/export.py#L182-L194

I think this logic needs to be reintroduced in the _file_content method. I have already prepared a patch internally and am testing it now. If it works, I will try to provide a PR as soon as possible, since this is a blocker for me to use 4.1.0RC2.

Note that separating content generation from exporting the related objects already introduces some code duplication, so perhaps this should be solved differently rather than by duplicating even more code.
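The rewrite that _file_content would need could be factored out as a small helper so the logic is shared rather than duplicated. Below is a minimal sketch, not the actual patch: `rewrite_native_filter_targets` and `uuid_for_dataset_id` are hypothetical names, with the callback standing in for a `DatasetDAO.find_by_id(...)` lookup.

```python
from typing import Any, Callable, Optional


def rewrite_native_filter_targets(
    payload: dict,
    uuid_for_dataset_id: Callable[[int], Optional[str]],
) -> dict:
    """Replace datasetId with datasetUuid in native filter targets.

    Sketch of the rewrite that would run during content generation;
    uuid_for_dataset_id stands in for a DatasetDAO lookup.
    """
    for native_filter in payload.get("metadata", {}).get(
        "native_filter_configuration", []
    ):
        for target in native_filter.get("targets", []):
            dataset_id = target.pop("datasetId", None)
            if dataset_id is not None:
                uuid = uuid_for_dataset_id(dataset_id)
                if uuid is not None:
                    target["datasetUuid"] = uuid
    return payload


# Usage with a stubbed id-to-uuid mapping:
payload = {
    "metadata": {
        "native_filter_configuration": [
            {"targets": [{"datasetId": 17, "column": {"name": "column_name"}}]}
        ]
    }
}
uuids = {17: "7ac464dc-315a-4c9e-b9ec-048d7bbcf382"}
rewrite_native_filter_targets(payload, uuids.get)
```

A shared helper like this could be called from both the content-generation path and the related-objects path, avoiding the duplication mentioned above.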

Screenshots/recordings

No response

Superset version

master / latest-dev

Python version

Not applicable

Node version

Not applicable

Browser

Not applicable

Additional context

Tested on 4.1.0RC2

Checklist

dosubot[bot] commented 10 hours ago

Hey @fmannhardt! I'm here to help you with any bugs, questions, or contributions you have in mind. Let's work together to resolve the issue you're facing.

The issue you're encountering with the export functionality in Superset 4.1.0RC2, where datasetId is used instead of datasetUuid in native filter targets, is due to changes introduced in pull request #26765. This PR moved the logic for rewriting datasetId to datasetUuid away from the content generation path, leading to potential filter corruption when importing dashboards into instances with existing datasets.

The relevant code snippet that performs this replacement during the export process is:

# Extract all native filter datasets and replace native
# filter dataset references with uuid
for native_filter in payload.get("metadata", {}).get(
    "native_filter_configuration", []
):
    for target in native_filter.get("targets", []):
        dataset_id = target.pop("datasetId", None)
        if dataset_id is not None:
            dataset = DatasetDAO.find_by_id(dataset_id)
            if dataset:
                target["datasetUuid"] = str(dataset.uuid)
                if export_related:
                    yield from ExportDatasetsCommand([dataset_id]).run()

During the import process, the update_id_refs function converts datasetUuid back to datasetId:

# fix native filter references
native_filter_configuration = fixed.get("metadata", {}).get(
    "native_filter_configuration", []
)
for native_filter in native_filter_configuration:
    targets = native_filter.get("targets", [])
    for target in targets:
        dataset_uuid = target.pop("datasetUuid", None)
        if dataset_uuid:
            target["datasetId"] = dataset_info[dataset_uuid]["datasource_id"]
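Putting the two snippets together, the intended round trip can be sketched as follows (the lookup dicts are stubs standing in for `DatasetDAO.find_by_id` on export and `dataset_info` on import): the export side replaces the local id with a UUID, and the import side maps that UUID back to whatever id the dataset has on the importing instance.

```python
# Round-trip sketch with stubbed lookups (not Superset's actual DAOs).
EXPORT_UUIDS = {17: "7ac464dc-315a-4c9e-b9ec-048d7bbcf382"}  # exporting instance
IMPORT_IDS = {"7ac464dc-315a-4c9e-b9ec-048d7bbcf382": 42}    # importing instance

target = {"datasetId": 17, "column": {"name": "column_name"}}

# Export side: datasetId -> datasetUuid
dataset_id = target.pop("datasetId")
target["datasetUuid"] = EXPORT_UUIDS[dataset_id]

# Import side: datasetUuid -> local datasetId
dataset_uuid = target.pop("datasetUuid")
target["datasetId"] = IMPORT_IDS[dataset_uuid]

assert target["datasetId"] == 42  # the dataset's id on the importing instance
```

Without the export-side rewrite (the regression in 4.1.0RC2), the import side receives a bare `datasetId` from the source instance and binds the filter to whichever local dataset happens to have that id.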

Reintroducing the logic for rewriting datasetId to datasetUuid in the _file_content method, as you suggested, should resolve the issue. This approach ensures that the dataset references remain consistent and unique across different instances, preventing filter corruption [1][2].

To continue talking to Dosu, mention @dosu.