apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

superset cli export_datasource_schema does not exist #30363

Open ilsaloving opened 4 days ago

ilsaloving commented 4 days ago

Bug description

I am trying to create data sources in Superset in an automated way. There does not appear to be a CLI command for adding a data source, but the CLI does support importing definitions from YAML files.

According to this doc page, running superset export_datasource_schema should print a comprehensive list of YAML directives.

As of 4.0.2, this command does not exist:

superset@0ffc84ec25b5:/app$ superset --help
Loading custom config...
Loaded your LOCAL configuration at [/app/superset_config.py]
logging was configured successfully
2024-09-23 19:08:03,111:INFO:superset.utils.logging_configurator:logging was configured successfully
2024-09-23 19:08:03,119:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
/usr/local/lib/python3.10/site-packages/flask_limiter/extension.py:293: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend.
  warnings.warn(
Usage: superset [OPTIONS] COMMAND [ARGS]...

  This is a management script for the Superset application.

Options:
  -e, --env-file FILE   Load environment variables from this file. python-
                        dotenv must be installed.
  -A, --app IMPORT      The Flask application or factory function to load, in
                        the form 'module:name'. Module can be a dotted import
                        or file path. Name is not required if it is 'app',
                        'application', 'create_app', or 'make_app', and can be
                        'name(args)' to pass arguments.
  --debug / --no-debug  Set debug mode.
  --version             Show the Flask version.
  --help                Show this message and exit.

Commands:
  compute-thumbnails              Compute thumbnails
  db                              Perform database migrations.
  export-dashboards               Export dashboards to ZIP file
  export-datasources              Export datasources to ZIP file
  fab                             FAB flask group commands
  import-dashboards               Import dashboards from ZIP file
  import-datasources              Import datasources from ZIP file
  import-directory                Imports configs from a given directory
  init                            Inits the Superset application
  legacy-export-dashboards        Export dashboards to JSON
  legacy-export-datasource-schema
                                  Export datasource YAML schema to stdout
  legacy-export-datasources       Export datasources to YAML
  legacy-import-dashboards        Import dashboards from JSON file
  legacy-import-datasources       Import datasources from YAML
  limiter                         Flask-Limiter maintenance & utility...
  load-examples                   Loads a set of Slices and Dashboards...
  load-test-users                 Loads admin, alpha, and gamma user for...
  migrate-viz                     Migrate a viz from one type to another.
  re-encrypt-secrets
  routes                          Show the routes for the app.
  run                             Run a development server.
  set-database-uri                Updates a database connection URI
  shell                           Run a shell in the app context.
  superset                        This is a management script for the...
  sync-tags                       Rebuilds special tags (owner, type,...
  test-db                         Run a series of tests against an...
  update-api-docs                 Regenerate the openapi.json file in docs
  version                         Prints the current version number

Additionally, if I run superset legacy-export-datasource-schema instead, I get the following error:

superset@0ffc84ec25b5:/app$ superset legacy-export-datasource-schema
Loading custom config...
Loaded your LOCAL configuration at [/app/superset_config.py]
logging was configured successfully
2024-09-23 19:09:09,331:INFO:superset.utils.logging_configurator:logging was configured successfully
2024-09-23 19:09:09,339:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
/usr/local/lib/python3.10/site-packages/flask_limiter/extension.py:293: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend.
  warnings.warn(
Traceback (most recent call last):
  File "/usr/local/bin/superset", line 33, in <module>
    sys.exit(load_entry_point('apache-superset', 'console_scripts', 'superset')())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/flask/cli.py", line 357, in decorator
    return __ctx.invoke(f, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/app/superset/cli/importexport.py", line 404, in legacy_export_datasource_schema
    data = dict_import_export.export_schema_to_dict(back_references=back_references)
  File "/app/superset/utils/dict_import_export.py", line 31, in export_schema_to_dict
    Database.export_schema(recursive=True, include_parent_ref=back_references)
  File "/app/superset/models/helpers.py", line 240, in export_schema
    child_class.export_schema(
  File "/app/superset/models/helpers.py", line 238, in export_schema
    child_class = cls.__mapper__.relationships[column].argument.class_
AttributeError: type object 'SqlMetric' has no attribute 'class_'. Did you mean: '__class__'?
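
One way to read the last frame of the traceback: relationships[column].argument appears to be the SqlMetric class itself, and a plain class has no class_ attribute. The sketch below reproduces that shape with plain Python stand-ins. FakeMapper and FakeRelationship are hypothetical names, not SQLAlchemy or Superset code, and going through mapper.class_ is shown only as one possible direction, not as the project's confirmed fix.

```python
# Stand-ins only: these classes mimic the shape of the objects named in the
# traceback. They are hypothetical and are NOT SQLAlchemy or Superset code.


class SqlMetric:
    """Hypothetical stand-in for the mapped model class."""


class FakeMapper:
    """Hypothetical stand-in for a mapper object that exposes `class_`."""

    def __init__(self, class_):
        self.class_ = class_


class FakeRelationship:
    """Hypothetical stand-in for a relationship property."""

    def __init__(self, argument):
        # `argument` keeps whatever was passed in; here that is the class itself.
        self.argument = argument
        # `mapper` wraps the target and does carry a `class_` attribute.
        self.mapper = FakeMapper(argument)


def child_class_via_argument(relationship):
    # Same attribute chain as the failing line in the traceback.
    return relationship.argument.class_


def child_class_via_mapper(relationship):
    # Assumed alternative: read the target class from the mapper instead.
    return relationship.mapper.class_


rel = FakeRelationship(SqlMetric)

error_message = None
try:
    child_class_via_argument(rel)
except AttributeError as exc:
    # A plain class has no `class_` attribute, so this branch runs.
    error_message = str(exc)

resolved = child_class_via_mapper(rel)
print(error_message)
print(resolved is SqlMetric)
```

Whether the real relationship objects behave like these stand-ins depends on the SQLAlchemy version installed in the container, so treat this as an illustration of the error message rather than a diagnosis.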

How to reproduce the bug

  1. Spin up a superset docker container
  2. Configure it as desired
  3. Run superset legacy-export-datasource-schema to obtain the YAML schema definition, so that a new YAML file can be constructed from it

Screenshots/recordings

No response

Superset version

4.0.2

Python version

Not applicable

Node version

Not applicable

Browser

Not applicable



dosubot[bot] commented 4 days ago

Hey @ilsaloving, I'm here to help you with any bugs, questions, or contributions you have regarding Superset. Let's tackle this together!

To obtain the YAML schema definition for data sources in Superset 4.0.2 using the CLI, you should use the export_datasource_schema command. Here is the command you need to run:

superset export_datasource_schema

This command will export the datasource YAML schema to stdout. If you want to include parent back references, you can add the -b flag:

superset export_datasource_schema -b

Regarding the AttributeError you encountered with the legacy-export-datasource-schema command, it seems to be related to a change in the export_schema_to_dict function in dict_import_export.py. The function was modified to call Database.export_schema with the parameters recursive=True and include_parent_ref=back_references. This change might cause an AttributeError if the Database class does not have an export_schema method.

Here is the relevant code snippet:

def export_schema_to_dict(back_references: bool) -> dict[str, Any]:
    """Exports the supported import/export schema to a dictionary"""
    databases = [
        Database.export_schema(recursive=True, include_parent_ref=back_references)
    ]
    data = {}
    if databases:
        data[DATABASES_KEY] = databases
    return data

If Database does not have an export_schema method, attempting to call it will result in an AttributeError [1][2].

For the SqlMetric class, the class_ attribute is not defined. The SqlMetric class includes attributes such as id, metric_name, verbose_name, metric_type, description, d3format, currency, warning_text, table_id, expression, and extra. It also includes relationships, methods, and properties to handle various functionalities related to SQL metrics [3][4].

To resolve the issue, you may need to ensure that the Database class has the export_schema method implemented or adjust the code to handle the absence of this method.


rusackas commented 4 days ago

It looks like export_datasource_schema has been renamed to legacy_export_datasource_schema - I'm not sure if it's on the path to deprecation or if there's another reason for the rename. @betodealmeida might know.
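
Since a documented command name can drift from what a given install exposes, a quick check is to compare the name against the "Commands:" section of superset --help. This is a minimal sketch with the standard library only. The helper names list_commands and find_candidates are made up for illustration, and SAMPLE_HELP is a shortened excerpt of the help output quoted earlier in this issue.

```python
import difflib
import re


def list_commands(help_text: str) -> list[str]:
    """Return the command names listed under 'Commands:' in Click-style help."""
    commands = []
    in_commands = False
    for line in help_text.splitlines():
        if line.strip() == "Commands:":
            in_commands = True
            continue
        if not in_commands:
            continue
        # Command names are indented by exactly two spaces; wrapped
        # description lines are indented further and are skipped.
        match = re.match(r"^ {2}(\S+)", line)
        if match:
            commands.append(match.group(1))
    return commands


def find_candidates(wanted: str, commands: list[str]) -> list[str]:
    """Suggest installed commands that resemble the documented name."""
    # The help output lists names with hyphens, so normalize underscores.
    normalized = wanted.replace("_", "-")
    substring_hits = [c for c in commands if normalized in c]
    fuzzy_hits = difflib.get_close_matches(normalized, commands, n=3, cutoff=0.6)
    # Substring matches first, then fuzzy matches, without duplicates.
    return list(dict.fromkeys(substring_hits + fuzzy_hits))


# Shortened excerpt of the help output quoted in this issue.
SAMPLE_HELP = """\
Usage: superset [OPTIONS] COMMAND [ARGS]...

Commands:
  export-datasources              Export datasources to ZIP file
  legacy-export-datasource-schema
                                  Export datasource YAML schema to stdout
  legacy-export-datasources       Export datasources to YAML
  version                         Prints the current version number
"""

commands = list_commands(SAMPLE_HELP)
candidates = find_candidates("export_datasource_schema", commands)
print(candidates)
```

Against a live install, the help text could come from subprocess.run(["superset", "--help"], capture_output=True, text=True).stdout. The parser ignores everything before the "Commands:" line, so the startup log lines shown in this issue should not interfere.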

You may also want to take a look at the Preset CLI (not maintained by the Superset project, but it is built to be compatible with Superset). https://preset.io/blog/version-control-superset-charts-dashboards-superset/

We probably need to update our docs to fix the export_datasource_schema reference, but I'd like to know more before making a change myself. PRs welcome, of course :D