data-dot-all / dataall

A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
https://data-dot-all.github.io/dataall/
Apache License 2.0

Simplify Classification Config and Add more customizability into it #1261

Open TejasRGitHub opened 3 months ago

TejasRGitHub commented 3 months ago

Is your idea related to a problem? Please describe. Issue https://github.com/data-dot-all/dataall/issues/1032 introduced custom confidentiality levels, which map to the pre-existing data.all levels. In addition, the classification and auto-approval issue https://github.com/data-dot-all/dataall/issues/1221 added another config, auto_approval_for_confidentiality_level.

Apart from that, table metadata is currently shown to anyone if the dataset is Unclassified. For datasets that are Official or Secret, table metadata is shown only to users who are owners or who have shares on the dataset. We want this to be configurable through config.json.

The same argument extends to dataset preview and to showing metrics to the user.

With all these config additions, the dataset section of config.json is getting messy, especially the settings tied to classification.

Describe the solution you'd like A top-level confidentiality config could be introduced under datasets and expanded like this:

"confidentiality": {
    "dropdown_enabled": true,
    "levels": [
        {
            "name": "Secret",
            "auto_approval_for_confidentiality_level": false,
            "show_table_metadata": false,
            "preview_table_data": false
        },
        {
            "name": "Official",
            "auto_approval_for_confidentiality_level": false,
            "show_table_metadata": true,
            "preview_table_data": true
        },
        {
            "name": "Unclassified",
            "auto_approval_for_confidentiality_level": true,
            "show_table_metadata": true,
            "preview_table_data": true
        }
    ]
}
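
To illustrate how the backend might consume this structure, here is a minimal Python sketch. It assumes the block above is wrapped in a JSON object, and the helper names (load_levels, can_view_table_metadata) are hypothetical, not existing data.all functions. It also assumes that owners and users with shares always see table metadata, and that an unknown level is denied by default.

```python
import json

# Hypothetical config fragment mirroring the proposed "confidentiality" block.
CONFIG_JSON = """
{
  "confidentiality": {
    "dropdown_enabled": true,
    "levels": [
      {"name": "Secret", "auto_approval_for_confidentiality_level": false,
       "show_table_metadata": false, "preview_table_data": false},
      {"name": "Official", "auto_approval_for_confidentiality_level": false,
       "show_table_metadata": true, "preview_table_data": true},
      {"name": "Unclassified", "auto_approval_for_confidentiality_level": true,
       "show_table_metadata": true, "preview_table_data": true}
    ]
  }
}
"""


def load_levels(raw_config):
    """Index the per-level settings by level name for quick lookup."""
    config = json.loads(raw_config)
    return {level["name"]: level for level in config["confidentiality"]["levels"]}


def can_view_table_metadata(levels, confidentiality, is_owner_or_has_share):
    """Decide whether table metadata is visible to a user.

    Assumption: owners and users with shares always see metadata;
    everyone else depends on the level's show_table_metadata flag.
    """
    if is_owner_or_has_share:
        return True
    level = levels.get(confidentiality)
    # Assumption: an unknown level is treated as not visible.
    return bool(level and level.get("show_table_metadata", False))


levels = load_levels(CONFIG_JSON)
print(can_view_table_metadata(levels, "Official", False))
print(can_view_table_metadata(levels, "Secret", False))
```

The same lookup pattern would apply to preview_table_data, so each new per-level flag stays a one-line check rather than a separate top-level config.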

Thanks @dlpzx for the suggestion

dlpzx commented 2 months ago

The config.json in v2.6 is going to see some changes related to the ongoing refactoring of datasets and dataset_sharing. I would rather not make it harder for users to upgrade, so I would move this item to 2.7. It is a quick improvement :)

noah-paige commented 2 months ago

This should also handle backend input validation for auto-approval, depending on the confidentiality level provided (#1221).
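
A rough sketch of what such a backend check could look like, assuming the per-level structure proposed above. The function and exception names are hypothetical, and the rule itself (reject an auto-approval request when the level's auto_approval_for_confidentiality_level flag is false) is an assumption about the intended behaviour, not something confirmed in #1221.

```python
# Hypothetical per-level settings, keyed by level name as in the proposal above.
LEVELS = {
    "Secret": {"auto_approval_for_confidentiality_level": False},
    "Official": {"auto_approval_for_confidentiality_level": False},
    "Unclassified": {"auto_approval_for_confidentiality_level": True},
}


class AutoApprovalNotAllowed(ValueError):
    """Raised when auto-approval is requested for a level that forbids it."""


def validate_auto_approval(levels, confidentiality, auto_approval_requested):
    """Validate a dataset's auto-approval setting against its confidentiality level.

    Assumption: disabling auto-approval is always allowed; enabling it is
    allowed only when the level's config flag is true.
    """
    level = levels.get(confidentiality)
    if level is None:
        raise ValueError(f"Unknown confidentiality level: {confidentiality}")
    allowed = level.get("auto_approval_for_confidentiality_level", False)
    if auto_approval_requested and not allowed:
        raise AutoApprovalNotAllowed(
            f"Auto-approval is not allowed for confidentiality level {confidentiality}"
        )


validate_auto_approval(LEVELS, "Unclassified", True)  # passes silently
try:
    validate_auto_approval(LEVELS, "Secret", True)
except AutoApprovalNotAllowed as err:
    print(err)
```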