Azure-Samples / modern-data-warehouse-dataops

DataOps for the Modern Data Warehouse on Microsoft Azure. https://aka.ms/mdw-dataops.
MIT License

Define configuration file structure #579

Closed ydaponte closed 1 year ago

ydaponte commented 1 year ago

Parent #578

Discuss and agree on a file format and structure to hold the following configuration information:

  1. Data retention setting
  2. Data ingestion information (last ingested, last modified, ...)
  3. Security information for ACL automation
  4. [Stretch] - Security information for the columns

In addition, the file format and the location where the file is stored need to be agreed on as well.

Success criteria:

ydaponte commented 1 year ago

Current proposal: a JSON file stored in a config container, with the following format:

    {
        "datalakeProperties": [
            {
                "year": "2019",
                "month": "09",
                "filename": "yellow_tripdata_2019-09.parquet",
                "aclPermissions": [
                    {
                        "type": "read",
                        "userOrGroup": "aadgr<project_name><deployment_id>"
                    },
                    {
                        "type": "execute",
                        "userOrGroup": "aadgr<project_name><deployment_id>"
                    }
                ],
                "created": "",
                "lastUpdated": "",
                "retention": ""
            },
            {
                "year": "2019",
                "month": "10",
                "filename": "yellow_tripdata_2019-10.parquet",
                "aclPermissions": [
                    {
                        "type": "read",
                        "userOrGroup": "aadgr<project_name><deployment_id>"
                    },
                    {
                        "type": "execute",
                        "userOrGroup": "aadgr<project_name><deployment_id>"
                    }
                ],
                "created": "",
                "lastUpdated": "",
                "retention": ""
            }
        ]
    }
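As a rough illustration of how the proposed structure could be consumed for ACL automation, here is a minimal Python sketch. It assumes the file has already been read from the config container into a string; the `SAMPLE` text, the `example-group` name, and the `acl_by_file` helper are hypothetical and not part of the proposal.

```python
import json

# Hypothetical sample mirroring the proposed structure; "example-group" is a placeholder.
SAMPLE = """
{
  "datalakeProperties": [
    {
      "year": "2019",
      "month": "09",
      "filename": "yellow_tripdata_2019-09.parquet",
      "aclPermissions": [
        {"type": "read", "userOrGroup": "example-group"},
        {"type": "execute", "userOrGroup": "example-group"}
      ],
      "created": "",
      "lastUpdated": "",
      "retention": ""
    }
  ]
}
"""


def acl_by_file(config: dict) -> dict:
    """Map each filename to {userOrGroup: sorted list of permission types}."""
    result = {}
    for entry in config["datalakeProperties"]:
        grants = {}
        for perm in entry.get("aclPermissions", []):
            grants.setdefault(perm["userOrGroup"], []).append(perm["type"])
        result[entry["filename"]] = {k: sorted(v) for k, v in grants.items()}
    return result


config = json.loads(SAMPLE)
acls = acl_by_file(config)
print(acls)
```

Grouping the permission types per user or group keeps one ACL entry per principal, which is one possible shape for whatever applies the ACLs to the data lake.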

ydaponte commented 1 year ago

The configuration file structure is currently as shown below. It may change as we progress through the hack:

    {
        "datalakeProperties": [
            {
                "year": "2019",
                "month": "08",
                "filename": "yellow_tripdata_2019-08.parquet",
                "aclPermissions": [
                    {
                        "type": "read",
                        "userOrGroup": "aadgrhddep1"
                    },
                    {
                        "type": "execute",
                        "userOrGroup": "aadgrhddep1"
                    }
                ],
                "created": "2019-08-01T00:00:00.0000000Z",
                "lastUpdatedSourceSystem": "2019-08-01T00:00:00.0000000Z",
                "lastUpdatedDatalake": ""
            },
            {
                "year": "2019",
                "month": "09",
                "filename": "yellow_tripdata_2019-09.parquet",
                "aclPermissions": [
                    {
                        "type": "read",
                        "userOrGroup": "aadgrhddep1"
                    },
                    {
                        "type": "execute",
                        "userOrGroup": "aadgrhddep1"
                    }
                ],
                "created": "2019-09-01T00:00:00.0000000Z",
                "lastUpdatedSourceSystem": "2019-09-30T00:00:00.0000000Z",
                "lastUpdatedDatalake": ""
            },
            {
                "year": "2019",
                "month": "10",
                "filename": "yellow_tripdata_2019-10.parquet",
                "aclPermissions": [
                    {
                        "type": "read",
                        "userOrGroup": "aadgrhddep1"
                    },
                    {
                        "type": "execute",
                        "userOrGroup": "aadgrhddep1"
                    }
                ],
                "created": "2019-10-01T00:00:00.0000000Z",
                "lastUpdatedSourceSystem": "2019-10-30T00:00:00.0000000Z",
                "lastUpdatedDatalake": "2019-10-15T00:00:00.0000000Z"
            }
        ]
    }
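The thread does not say how `lastUpdatedSourceSystem` and `lastUpdatedDatalake` are meant to be used. One plausible reading, shown here purely as an assumption, is that a file needs (re-)ingestion when the data lake timestamp is empty or older than the source system timestamp. The `parse_ts` and `needs_ingestion` helpers are hypothetical names for this sketch.

```python
from datetime import datetime, timezone
from typing import Optional

# Timestamps copied from the structure above; other fields are omitted for brevity.
ENTRIES = [
    {"filename": "yellow_tripdata_2019-08.parquet",
     "lastUpdatedSourceSystem": "2019-08-01T00:00:00.0000000Z",
     "lastUpdatedDatalake": ""},
    {"filename": "yellow_tripdata_2019-10.parquet",
     "lastUpdatedSourceSystem": "2019-10-30T00:00:00.0000000Z",
     "lastUpdatedDatalake": "2019-10-15T00:00:00.0000000Z"},
]


def parse_ts(value: str) -> Optional[datetime]:
    """Parse a timestamp such as 2019-08-01T00:00:00.0000000Z; '' means 'never'."""
    if not value:
        return None
    base, _, frac = value.rstrip("Z").partition(".")
    micro = (frac + "000000")[:6]  # strptime's %f accepts at most 6 digits
    parsed = datetime.strptime(f"{base}.{micro}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def needs_ingestion(entry: dict) -> bool:
    """Assumed rule: ingest when the lake copy is missing or older than the source."""
    source = parse_ts(entry.get("lastUpdatedSourceSystem", ""))
    lake = parse_ts(entry.get("lastUpdatedDatalake", ""))
    if source is None:
        return False  # nothing known about the source, so nothing to compare
    return lake is None or lake < source


stale = [e["filename"] for e in ENTRIES if needs_ingestion(e)]
print(stale)
```

The timestamps carry seven fractional digits, which Python's `%f` directive does not accept, so the sketch truncates the fraction to microseconds before parsing.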