Insight-Services-APAC / APAC-Capability-DAI-DbtFabricSparkNb

MIT License

Logging enhancements #50

Closed jp-vanheerden closed 3 months ago

jp-vanheerden commented 4 months ago

The following enhancements to the framework logging would be useful:

  1. Ability to define the name of the lakehouse that the logs are saved to, e.g. LH_Logging instead of LH_Raw
  2. Store the name of the 'project' or master notebook that is running when creating a batch. Note that this is related to Issue #26 (master naming convention)
  3. When getting the batch number for the 'detailed' logging, the lookup needs to assume that other master notebooks could be running at the same time, so it should filter on the master notebook name as well
  4. Datetimes should be stored in a human-readable format

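The requested behaviour could be sketched roughly as follows. This is a hypothetical illustration only: the function name `open_batch`, the `LH_Logging` default, and the record fields are assumptions, not part of the framework:

```python
from datetime import datetime, timezone

# Hypothetical sketch of a batch log record covering the four requests above:
# a configurable log lakehouse, the master notebook name stored with the
# batch, batches keyed per master notebook so concurrent masters do not
# collide, and a human-readable UTC timestamp.
def open_batch(master_notebook: str, log_lakehouse: str = "LH_Logging") -> dict:
    return {
        "log_lakehouse": log_lakehouse,        # request 1: configurable target
        "master_notebook": master_notebook,    # request 2: stored on the batch
        "batch_key": master_notebook,          # request 3: keyed per master
        # request 4: human-readable, UTC
        "start_time": datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S"),
    }

batch = open_batch("nb_master_sales")
print(batch["master_notebook"], batch["start_time"])
```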
grantkriegerai commented 4 months ago

Please also catch up with Charl as he has just completed Delta table logging #27.

cheinamann commented 3 months ago
  1. The logging lakehouse can be set in the profile: "log_lakehouse: LOG LAKEHOUSE HERE"
  2. Both logging tables include the master notebook as a column.
  3. The logging code has been amended to include the master notebook when opening and closing batches.
  4. Datetime is now in a human-readable format, but in UTC. A timezone offset can be applied when writing queries against the data.
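A minimal sketch of the profile entry described in point 1. Only the `log_lakehouse` key comes from this thread; the surrounding structure follows the usual dbt profiles.yml layout and is an assumption here:

```yaml
# Illustrative profile fragment -- only log_lakehouse is confirmed above;
# the project/target/outputs nesting is assumed dbt profile structure.
my_project:
  target: dev
  outputs:
    dev:
      log_lakehouse: LH_Logging   # logs are written here instead of LH_Raw
```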
grantkriegerai commented 3 months ago

Dev completed, but a bug was picked up, so it has been reverted back to Charl.

cheinamann commented 3 months ago

This has been merged to dev

cheinamann commented 3 months ago

We decided not to update the column data type for start_time in the log tables. Unix timestamps are the standard way of capturing datetimes in Delta tables. To convert the unix timestamp, you can use the following code in select statements:

  1. In Spark SQL: from_unixtime(start_time) AS start_time
  2. In T-SQL: DATEADD(SECOND, start_time, '1970-01-01') AS start_time
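Both SQL expressions map a unix epoch value (seconds) to a UTC datetime. The same conversion in Python, useful for sanity-checking query results locally (the helper name is illustrative):

```python
from datetime import datetime, timezone

# Equivalent of from_unixtime(start_time) / DATEADD(SECOND, start_time,
# '1970-01-01'): interpret start_time as seconds since the unix epoch, in UTC.
def unix_to_utc(start_time: int) -> str:
    dt = datetime.fromtimestamp(start_time, tz=timezone.utc)
    return dt.strftime("%Y-%m-%d %H:%M:%S")

print(unix_to_utc(0))           # → 1970-01-01 00:00:00
print(unix_to_utc(1700000000))  # → 2023-11-14 22:13:20
```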