ctsit / redcapcustodian

Simplified, automated data management on REDCap systems

Logging does not log the project nor the project instance #159

Closed pbchase closed 2 weeks ago

pbchase commented 1 month ago

Log records have no way to identify the project or the instance of the project that wrote them. Without a project identifier, it's difficult to query all of a project's log records and impossible to guarantee that a query is not returning data from an unintended project. If two projects use the same script name, their log records would be indistinguishable.

Without an instance identifier, a second instance of a project's scripts would be indistinguishable from the first.

We could address these issues by adding a project column and an instance column to the rcc_job_log data model, setting environment variables with standard names, reading those variables from the environment, and writing their values on each log record. Such code would need to behave gracefully if the values are absent, writing NULL values in those columns as log records are written.

The standard names in the environment should be INSTANCE and PROJECT.
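
A minimal sketch of the graceful-fallback behavior described above, written in Python purely for illustration. Only the environment variable names PROJECT and INSTANCE and the table name rcc_job_log come from this issue. The column layout, the helper names, the in-memory SQLite database, and the choice to treat an empty string as absent are all assumptions made for the example, not the package's actual data model or API.

```python
import os
import sqlite3
from typing import Optional


def get_env_or_none(name: str) -> Optional[str]:
    """Return the environment variable's value, or None if unset or empty."""
    value = os.environ.get(name, "")
    return value if value else None


def create_log_table(conn: sqlite3.Connection) -> None:
    # Simplified stand-in for rcc_job_log; the real table has other columns.
    conn.execute(
        "CREATE TABLE rcc_job_log ("
        "project TEXT NULL, instance TEXT NULL, "
        "script_name TEXT NOT NULL, message TEXT)"
    )


def build_log_record(script_name: str, message: str) -> dict:
    # Read PROJECT and INSTANCE at write time; a missing value becomes None.
    return {
        "project": get_env_or_none("PROJECT"),
        "instance": get_env_or_none("INSTANCE"),
        "script_name": script_name,
        "message": message,
    }


def write_log_record(conn: sqlite3.Connection, record: dict) -> None:
    # sqlite3 stores Python None as SQL NULL, so absent values need no special case.
    conn.execute(
        "INSERT INTO rcc_job_log (project, instance, script_name, message) "
        "VALUES (:project, :instance, :script_name, :message)",
        record,
    )
```

With neither variable set, both columns are stored as NULL, so existing scripts would keep logging unchanged. Once the variables are populated, a query can filter on project and instance to separate records from scripts that share a name.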

pbchase commented 1 month ago

This would require changes in

We probably need to write

We'd call the new setters in init_etl and the getters in build_etl_job_log_df.

pbchase commented 2 weeks ago

Addressed by PR #159