Closed by akbertram 2 years ago
Note for implementation, since the audit endpoint is not documented in the API docs.
(Public) API endpoint: /resources/databases/{databaseId}/audit, which accepts a POST request with a JSON payload. Example payloads:
{"resourceFilter":null,"typeFilter":[],"startTime":1634805912268} <- no filter.
{"resourceFilter":null,"typeFilter":["FORM"],"startTime":1634805912268} <- filter by events related to forms.
{"resourceFilter":"ckpodmfkqgm4fmnd","typeFilter":[],"startTime":1634805912268} <- filter by a resource (form or folder) within the database.
The new queryAuditLog() function is ready for testing from branch version-4.20, for example as follows:
To install the test version of the package:
library("remotes")
remotes::install_github("bedatadriven/activityinfo-R", ref = "version-4.20")
To query the audit log:
library("activityinfo")
database.id <- "abcde1234"
# query all events related to records; deletion events are selected further below:
events <- queryAuditLog(database.id, typeFilter = "RECORD")
# by default, a maximum of 100 events is returned, so we keep querying until there are no more events:
r <- events
while (isTRUE(attr(r, "moreEvents"))) {
r <- queryAuditLog(database.id, before = attr(r, "endTime"), typeFilter = "RECORD")
events <- rbind(events, r)
}
# keep only deletion events (which() drops rows where deleted is NA):
events[which(events$deleted == TRUE), ]
It may be useful to incorporate the pagination into the R function. This would allow you to query for all events within a specific time range. The full database log can include tens of thousands of events, especially if there have been imports, so you may not want everything.
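A minimal sketch of how that pagination could be folded into a single call. The wrapper name `collectAuditEvents` and its `fetchPage` and `maxPages` arguments are hypothetical and not part of the package; only the `moreEvents` and `endTime` attributes are taken from the example above. Passing the page-fetching function as an argument keeps the sketch independent of how queryAuditLog() is implemented.

```r
# Hypothetical wrapper (not part of the activityinfo package): calls
# fetchPage(before) until a page no longer carries moreEvents = TRUE,
# then binds all pages into one data.frame.
collectAuditEvents <- function(fetchPage, before = NULL, maxPages = 1000) {
  pages <- list()
  repeat {
    page <- fetchPage(before)
    pages[[length(pages) + 1]] <- page
    # maxPages is a safety cap so a misbehaving endpoint cannot loop forever
    if (!isTRUE(attr(page, "moreEvents")) || length(pages) >= maxPages) break
    # continue from the end of the page just received
    before <- attr(page, "endTime")
  }
  # binding once avoids growing a data.frame inside the loop
  do.call(rbind, pages)
}

# Possible usage with the function from this thread (requires a live database):
# events <- collectAuditEvents(function(before) {
#   if (is.null(before)) queryAuditLog(database.id, typeFilter = "RECORD")
#   else queryAuditLog(database.id, before = before, typeFilter = "RECORD")
# })
```

The cap of 1000 pages is an arbitrary choice for the sketch; with 100 events per page it would stop at 100,000 events.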
My assumption, based on empirical tests, is that the API endpoint returns a maximum of 100 events and that you can only provide a start time, which I interpreted as being the most recent time. In other words, the endpoint will return at most the 100 most recent events up to that start time.
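To make that reading concrete, here is a rough sketch of how a request body could be assembled in base R. The helper name `buildAuditPayload` is hypothetical; the field names come from the example payloads earlier in this thread, and treating startTime as milliseconds since the Unix epoch is an assumption based on the magnitude of the example values.

```r
# Hypothetical helper: builds the JSON body for a POST to the audit endpoint.
# Assumes startTime is expressed in milliseconds since the Unix epoch.
buildAuditPayload <- function(startTime = Sys.time(),
                              typeFilter = character(0),
                              resourceFilter = NULL) {
  # accept POSIXct or Date and convert to whole milliseconds
  seconds <- as.numeric(as.POSIXct(startTime, tz = "UTC"))
  millis <- sprintf("%.0f", round(seconds * 1000))
  # an empty filter must serialize as [] rather than [""]
  types <- if (length(typeFilter) == 0) {
    ""
  } else {
    paste0('"', typeFilter, '"', collapse = ",")
  }
  resource <- if (is.null(resourceFilter)) {
    "null"
  } else {
    paste0('"', resourceFilter, '"')
  }
  sprintf('{"resourceFilter":%s,"typeFilter":[%s],"startTime":%s}',
          resource, types, millis)
}

# Example: only form-related events up to 20 October 2021 (UTC midnight)
buildAuditPayload(as.Date("2021-10-20"), typeFilter = "FORM")
```

The sketch does no JSON escaping, so it is only suitable for simple identifiers such as the resource ID shown above.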
Do you suggest adding an optional after argument, which can be used to pass a timestamp (earlier in time than the before timestamp), and letting the function repeatedly query the endpoint until all events between after and before have been collected?
Yes, that's the idea.
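The range idea could look roughly like the sketch below. It is not the implementation on the branch: the function name `collectAuditRange`, the `fetchPage` argument, and the numeric `time` column are all assumptions made for illustration, as is the reading of the `endTime` attribute as the time of the oldest event in a page.

```r
# Hypothetical sketch of range pagination (not the package implementation).
# Assumes each page is a data.frame with a numeric `time` column (assumed
# column name) and that attr(page, "endTime") is the oldest time in the page.
collectAuditRange <- function(fetchPage, before, after) {
  pages <- list()
  repeat {
    page <- fetchPage(before)
    pages[[length(pages) + 1]] <- page
    before <- attr(page, "endTime")
    # stop when the log is exhausted or this page already reaches `after`
    if (!isTRUE(attr(page, "moreEvents")) || isTRUE(before <= after)) break
  }
  events <- do.call(rbind, pages)
  # the last page may overshoot, so drop events older than `after`
  events[events$time >= after, , drop = FALSE]
}
```

The final filter matters because the endpoint returns whole pages, so the last page fetched will usually contain some events from before the requested range.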
I have updated the version-4.20 branch to implement the range functionality. You can now do something like:
events <- queryAuditLog("{databaseId}", before = as.Date("2021-10-20"), after = as.Date("2021-09-13"), typeFilter = "RECORD")
A few more details in the commit message: 5ead227e664d8b7ade79cacbd593e5637b84e4f8
It should be possible to query a date range of a database's audit log as a data.frame.
/cc @Ryo-N7