ddbnl / office365-audit-log-collector

Collect / retrieve Office365, AzureAD and DLP audit logs and output to PRTG, Azure Log Analytics Workspace, SQL, Graylog, Fluentd, and/or file output.
https://ddbnl.github.io/office365-audit-log-collector/
MIT License

Fix resume #20

Closed owentl closed 2 years ago

owentl commented 2 years ago

Use the correct time when persisting the time of the last request. Previously, start_time was persisted, so every run kept pulling from the old time. If there are no errors, we want to persist the current run time, not the original start time (whether provided or persisted to disk).
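To illustrate the difference, here is a minimal sketch that replays a few runs under both behaviours. The helpers `persisted_time` and `simulate` are hypothetical names made up for this example and are not part of the collector; only the timestamp format is taken from the project.

```python
import datetime

FMT = "%Y-%m-%dT%H:%M:%SZ"


def persisted_time(start_time, run_time, use_run_time):
    """Return the timestamp string that would be written to disk after a successful run."""
    chosen = run_time if use_run_time else start_time
    return chosen.strftime(FMT)


def simulate(first_start, run_times, use_run_time):
    """Replay several runs; each run starts from whatever the previous run persisted."""
    start = first_start
    windows = []
    for run_time in run_times:
        # The window this run would collect: (start, time of the run).
        windows.append((start.strftime(FMT), run_time.strftime(FMT)))
        # Read the persisted value back, as the next run would.
        start = datetime.datetime.strptime(
            persisted_time(start, run_time, use_run_time), FMT
        ).replace(tzinfo=datetime.timezone.utc)
    return windows


if __name__ == "__main__":
    first = datetime.datetime(2022, 1, 1, tzinfo=datetime.timezone.utc)
    runs = [first + datetime.timedelta(hours=h) for h in (1, 2, 3)]
    print(simulate(first, runs, use_run_time=False))  # start never advances
    print(simulate(first, runs, use_run_time=True))   # start follows the last run
```

When start_time is persisted, every window begins at the same original start, so the same logs are requested again on each run. When the run time is persisted, each window begins where the previous one ended.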

ddbnl commented 2 years ago

You're right @owentl, thanks for opening the pull request.

I happened to rewrite the function that sets the last run time today (to support >24 hour collection time spans), and it presented a nice opportunity to fix this. It now sets the last run time to the end time used in the collection URLs, so the next run starts collecting where the last run stopped. This should've been the case earlier but, as you noted, it wasn't.
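As a rough sketch of the idea (not the code from the commit): a span longer than 24 hours can be cut into consecutive windows, and the end of the final window is what gets persisted. The names `split_time_span`, `last_run_time` and `MAX_SPAN` are hypothetical, and the 24-hour limit per request is an assumption made for illustration.

```python
import datetime

FMT = "%Y-%m-%dT%H:%M:%SZ"
MAX_SPAN = datetime.timedelta(hours=24)  # assumed maximum span of a single request


def split_time_span(start_time, end_time, max_span=MAX_SPAN):
    """Split [start_time, end_time] into consecutive windows no longer than max_span."""
    if end_time <= start_time:
        return []
    windows = []
    window_start = start_time
    while window_start < end_time:
        window_end = min(window_start + max_span, end_time)
        windows.append((window_start, window_end))
        # The next window begins exactly where this one ends, leaving no gap.
        window_start = window_end
    return windows


def last_run_time(windows):
    """Value to persist: the end of the final window, as a formatted string."""
    return windows[-1][1].strftime(FMT)


if __name__ == "__main__":
    start = datetime.datetime(2022, 1, 1, tzinfo=datetime.timezone.utc)
    end = datetime.datetime(2022, 1, 3, 6, tzinfo=datetime.timezone.utc)
    for window_start, window_end in split_time_span(start, end):
        print(window_start.strftime(FMT), "->", window_end.strftime(FMT))
    print("persist:", last_run_time(split_time_span(start, end)))
```

Because the persisted value equals the end of the last window, the next run's first window starts exactly there, with no overlap and no gap.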

In this case I won't integrate the pull request because it should be fixed in this commit. Could you take a second look for me and let me know if you agree? Thanks! Relevant code below:

def _get_all_available_content(self):

    # Capture the end time once, up front; it is persisted below as the last run time.
    end_time = datetime.datetime.now(datetime.timezone.utc)

    for content_type in self._remaining_content_types.copy():
        [...]
        self._last_run_times[content_type] = end_time.strftime("%Y-%m-%dT%H:%M:%SZ")

ddbnl commented 2 years ago

Closing this PR as it should be fixed. Starting work on integrating #21.