ddbnl / office365-audit-log-collector

Collect / retrieve Office365, AzureAD and DLP audit logs and output to PRTG, Azure Log Analytics Workspace, SQL, Graylog, Fluentd, and/or file output.
https://ddbnl.github.io/office365-audit-log-collector/
MIT License

Memory Leak? #51

Closed: SysAdminSmith closed this issue 6 months ago

SysAdminSmith commented 11 months ago

Good morning and thank you so much for putting this program together!

I run office365-audit-log-collector in an LXC container, which I generally give about 1 GB of RAM. However, the collector quickly hangs and locks up the container, with memory pegged at 1 GB. When I increase the limit, it pegs the RAM again; raising it further gives the same result. So the amount of memory doesn't seem to be the issue: the log collector will take whatever you give it and then lock up the container.

Is there anything I can do to prevent this behavior?

Here is the relevant part of my config. Thank you!

log:  # Log settings. Debug will severely decrease performance
  path: '/var/log/officecollector/collector.log'
  debug: False
collect:  # Settings determining which audit logs to collect and how to do it
  contentTypes:
    Audit.General: True
    Audit.AzureActiveDirectory: True
    Audit.Exchange: True
    Audit.SharePoint: True
    DLP.All: True
  rustEngine: True  # Use False to revert to the old Python engine. If running from python instead of executable, make sure to install the python wheel in the RustEngineWheels folder
#  schedule: 0 0 10  # How often to run in days/hours/minutes. Delete this line to just run once and exit.
  maxThreads: 50  # Maximum number of simultaneous threads retrieving logs
  retries: 3  # Times to retry retrieving a content blob if it fails
  retryCooldown: 30  # Seconds to wait before retrying retrieving a content blob
  autoSubscribe: True  # Automatically subscribe to collected content types. Never unsubscribes from anything.
  skipKnownLogs: True  # Remember retrieved log ID's, don't collect them twice
  resume: False  # Remember last run time, resume collecting from there next run
  hoursToCollect: 3  #Look back this many hours for audit logs (can be overwritten by resume)
filter:  # Only logs that match ALL filters for a content type are collected. Leave empty to collect all
  Audit.General:
  Audit.AzureActiveDirectory:
  Audit.Exchange:
  Audit.SharePoint:
  DLP.All:

I run via crontab: */10 * * * * /root/officeauditlogcollector/officecollector.sh

ddbnl commented 6 months ago

Sorry for the late reply; due to my day job I was unable to work on the repo for a while.

Are you outputting the logs anywhere or just trying to run the collector to test?

Some of the output modules have a cache size config option that determines how many logs to batch in memory before offloading them to the output. Without any outputs, the collector will just keep collecting, so memory usage keeps growing. I think the cache size option should be moved to the general settings so that it applies in all cases.
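
To illustrate the batching idea described above, here is a minimal Python sketch. It is not the collector's actual code: the names `BatchCache`, `cache_size` and `flush` are hypothetical and chosen only for this example. The point is that memory stays bounded when the cache is emptied every time it reaches its limit, whether or not a real output is attached.

```python
from typing import Callable, List


class BatchCache:
    """Hold at most `cache_size` logs in memory, then hand them off.

    Hypothetical helper for illustration; not part of the collector.
    """

    def __init__(self, cache_size: int, flush: Callable[[List[dict]], None]) -> None:
        if cache_size < 1:
            raise ValueError("cache_size must be at least 1")
        self.cache_size = cache_size
        self._flush = flush          # called with each full batch
        self._logs: List[dict] = []  # in-memory buffer, never exceeds cache_size

    def add(self, log: dict) -> None:
        """Buffer one log and offload the batch once the limit is reached."""
        self._logs.append(log)
        if len(self._logs) >= self.cache_size:
            self.flush()

    def flush(self) -> None:
        """Hand the buffered logs to the output and empty the buffer."""
        if self._logs:
            batch, self._logs = self._logs, []
            self._flush(batch)


def discard(batch: List[dict]) -> None:
    """Stand-in for the case where no output is configured: drop the batch."""
```

With a global limit like this, a run without outputs would use something like `BatchCache(cache_size, discard)`, so the buffer is still emptied at every `cache_size` logs instead of growing for the whole run.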

I will update here when the new version is available.

ddbnl commented 6 months ago

A new version is available that should have fixed this issue (cache size is now a global parameter, therefore memory usage is always limited).

Note that to reduce the number of bugs and improve performance, the latest version has been fully rewritten in Rust, and as a result there are some small breaking changes (command line args). Check the readme for the correct syntax. It is also recommended to run the tool using the container that has been made available; see the repo readme for instructions.

If instead you want to keep using the binary, a new version is available here: https://github.com/ddbnl/office365-audit-log-collector/releases/tag/v2.2