imperva / incapsula-logs-downloader

A Python script for downloading log files from Incapsula
MIT License

Delay in actual start time in log and when the log was received. #64

Closed browneyedleagh closed 7 months ago

browneyedleagh commented 11 months ago

I had an issue similar to this with an older version of this script. I recently downloaded incapsula-logs-downloader-release-3.0.0-beta for Imperva Attack Analytics. The script is working, but the start time in the log is nowhere close to the time we received the log. Example: I received a log at 10:12 am, but the log's start=1697541196914 (6:13 am), which is a 4-hour difference. Is there any way to resolve this?

joeymoore commented 11 months ago

Sorry @browneyedleagh, I'm not sure that this is a downloader issue as much as how long the AA log takes to populate and/or what the max time/size on the log is set to. I will ask the PM and see if we can get some more details around this. Do you have a lot of events in the log, or is it rather small?

G4fanhoto commented 11 months ago

I also have the same problem: I'm always 2 days behind the current log, and I'm running the new script.

browneyedleagh commented 11 months ago

Thanks for the response. The logs are about 800 bytes.

joeymoore commented 11 months ago

Thanks @browneyedleagh. I reached out to the PM, and we would be concerned if there is a 4-hour delay in the logs; however, they would like to confirm that this is not a timezone issue. The logs are in UTC, and we are wondering whether this offset has been accounted for. Please let me know, and I will convey your message to the PM and have them investigate if there is still an issue.
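
To rule out a timezone mix-up, it can help to convert the epoch-millisecond start value to UTC explicitly before comparing it with the time the log was received. The following is a minimal sketch, not part of the downloader; the helper names start_to_utc and delivery_delay are hypothetical, and it assumes start is milliseconds since the Unix epoch, as in the value quoted earlier in this thread.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def start_to_utc(start_ms: int) -> datetime:
    """Convert an epoch-milliseconds start value to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=start_ms)


def delivery_delay(start_ms: int, received: datetime) -> timedelta:
    """Return how long after start the log was received.

    received must be timezone-aware; naive datetimes are rejected so the
    local-vs-UTC offset cannot be silently ignored.
    """
    if received.tzinfo is None:
        raise ValueError("received must be timezone-aware")
    return received - start_to_utc(start_ms)


start = start_to_utc(1697541196914)
print(start.isoformat())   # 2023-10-17T11:13:16.914000+00:00
print(start.astimezone())  # same instant in this machine's local timezone
```

Passing the receive time as an aware datetime (for example, built with your own UTC offset) makes the comparison independent of where the SIEM or the reader is located.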

browneyedleagh commented 11 months ago

Ahhhhh… that would be our issue I believe! Thank you for the clarification.

mzairza commented 8 months ago

"Ahhhhh… that would be our issue I believe! Thank you for the clarification" (quoting @browneyedleagh)

May I know how you solved your issue? I have the same problem here. Logs arriving at the SIEM seem to have a 24-hour delay. I'm using the latest log downloader script.

amirizzatmohdunzir commented 6 months ago

Dear @joeymoore and all, greetings.

We are currently facing this issue too: we receive the logs only about 12 hours after the actual start time. May I know if anyone has managed to find a solution for this?

joeymoore commented 6 months ago

@amirizzatmohdunzir with certain high-volume customers, we have seen that the downloading process is not working fast enough. To speed it up, we make a small change in the LogsDownloader.py file at line 65, from pool = ThreadPool() to pool = ThreadPool(processes=16). The default number of processes is based on the processor, and this "limits" the number of downloader threads that can be spun off. You can increase this gradually from 8 to N, or test with 16 and see if the time catches up.
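
As a rough illustration of why the pool size matters, here is a minimal standalone sketch, not the actual LogsDownloader.py code. The function download_one and the file names are hypothetical stand-ins, and the sleep only simulates network latency. It uses multiprocessing.pool.ThreadPool, where processes=None falls back to the machine's CPU count.

```python
import time
from multiprocessing.pool import ThreadPool


def download_one(name: str) -> str:
    """Hypothetical stand-in for downloading a single log file."""
    time.sleep(0.05)  # simulate network latency
    return name


def download_all(names, processes=None):
    """Download every file using a pool with `processes` worker threads.

    processes=None mirrors ThreadPool(): the pool size falls back to the
    CPU count of the machine.
    """
    with ThreadPool(processes=processes) as pool:
        return pool.map(download_one, names)


names = [f"{i}.log" for i in range(32)]
for workers in (None, 16):
    began = time.perf_counter()
    download_all(names, processes=workers)
    print(f"processes={workers}: {time.perf_counter() - began:.2f}s")
```

On a machine with few cores, the default pool should take noticeably longer than the 16-thread pool, because the work is I/O-bound rather than CPU-bound.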

Additionally, you can look at where the logs are saved on download and confirm that there are no logs in the dir. Logs should come and go depending on your archiving configuration.
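
One way to check this is to list what is currently sitting in the download directory along with how old each file is. This is only a sketch: the helper name pending_logs is hypothetical, and it assumes that files lingering in the directory indicate a backlog under your archiving configuration.

```python
import time
from pathlib import Path


def pending_logs(directory):
    """Return (file name, age in seconds) for each file in `directory`, oldest first."""
    now = time.time()
    entries = [
        (path.name, now - path.stat().st_mtime)
        for path in Path(directory).iterdir()
        if path.is_file()
    ]
    # Largest age first, so the oldest lingering file is at the top.
    return sorted(entries, key=lambda item: item[1], reverse=True)
```

Calling pending_logs on the directory you configured for saved logs and printing the first few entries shows whether old files are piling up.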

amirizzatmohdunzir commented 6 months ago

@joeymoore big thanks for the fast reply. It turns out the version that we are using now (2.2.0) does not have the line 65 mentioned in the LogsDownloader.py file. We'll try upgrading to 2.4.0 first and apply the change as needed.

Off topic: is there any documentation that I can refer to on how to perform an upgrade?

joeymoore commented 6 months ago

Sorry @amirizzatmohdunzir, I was under the impression that you were running the latest, although the versioning is not easy to track. If you clone the repo or pull the latest, you have the latest version, and the only doc difference is the configuration naming convention, where we prepended everything with IMPERVA_. That being said, I'd recommend spinning up another instance with the new script. There will be duplicate SIEM entries until the new script catches up to the old one, and after that keeping up will not be an issue.

Additionally, you can look at my github profile and find my email address to email me directly for help.

amirizzatmohdunzir commented 6 months ago

Yes @joeymoore, that's what we opted for. We just built a new one instead of upgrading. Thanks a lot for the insight!