duosecurity / duo_log_sync

MIT License
37 stars 27 forks source link

Mintime errors #36

Open Voilett-Roze opened 1 year ago

Voilett-Roze commented 1 year ago

Hi Im receiving errors with the trustmonitor, auth and activity logs.

2023-04-22 20:59:55 WARNING DuoLogSync: Shutting down due to [trustmonitor producer: [Received 400 Invalid request parameters (mintime must be within the past 180 days)]] 2023-04-22 20:59:55 WARNING DuoLogSync: Shutting down due to [activity producer: [Received 400 Invalid request parameters ('mintime' must be a timestamp in milliseconds)]] 2023-04-22 20:59:54 WARNING DuoLogSync: Shutting down due to [auth producer: [Received 400 Invalid request parameters ('mintime' must be a timestamp in milliseconds)]]

Am receiving queued logs admin action and telephony. Was receiving logs successfully until i updated the the code to the recently updated ones. 2023-04-22 20:59:55 ERROR DuoLogSync: check that the duoclient ikey and skey in the config file are correct

Have tried converting past 180 days to milliseconds to set mintime, with no success.

giveen commented 1 year ago

Same here

MarkTripod-Duo commented 1 year ago

I have seen instances in the past where certain conditions can cause the translation of an offset value of 180 in the config.yml to exceed the maximum allowed by the API. Do you still see the errors if you set the offset to 179?

giveen commented 1 year ago

I have seen instances in the past where certain conditions can cause the translation of an offset value of 180 in the config.yml to exceed the maximum allowed by the API. Do you still see the errors if you set the offset to 179?

My offset is set to 1, because i only pull 1 day of logs in at a time.

It looks to be a change in the API

https://github.com/mbegan/Duo-PSModule/issues/46

MarkTripod-Duo commented 1 year ago

Nothing has changed in the API. The version 2 log endpoints require timestamps in milliseconds. This is a departure from the original version 1 log endpoints expecting a timestamp in seconds. The Duo Log Sync program takes this into account based upon the - endpoints: [] configuration in the YAML file.

giveen commented 1 year ago

Nothing has changed in the API. The version 2 log endpoints require timestamps in milliseconds. This is a departure from the original version 1 log endpoints expecting a timestamp in seconds. The Duo Log Sync program takes this into account based upon the - endpoints: [] configuration in the YAML file.

Please give an example on what to set, because its not working. Multiple people are having an issue after a recent change on Duo side, before that it was working correctly.

MarkTripod-Duo commented 1 year ago
version: "1.0.0"

dls_settings:
  log_filepath: "/tmp/duologsync.log"
  log_format: "JSON"

  api:
    offset: 179
    timeout: 180

  checkpointing:
    enabled: True
    directory: "/tmp"

servers:
  - id: "test"
    hostname: "127.0.0.1"
    port: 514
    protocol: "TCP"

account:
  ikey: "admin-api-ikey"
  skey: "admin-api-skey"
  hostname: "host.name.com"

  endpoint_server_mappings:
    - endpoints:
        ["auth", "telephony", "trustmonitor"]
      server: "test"
  is_msp: False

The configuration above (after having the account information updated) would collect the authentication logs. telephony logs and Trust Monitor logs from the Duo Admin API. On first run it would look back 179 days and then create a checkpoint file in /tmp for tracking how far back to look for each successive polling of the API log endpoints.

giveen commented 1 year ago

Literally what I have, still same error. its interesting how you say "there was no change", but at the same time you say "well logs require a different timestamp now", thats a change.

version: '1.0.0'

dls_settings:
  log_filepath: '/var/log/duologsync/duologsync.log'
  log_format: 'JSON'

  api:
    offset: 1
    timeout: 120

servers:
  - id: 'Graylog'
    hostname: xx.xx.xx.xx
    port: 1514
    protocol: 'UDP'

account:
  ikey: 'xxxxxxxxxxxxxxxxx'
  skey: 'xxxxxxxxxxxxxxxxxxx'
  hostname: 'xxxxxxxxxxxxxxxx.duosecurity.com'

  endpoint_server_mappings:
    - endpoints: ['adminaction', 'auth', 'telephony', 'activity']
      server: 'Graylog'
  is_msp: False
MarkTripod-Duo commented 1 year ago

Each log endpoint has individual argument requirements that have never changed since they were published. My point was that DLS takes into account which arguments are required for each endpoint automatically.

What exactly is the error you receive when you run DLS with your configuration?

giveen commented 1 year ago

The exact error at the top of this issue.

JBHilke commented 1 year ago

We have exactly this same issue. I had installed DLS 2.0 in Sep. 2022, and just updated to the latest version of 2.2.0, but issue persists. I'm running on a Windows server 2012 with python 3.8, and use a scheduled task to restart the script every 30 min. in case it fails. The problem seems to be with the Trustmonitor endpoint. Eventually I end up with a checkpoint file that just contains the word "null". A temporary fix is to delete that file and let the program try again. But now that does not even seem to work. Even though the log file indicates that some logs are added to queue, these are not making it to my Graylog.

-------- Partial log results ---------- 2023-06-26 17:09:12 INFO Starting DuoLogSync 2023-06-26 17:09:12 INFO DuoLogSync: Opening connection to Graylogweb.xxxxxx.org:4519 with protocol TCP 2023-06-26 17:09:12 INFO duo_client Admin initialized for ikey: xxxxxxxxxx, host: api-xxxxxxx.duosecurity.com 2023-06-26 17:09:12 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\adminaction_checkpoint_data.txt 2023-06-26 17:09:12 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\auth_checkpoint_data.txt 2023-06-26 17:09:12 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\telephony_checkpoint_data.txt 2023-06-26 17:09:12 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\trustmonitor_checkpoint_data.txt 2023-06-26 17:09:12 INFO Could not read checkpoint file for trustmonitor logs, consuming logs from 2023-01-03T02:09:12+00:00 (UTC) 2023-06-26 17:09:12 INFO adminaction producer: fetching next logs after 150 seconds 2023-06-26 17:09:12 INFO adminaction consumer: waiting for logs 2023-06-26 17:09:12 INFO auth producer: fetching next logs after 150 seconds 2023-06-26 17:09:12 INFO auth consumer: waiting for logs 2023-06-26 17:09:12 INFO telephony producer: fetching next logs after 150 seconds 2023-06-26 17:09:12 INFO telephony consumer: waiting for logs 2023-06-26 17:09:12 INFO trustmonitor producer: fetching next logs after 150 seconds 2023-06-26 17:09:12 INFO trustmonitor consumer: waiting for logs 2023-06-26 17:11:43 INFO adminaction producer: fetching logs from offset 1687804234 2023-06-26 17:11:43 INFO auth producer: fetching logs from offset ['1687808500256', 'bb77e2b7-9088-43fc-bbc6-f2f08778e1e6'] 2023-06-26 17:11:43 INFO telephony producer: fetching logs from offset None 2023-06-26 17:11:43 INFO trustmonitor producer: fetching logs from offset 1672711752000 2023-06-26 17:11:43 INFO adminaction producer: no new logs available 2023-06-26 17:11:43 INFO adminaction producer: fetching next logs after 150 seconds 2023-06-26 17:11:43 ERROR DuoLogSync: check that the duoclient ikey and skey in the config file are correct 2023-06-26 17:11:43 WARNING DuoLogSync: Shutting down due to [telephony producer: [Received 400 Invalid request parameters ('mintime' must be a timestamp in milliseconds)]] 2023-06-26 17:11:43 INFO telephony producer: shutting down 2023-06-26 17:11:43 INFO telephony consumer: shutting down 2023-06-26 17:11:43 INFO auth producer: adding 60 logs to the queue 2023-06-26 17:11:43 INFO auth producer: successfully added logs to the queue 2023-06-26 17:11:43 INFO auth producer: shutting down 2023-06-26 17:11:43 INFO auth consumer: shutting down 2023-06-26 17:11:44 INFO trustmonitor producer: adding 41 logs to the queue 2023-06-26 17:11:44 INFO trustmonitor producer: successfully added logs to the queue 2023-06-26 17:11:44 INFO trustmonitor producer: shutting down 2023-06-26 17:11:44 INFO trustmonitor consumer: shutting down 2023-06-26 17:11:44 INFO adminaction producer: shutting down 2023-06-26 17:11:44 INFO adminaction consumer: shutting down

------------ Result when manually running from command line ----------- C:\python38\Scripts>duologsync c:\duo_log_sync\config.yml Configured logging to write to file C:\duo_log_sync\LOGS\duologsync.log. DuoLogSync: shutdown successfully. Check C:\duo_log_sync\LOGS\duologsync.log for program logs Exception ignored in: <function _ProactorBasePipeTransport.del at 0x000000E461322F70> Traceback (most recent call last): File "C:\python38\lib\asyncio\proactor_events.py", line 116, in del self.close() File "C:\python38\lib\asyncio\proactor_events.py", line 108, in close self._loop.call_soon(self._call_connection_lost, None) File "C:\python38\lib\asyncio\base_events.py", line 719, in call_soon self._check_closed() File "C:\python38\lib\asyncio\base_events.py", line 508, in _check_closed raise RuntimeError('Event loop is closed') RuntimeError: Event loop is closed

------------- Relevant settings in config.yml -----------------

version: '1.0.0'

dls_settings: log_filepath: 'C:\duo_log_sync\LOGS\duologsync.log' log_format: 'JSON'

api: offset: 175 timeout: 150

checkpointing: enabled: True directory: 'C:\duo_log_sync\LOGS'

servers:

account: ikey: 'xxxxxxxxxxxxxxxxx' skey: 'xxxxxxxxxxxxxxxxxxx' hostname: 'xxxxxxxxxxxxxxxx.duosecurity.com'

endpoint_server_mappings:

----------- Also, I could not get the upgrade script to work when moving to version 2.2.0 -------------------- C:\duo_log_sync>\python38\python upgrade_config.py C:\duo_log_sync\config_old1.yml c:\duo_log_sync\config.yml Traceback (most recent call last): File "upgrade_config.py", line 268, in main() File "upgrade_config.py", line 87, in main upgraded_config = upgrade_config(args.old_config_path) File "upgrade_config.py", line 105, in upgrade_config if not config.get('transport').get('certFileDir'): AttributeError: 'NoneType' object has no attribute 'get'

JBHilke commented 1 year ago

Update1: Today I tried removing all the checkpoint files and setting the offset value to "1" in the config.yml file. Since I had already retrieved log entries up through yesterday afternoon, a setting of 1 day would minimize duplicates in my logging system. Somehow the system seemed to like this combination of settings, and started retrieving data for all endpoints.

However, after a while of running, and I re-enabled my scheduled task, eventually it bombed out. But this time it was the telephony endpoint with an issue: Shutting down due to [telephony producer: [Received 400 Invalid request parameters ('maxtime' must be strictly greater than 'mintime')]]

I have copied a relevant portion of our logs below. I am wondering if an endpoint generates an issue when it previously has gathered some data, but then on a particular round it has no data to gather, then the next round it has an issue with an offset of "None" or "Null": telephony producer: fetching logs from offset None I'm testing this by removing the checkpoint file for the offending endpoint.

============= Log Excerpt from 6-27-23 ================ 2023-06-27 11:59:55 INFO Starting DuoLogSync 2023-06-27 11:59:55 INFO DuoLogSync: Opening connection to Graylogweb.sccourts.org:4519 with protocol TCP 2023-06-27 11:59:55 INFO duo_client Admin initialized for ikey: xxxxxxxxx, host: api-xxxxxxxx.duosecurity.com 2023-06-27 11:59:55 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\adminaction_checkpoint_data.txt 2023-06-27 11:59:55 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\auth_checkpoint_data.txt 2023-06-27 11:59:55 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\telephony_checkpoint_data.txt 2023-06-27 11:59:55 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\trustmonitor_checkpoint_data.txt 2023-06-27 11:59:55 INFO adminaction producer: fetching next logs after 150 seconds 2023-06-27 11:59:55 INFO adminaction consumer: waiting for logs 2023-06-27 11:59:55 INFO auth producer: fetching next logs after 150 seconds 2023-06-27 11:59:55 INFO auth consumer: waiting for logs 2023-06-27 11:59:55 INFO telephony producer: fetching next logs after 150 seconds 2023-06-27 11:59:55 INFO telephony consumer: waiting for logs 2023-06-27 11:59:55 INFO trustmonitor producer: fetching next logs after 150 seconds 2023-06-27 11:59:55 INFO trustmonitor consumer: waiting for logs 2023-06-27 12:02:26 INFO adminaction producer: fetching logs from offset 1687881409 2023-06-27 12:02:26 INFO auth producer: fetching logs from offset ['1687881297000', '35d76a28-2e5f-44c5-824c-164c760b5f23'] 2023-06-27 12:02:26 INFO telephony producer: fetching logs from offset None 2023-06-27 12:02:26 INFO trustmonitor producer: fetching logs from offset 1687828713646 2023-06-27 12:02:26 ERROR DuoLogSync: check that the duoclient ikey and skey in the config file are correct 2023-06-27 12:02:26 WARNING DuoLogSync: Shutting down due to [telephony producer: [Received 400 Invalid request parameters ('maxtime' must be strictly greater than 'mintime')]] 2023-06-27 12:02:26 INFO telephony producer: shutting down 2023-06-27 12:02:26 INFO telephony consumer: shutting down 2023-06-27 12:02:26 INFO adminaction producer: no new logs available 2023-06-27 12:02:26 INFO adminaction producer: shutting down 2023-06-27 12:02:26 INFO adminaction consumer: shutting down 2023-06-27 12:02:27 INFO auth producer: adding 7 logs to the queue 2023-06-27 12:02:27 INFO auth producer: successfully added logs to the queue 2023-06-27 12:02:27 INFO auth producer: shutting down 2023-06-27 12:02:27 INFO auth consumer: shutting down 2023-06-27 12:02:27 INFO trustmonitor producer: no new logs to add to the queue 2023-06-27 12:02:27 INFO trustmonitor producer: shutting down 2023-06-27 12:02:27 INFO trustmonitor consumer: shutting down 2023-06-27 12:15:01 INFO Starting DuoLogSync 2023-06-27 12:15:01 INFO DuoLogSync: Opening connection to Graylogweb.sccourts.org:4519 with protocol TCP 2023-06-27 12:15:01 INFO duo_client Admin initialized for ikey: xxxxxxx, host: api-xxxxxx.duosecurity.com 2023-06-27 12:15:01 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\adminaction_checkpoint_data.txt 2023-06-27 12:15:01 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\auth_checkpoint_data.txt 2023-06-27 12:15:01 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\telephony_checkpoint_data.txt 2023-06-27 12:15:01 INFO Recovering log offset from checkpoint file at C:\duo_log_sync\LOGS\trustmonitor_checkpoint_data.txt 2023-06-27 12:15:01 INFO adminaction producer: fetching next logs after 150 seconds 2023-06-27 12:15:01 INFO adminaction consumer: waiting for logs 2023-06-27 12:15:01 INFO auth producer: fetching next logs after 150 seconds 2023-06-27 12:15:01 INFO auth consumer: waiting for logs 2023-06-27 12:15:01 INFO telephony producer: fetching next logs after 150 seconds 2023-06-27 12:15:01 INFO telephony consumer: waiting for logs 2023-06-27 12:15:01 INFO trustmonitor producer: fetching next logs after 150 seconds 2023-06-27 12:15:01 INFO trustmonitor consumer: waiting for logs 2023-06-27 12:17:32 INFO adminaction producer: fetching logs from offset 1687881409 2023-06-27 12:17:32 INFO auth producer: fetching logs from offset ['1687881297000', '35d76a28-2e5f-44c5-824c-164c760b5f23'] 2023-06-27 12:17:32 INFO telephony producer: fetching logs from offset None 2023-06-27 12:17:32 INFO trustmonitor producer: fetching logs from offset 1687828713646 2023-06-27 12:17:32 ERROR DuoLogSync: check that the duoclient ikey and skey in the config file are correct 2023-06-27 12:17:32 WARNING DuoLogSync: Shutting down due to [telephony producer: [Received 400 Invalid request parameters ('maxtime' must be strictly greater than 'mintime')]] 2023-06-27 12:17:32 INFO telephony producer: shutting down 2023-06-27 12:17:32 INFO telephony consumer: shutting down 2023-06-27 12:17:32 INFO adminaction producer: no new logs available 2023-06-27 12:17:32 INFO adminaction producer: shutting down 2023-06-27 12:17:32 INFO adminaction consumer: shutting down 2023-06-27 12:17:32 INFO auth producer: adding 15 logs to the queue 2023-06-27 12:17:32 INFO auth producer: successfully added logs to the queue 2023-06-27 12:17:32 INFO auth producer: shutting down 2023-06-27 12:17:32 INFO auth consumer: shutting down 2023-06-27 12:17:32 INFO trustmonitor producer: no new logs to add to the queue 2023-06-27 12:17:32 INFO trustmonitor producer: shutting down 2023-06-27 12:17:32 INFO trustmonitor consumer: shutting down

MarkTripod-Duo commented 9 months ago

@JBHilke Are you still having the periodic shutdown issue with Duo Log Sync?

JBHilke commented 8 months ago

Sorry @MarkTripod-Duo I did not see your reply until recently. Yes, we are still having the periodic shutdown issues. I can send a fresh log if you like. For today, it shutdown 11 times between midnight and 4am, but seems to be OK after that. We have our Task that runs the script restart every 30 min anyway, so it catches up upon restart. So I haven't spent any time troubleshooting it since last summer.

SalC3 commented 7 months ago

Hello - just updated from 2.0 to 2.2 and I'm having the exact same issue as everyone else. I originally had offset: 180 and was relying on the checkpointing to pull only newer data in after the initial run when I set this up a year or so ago. Because of this issue after upgrading from 2.0 to 2.2, I changed to offset: 1 which did not help. I then deleted the 3 checkpoint files (adminaction_checkpoint_data.txt, auth_checkpoint_data.txt, and telephony_checkpoint_data.txt) and that fixed the issue for a time.

However after a few runs, I keep getting an error for the telephony logs and only deleting the checkpoint resolves it.

2024-02-15 16:17:02 WARNING DuoLogSync: Shutting down due to [telephony producer: [Received 400 Invalid request parameters ('maxtime' must be strictly greater than 'mintime')]]

Is any progress being made on this bug, or should I roll back to 2.0? It seems this issue has been around for nearly a year.

Thanks!

Simonhua9 commented 6 months ago

Hello @MarkTripod-Duo, I can reproduce such issue after every service restart. Only telephony log can have such issue.

Per the check, the checkpoint file for telephony will not be converted into local time, but using UTC time instead after a service restart. It will try to use local time again for checking the timestamp but it is saved at UTC time.

For example, at UTC -8, if I restart the service at 11am (UTC -8) now, then the checkpoint in telephony will become 7pm (UTC -8). It is supposed to change the UTC time to local time, but when a service restarts, telephony log timestamp in checkpoint file will not do that conversion and it will use UTC time as local time directly. auth and admin timestamp in checkpoint files can work well without such issues.

Do you know if the duo_log_sync script can be updated for telephony part to work well just as other checkpoint files?