papertrail / papertrail-cli

Command-line client for Papertrail hosted syslog & app log management service
http://papertrailapp.com/
MIT License

When the papertrail firehose is verbose, `papertrail -f` silently drops some logs #93

Open topher200 opened 7 years ago

topher200 commented 7 years ago

Steps to reproduce:

  1. Have many log messages. We generate 10GB of messages per day
  2. Run `papertrail -f`
  3. Observe that there are occasional gaps in the messages of 1-2 seconds. For example, we'll see a message from 12:01:01, followed by a message from 12:01:03 (without any of the messages from 12:01:02).
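
One rough way to confirm gaps like the one in step 3 is to scan the timestamps of the received messages for jumps larger than some threshold. The sketch below assumes the timestamps have already been parsed into `datetime` objects and are in arrival order; `find_gaps` and the one-second default threshold are illustrative choices, not part of papertrail-cli. On a stream this busy a silent second is suspicious, but a gap is only a hint that messages were dropped, not proof.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, threshold=timedelta(seconds=1)):
    """Return (previous, current) pairs whose spacing exceeds `threshold`.

    `timestamps` is assumed to be a list of datetime objects in
    arrival order (hypothetical input; parsing is left out).
    """
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > threshold:
            gaps.append((prev, cur))
    return gaps


# The example from step 3: 12:01:01 followed directly by 12:01:03.
seen = [
    datetime(2018, 1, 1, 12, 1, 1),
    datetime(2018, 1, 1, 12, 1, 3),
]
for prev, cur in find_gaps(seen):
    print(f"gap between {prev.time()} and {cur.time()}")
```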

I assume this is by design! I'm guessing that if there are a ton of messages, you didn't want to overwhelm the servers or delay the CLI with too much data.

Regardless, I'd like a realtime (or near realtime) firehose to parse. What is the best way to get that data? My ideas:

  1. Use the "archive to S3" function, but that forces a delay of 1-2 hours, which makes it unusable for this project
  2. Manually "chunk" the data on my side by requesting 5 minutes of data at a time (so at 12:05, I request the data for 12:00 to 12:05, etc.)
  3. ...?

Is there any way to get `papertrail -f` to stop dropping messages? If not, how would you develop a realtime-ish system?

topher200 commented 7 years ago

I ended up going with option 2 - it's working well!

I have a cron job that runs every minute and kicks off a Python script. The script calls the papertrail CLI to get all the log messages from the previous minute and dumps them to a file.
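
For readers who want the shape of that approach without opening the linked script, here is a minimal sketch. It is not the author's script. It assumes the CLI's `--min-time` and `--max-time` options accept a `YYYY-MM-DD HH:MM:SS` string interpreted in local time (check `papertrail --help` for your version), and the helper names and output path are hypothetical.

```python
import subprocess
from datetime import datetime, timedelta


def previous_minute_window(now):
    """Return (start, end) of the last complete minute before `now`."""
    end = now.replace(second=0, microsecond=0)
    start = end - timedelta(minutes=1)
    return start, end


def build_command(start, end):
    """Build the papertrail CLI invocation for one time window.

    Assumption: --min-time/--max-time accept this timestamp format.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    return [
        "papertrail",
        "--min-time", start.strftime(fmt),
        "--max-time", end.strftime(fmt),
    ]


def fetch_previous_minute(out_path):
    """Run the CLI for the previous minute and append its output to a file."""
    start, end = previous_minute_window(datetime.now())
    result = subprocess.run(
        build_command(start, end),
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits with an error
    )
    with open(out_path, "a", encoding="utf-8") as f:
        f.write(result.stdout)


if __name__ == "__main__":
    # Hypothetical output path; cron would run this file once a minute.
    fetch_previous_minute("papertrail-previous-minute.log")
```

Aligning each window to whole minutes keeps consecutive runs from overlapping by more than the boundary second. Whether a message stamped exactly on the boundary appears in one window or both depends on how the CLI treats the limits, so downstream code may need to deduplicate.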

One annoyance is that the CLI doesn't format messages the same way as Papertrail's S3 archiver, so we have to convert the log messages to a matching format ourselves.

Here's the python script: https://github.com/topher200/assertion-context/blob/42db16a17f2d0bb3f714e316663fd319f9a1373f/web/realtime_updater/run.py and the cron job: https://github.com/topher200/assertion-context/blob/42db16a17f2d0bb3f714e316663fd319f9a1373f/web/realtime_updater/crontab

jareware commented 5 years ago

Also seeing this behaviour. Somewhat inconvenient.

Thanks for your suggestions @topher200, sounds like option 2 is the way to go for us as well. 👍

ziemkowski commented 3 years ago

We're also experiencing this issue, but manually chunking it isn't viable for tailing with the amount of logs we have (not to mention it delays our response time).

Here are the only frame numbers that papertrail-cli output for a 15-frame stack trace:

PHP   4. ...
PHP  11. ...
PHP  12. ...

And then a minute later, another trace has massive gaps too:

PHP Stack trace:
PHP   2. ...
PHP   4. ...
PHP   7. ...
PHP   8. ...
PHP  15. ...

The stack traces suggest that other single-line logs are being completely missed as lines get dropped everywhere.
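
Numbered stack frames make the loss easy to quantify. The sketch below lists which frame numbers never arrived, assuming each frame line starts with `PHP`, whitespace, and a frame number followed by a period (as in the excerpts above), that numbering starts at 1, and that the expected depth is known. `missing_frames` is an illustrative helper, not part of papertrail-cli.

```python
import re

# Matches lines such as "PHP   4. ..." and captures the frame number.
FRAME_RE = re.compile(r"^PHP\s+(\d+)\.")


def missing_frames(lines, expected_depth):
    """Return the frame numbers in 1..expected_depth absent from `lines`."""
    seen = set()
    for line in lines:
        match = FRAME_RE.match(line)
        if match:
            seen.add(int(match.group(1)))
    return [n for n in range(1, expected_depth + 1) if n not in seen]


received = ["PHP   4. ...", "PHP  11. ...", "PHP  12. ..."]
print(missing_frames(received, 15))
# → [1, 2, 3, 5, 6, 7, 8, 9, 10, 13, 14, 15]
```

For the first trace above, 12 of the 15 frame lines are missing, which is consistent with the suspicion that unrelated single-line messages are being lost at a similar rate.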

This issue has increased pressure on the devops team to move our logs to native AWS services 😞