Closed Tronde closed 4 years ago
Good Evening,
I opened this issue one week ago and would like to ask whether you have had a chance to review my report.
I would appreciate it if you could find the time to look into it.
Thanks in advance for your help.
Regards,
Tronde
certspotter is multi-threaded, so to get a meaningful strace you need to include the -f option. Please re-run the strace with -f to see if any of the other threads are making progress.
You can also pass the -verbose option to certspotter to see what it is doing.
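The advice about -f can be sketched as a small helper. This is only an illustration: trace_threads and the default output file name are hypothetical, while -f (follow threads and child processes), -p (attach to a running PID) and -o (write the trace to a file) are standard strace options.

```shell
# Sketch: attach strace to an already running process and follow all of
# its threads. trace_threads is a hypothetical helper name.
trace_threads() {
    pid="$1"                          # PID of the running certspotter process
    out="${2:-certspotter.strace}"    # assumed default output file name
    # -f follows threads and child processes, -p attaches to the PID,
    # -o writes the trace to a file instead of stderr.
    strace -f -p "$pid" -o "$out"
}

# Usage (as root or the process owner; stop with Ctrl-C after a minute or so):
#   trace_threads <PID> /tmp/certspotter.strace
```

With -f, each line of the output is prefixed with the ID of the thread that made the call, which makes it possible to tell whether any thread is still making progress.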
Hi, and thanks for your answer.
At first, I straced the existing process with the -f option included. The related process structure shown by ps is:
tronde@host:~$ ps -fp 20273 -p 20274 -p 20275
UID PID PPID C STIME TTY TIME CMD
root 20273 691 0 Oct06 ? 00:00:00 /usr/sbin/CRON -f
tronde 20274 20273 0 Oct06 ? 00:00:00 /bin/sh -c /home/trondeng/go/bin/certspotter
tronde 20275 20274 4 Oct06 ? 12:08:48 /home/trondeng/go/bin/certspotter
You can find the strace output at https://gist.github.com/Tronde/8ca1b99535cf4bdb86ee3d905cb0c559
Please see the comment there, which shows how the information was gathered.
As the next step, I'm going to kill the process and start certspotter with the -verbose option. I'll come back to you when I have any results.
Thanks for providing the additional information. The strace shows that certspotter is reading from file descriptor 5, which is a socket to a CT log.
CT logs have unfortunately grown extremely large over the last two years, which means monitoring can take a very long time. If you let certspotter keep running, it will catch up eventually and terminate.
Alternatively, if you delete ~/.certspotter, then on the next run certspotter will start monitoring from the end of each log instead of from its previous position. If you keep running certspotter regularly, it should keep up. The downside is that you will miss certificates that were logged between the previous position and the new end of each log.
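A cautious way to try this is to move the state directory aside rather than deleting it outright, and to carry the watchlist over. The following is a minimal sketch under stated assumptions: reset_certspotter_state is a hypothetical helper, and it assumes the watchlist is a file named watchlist inside the state directory.

```shell
# Sketch: reset certspotter's saved log positions but keep the watchlist.
# Assumptions: the state directory is "$1" (default ~/.certspotter) and the
# watchlist is a file called "watchlist" inside it.
reset_certspotter_state() {
    state_dir="${1:-$HOME/.certspotter}"
    backup_dir="$state_dir.bak"

    [ -d "$state_dir" ] || return 1     # nothing to reset
    [ ! -e "$backup_dir" ] || return 1  # refuse to overwrite an older backup

    # Move the old state aside so the previous positions can be restored.
    mv "$state_dir" "$backup_dir" || return 1
    mkdir "$state_dir" || return 1

    # Carry the watchlist over so the next run still knows what to monitor.
    if [ -f "$backup_dir/watchlist" ]; then
        cp "$backup_dir/watchlist" "$state_dir/watchlist"
    fi
}

# Usage (only while no certspotter process is running):
#   reset_certspotter_state
```

Keeping the backup does not avoid the downside described above: certificates logged between the old position and the new end of each log are still skipped unless the old state is restored.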
You may want to consider using the commercial offering of Cert Spotter instead. I made certspotter open source because I wanted it to be easy for individuals who aren't CT experts to monitor logs. Unfortunately, given the current scale of the CT ecosystem I'm not sure this is still a viable vision.
I followed your suggestion and deleted ~/.certspotter. After I recreated the watchlist, the next run finished after a short amount of time. I'll keep an eye on it over the next few days to see if it stays that way.
You also suggested using the commercial offering, but how does it work differently from the version on my host? I did not see any bottleneck on my machine. Data from the first CT log came in at 4-5 Mbps and the host was close to idle.
I would like to understand what makes the commercial service faster. Could you explain it to me, please?
Anyway, thanks for this great project. I use it to monitor the certificate for my private domain and really appreciate what you did here.
I run certspotter once every 24 hours, and when I look at the runtime, each run takes 7-21 hours, with around 14 hours being the median. There's still some room before a run no longer completes within the 24-hour interval. I don't know what the current bottleneck is. I'm guessing it could be sped up significantly by downloading each log in parallel.
I agree that downloading all CT logs in parallel, or at least a bounded number of them at a time, could speed things up.
@AGWA Would that be possible? Is that a change you could implement in an upcoming release?
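The idea of processing only a bounded number of logs at once can be illustrated with a short shell sketch. It is not how certspotter itself does it; process_logs_parallel, the list file and the per-log worker command are hypothetical names used purely for illustration.

```shell
# Sketch only: run a per-log worker for every line of a list file,
# with at most max_jobs workers active at the same time.
# process_logs_parallel and its arguments are hypothetical.
process_logs_parallel() {
    list_file="$1"   # one log identifier per line
    max_jobs="$2"    # upper bound on concurrent workers
    shift 2          # the remaining arguments form the worker command
    # -n 1: pass one list entry to each worker invocation
    # -P:   number of workers to keep running concurrently (GNU and BSD xargs)
    xargs -n 1 -P "$max_jobs" "$@" < "$list_file"
}

# Usage with a hypothetical worker script that handles a single log:
#   process_logs_parallel logs.txt 4 ./fetch_one_log.sh
```

Capping the number of workers keeps the total bandwidth and the number of open connections predictable while still overlapping the waiting time on slow logs.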
I just submitted a PR that downloads the logs in parallel.
Note that I did not scrutinize the whole code base to ensure it is thread-safe; my change seems to work, but no guarantees.
I've implemented parallel log processing in 86785d89d7f91c935b426e351f1fe38544476ce3.
I don't think there's anything more to be done with this issue, so I'm closing it.
Hi @AGWA,
Thanks a lot for implementing the parallel log processing.
Do you already know when you are going to release version 0.10?
Summary
Certspotter is started via a cron job, but the job never comes to an end. When cron tries to start the next job as scheduled, this fails because another instance of certspotter is already running and the lock file prevents a second instance from starting.
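The general lock pattern behind this behavior can be sketched in a few lines of shell. This is a generic illustration, not certspotter's actual locking code; run_with_lock and the lock path are hypothetical.

```shell
# Sketch: refuse to start a command while another instance holds the lock.
# mkdir is used because creating a directory is atomic: it either succeeds
# or fails if the directory already exists.
run_with_lock() {
    lock_dir="$1"
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "another instance is already running" >&2
        return 1
    fi
    "$@"              # run the actual job
    status=$?
    rmdir "$lock_dir" # release the lock once the job has finished
    return "$status"
}

# Usage with a hypothetical lock path:
#   run_with_lock /tmp/myjob.lock some_long_running_command
```

If the first job never finishes, the lock is never released, so every later scheduled start fails in the same way as described above.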
When the running process is stopped (kill PID) and the lock file has been removed, the problem reoccurs the next time certspotter is started. So a process is started and cannot complete, and the next process is prevented from starting.
Environment
Current status
In this section I try to provide complete information about the current process status.
Crontab entry
Related processes
Open files
Strace of PID
Conclusion
Strace was attached to the PID for about one minute. It seems that the process got stuck in futex for some reason and therefore cannot complete.
My Request
I would like to ask if you could assist in troubleshooting this issue to find out whether this is a bug or just a configuration issue on my machine. I did not have this kind of issue when running Ubuntu 18.04. I am seeing this behavior for the first time since I started running certspotter on Buster.
If you need any additional information, please don't hesitate to ask. I'll do my best to help solve this issue.