icloud-photos-downloader / icloud_photos_downloader

A command-line tool to download photos from iCloud
MIT License
6.92k stars, 556 forks

Support Healthchecks.io monitoring #933

Open stin7 opened 3 months ago

stin7 commented 3 months ago

Summary

Add a --healthchecks_url parameter (similar to Borgmatic's) or a more generic ping_url_on_success parameter.

Context

I use Healthchecks.io to monitor important processes, and I would like to integrate icloudpd into that system. The simplest way would be a parameter that accepts a URL for icloudpd to ping after a successful download.

Perhaps there are other ways to handle this as well.

AndreyNikiforov commented 3 months ago

What information are you using for monitoring and what actions are you taking from it?

We probably need some kind of "is it working" metric to alert on. A derivative of that would be some kind of reliability score.

There may also be a need for a velocity-type metric as a guide for optimizing and adjusting behavior, something like "age of downloaded bytes"...

Architecture-wise, I am leaning towards a pull mechanism for metrics (icloudpd exposes metrics on an HTTP endpoint and a monitoring/alerting service pulls the data, as with Prometheus). Note that I am looking at icloudpd as a service that keeps my iCloud collection synchronized with local storage, not a batch script that I run periodically.
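
A minimal sketch of what such a pull endpoint could look like, using only the Python standard library. This is not icloudpd code: the `SyncState` class, the metric name `icloudpd_last_check_timestamp_seconds`, and the `/metrics` path are assumptions chosen for illustration.

```python
import time
from http.server import BaseHTTPRequestHandler


class SyncState:
    """Holds the time of the last completed iCloud check (hypothetical)."""

    def __init__(self) -> None:
        self.last_check = 0.0

    def mark_checked(self, now=None) -> None:
        # Called by the sync loop after each completed check.
        self.last_check = time.time() if now is None else now


def render_metrics(state: SyncState) -> str:
    """Render the state in the Prometheus text exposition format."""
    name = "icloudpd_last_check_timestamp_seconds"  # hypothetical metric name
    return f"# TYPE {name} gauge\n{name} {state.last_check:.0f}\n"


def make_handler(state: SyncState):
    """Build a request handler class bound to the given state."""

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            if self.path != "/metrics":
                self.send_error(404)
                return
            body = render_metrics(state).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return MetricsHandler
```

A long-running process could serve this from a background thread, for example with `http.server.HTTPServer(("127.0.0.1", 9123), make_handler(state)).serve_forever()`; the port is arbitrary. The monitoring side would then alert when the reported timestamp becomes too old.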

stin7 commented 3 months ago

> What information are you using for monitoring and what actions are you taking from it?

If a process/service doesn't successfully ping, then I get an alert about the process/service from healthchecks to go figure out what happened and get it back to green. (Healthchecks Intro: https://healthchecks.io/docs/ )

So for this project, it would be good to know that for some reason (most likely need to reauth, but it could be anything) my icloud photos aren't being backed up anymore and I should get it back online.

AndreyNikiforov commented 3 months ago

> If a process/service doesn't successfully ping, then I get an alert about the process/service from healthchecks to go figure out what happened and get it back to green. (Healthchecks Intro: https://healthchecks.io/docs/ )

The service performs periodic iCloud checks. I assume that pinging icloudpd to check whether it is still running would be of little value. We would probably need to know whether the last expected check was performed. There is also a distinction between the reasons an expectation was not met: if a password was needed but the user did not provide it, then icloudpd was technically healthy.

> So for this project, it would be good to know that for some reason (most likely need to reauth, but it could be anything) my icloud photos aren't being backed up anymore and I should get it back online.

Yes, if the expected check was not performed, then the user needs to be notified/alerted to correct the issue. Kind of a watchdog. It should probably be implemented on the monitoring/alerting side, so that we still notify the user if the service is not running at all.

Thanks for helping brainstorm the issue. I need to dig into healthchecks.io to learn more and come up with a solution for icloudpd.

stin7 commented 3 months ago

Thanks. Just to clarify one thing, healthchecks.io acts as a "dead man's switch". On healthchecks you specify how long it should wait for a successful ping from a service before sending an alert to you.
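
The dead man's switch rule described above can be sketched in a few lines. This only illustrates the idea and is not Healthchecks' actual implementation; the function name and the single `allowed_silence` parameter are assumptions.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_ping: datetime, now: datetime, allowed_silence: timedelta) -> bool:
    """Return True when no ping has arrived within the allowed silence window."""
    return now - last_ping > allowed_silence


# Example: a job expected to ping at least once a day.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_ping = now - timedelta(hours=25)
alert_needed = is_overdue(last_ping, now, timedelta(days=1))
```

Because the check runs on the monitoring side, it fires in the same way whether the job failed, hung, or never started.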

So, the change to icloudpd would be simple: at the end of a sync, run "curl '<user-provided healthchecks URL>'".
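
A minimal sketch of the same idea in Python, assuming a hypothetical `sync` callable that returns a process-style exit code. Neither `ping` nor `run_sync_and_ping` exists in icloudpd; they only illustrate "ping the URL after a successful sync".

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def ping(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> bool:
    """Send a GET request to `url` and return True on an HTTP 2xx response.

    Network errors are swallowed on purpose: a failed monitoring ping
    should not be able to break the sync itself.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        return False


def run_sync_and_ping(
    sync: Callable[[], int],
    ping_url: Optional[str],
    opener=urllib.request.urlopen,
) -> int:
    """Run `sync()` and ping `ping_url` only when it returns exit code 0."""
    exit_code = sync()
    if exit_code == 0 and ping_url:
        ping(ping_url, opener=opener)
    return exit_code
```

The `opener` parameter exists only so the network call can be replaced in tests. If the sync fails or never runs, no ping is sent and the monitoring side raises the alert once its waiting period expires.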

AndreyNikiforov commented 3 months ago

You can use the --notification-script parameter to record in the health service that a password needs to be entered.

stin7 commented 3 months ago

Thanks, I missed that option. That seems good for when icloudpd knows there is an issue, so I'll set that up to curl the /fail endpoint on healthchecks.
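
A sketch of what such a notification script might do, written in Python rather than curl. The `HC_PING_URL` environment variable and both function names are assumptions made for this example, and how icloudpd invokes the script is not shown here.

```python
import os
import urllib.request


def fail_url(ping_url: str) -> str:
    """Return the failure-signal URL for a check by appending /fail."""
    return ping_url.rstrip("/") + "/fail"


def report_failure(environ=os.environ, opener=urllib.request.urlopen) -> bool:
    """Signal a failure to the check named by HC_PING_URL (hypothetical variable).

    Returns True if the request was sent, False if nothing is configured
    or the request could not be made.
    """
    ping_url = environ.get("HC_PING_URL")
    if not ping_url:
        return False  # nothing configured, nothing to report
    try:
        with opener(fail_url(ping_url), timeout=10):
            return True
    except OSError:
        return False
```

A script passed to --notification-script would call `report_failure()` once and exit. This covers only the failures icloudpd itself detects; the silent cases still depend on a success ping going missing.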

Perhaps there could be a new --success-script option to ping healthchecks to catch when the service goes down for any reason

isundaylee commented 2 months ago

stumbled on this issue while looking for a way to integrate this with prometheus to set up alerts for "last icloudpd update > x days ago". i think if we had --success-script, it would allow integrating into prometheus (by having the script write a node_exporter textfile to be picked up by prometheus), as well as other monitoring solutions.
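
A rough sketch of what such a success script could write for node_exporter's textfile collector, which reads `*.prom` files from a configured directory. The metric name, the file name `icloudpd.prom`, and the function names are assumptions for illustration only.

```python
import os
import tempfile
import time
from typing import Optional

METRIC = "icloudpd_last_success_timestamp_seconds"  # hypothetical metric name


def render_metric(timestamp: float) -> str:
    """Render one gauge in the Prometheus text exposition format."""
    return (
        f"# HELP {METRIC} Unix time of the last successful icloudpd sync.\n"
        f"# TYPE {METRIC} gauge\n"
        f"{METRIC} {timestamp:.0f}\n"
    )


def write_textfile(directory: str, timestamp: Optional[float] = None) -> str:
    """Atomically write the metric file into `directory` and return its path."""
    if timestamp is None:
        timestamp = time.time()
    target = os.path.join(directory, "icloudpd.prom")
    # Write to a temporary file in the same directory, then rename it, so the
    # collector never reads a half-written file. The .tmp suffix keeps the
    # temporary file from matching *.prom.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        tmp.write(render_metric(timestamp))
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file as 0600
    os.replace(tmp_path, target)
    return target
```

With a file like this in place, an alert for "last icloudpd update > x days ago" could be expressed in PromQL as something like `time() - icloudpd_last_success_timestamp_seconds > 3 * 86400`.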

i can also potentially take a look at implementing this if people think it's a reasonable approach.