get10101 / itchysats

CFDs on Bitcoin.
https://itchysats.network
MIT License
61 stars 17 forks source link

Log `Failed to fetch attestation` as error #3272

Closed da-kami closed 1 year ago

da-kami commented 2 years ago

We only log it as error if the event timestamp is 10 minutes in the past to avoid false negative error reports when the oracle is slow to report the attestation.

The event_id is added to the log statement for more context.


I propose to merge this into master and then cherrypick this commit on a patch release on top of 0.7.0 to get it into production. Is there anything else we would like to see in a patch release? I remember we wanted to release https://github.com/itchysats/itchysats/pull/3244 ?

da-kami commented 2 years ago

I suspect we are gonna drown in alerts, as we will be emitting this error log 8000 times per hour.

Perhaps there is a gentler way of handling this. Something like an alert that is triggered if 1 or more unique attestations cannot be fetched from the oracle.

But if you think this is the way I won't block this PR.

I think adding more elaborate logic will not help much it still results in the same problem. The current situation is only because the problem builds up over time because we try to fetch more and more missing attestations (as seen in the grafana query you linked). We have to detect the problem early, so we can react and don't run into the load.

I would get this into production and ignore alerts for specific even-ids where we know we cannot fetch them at the moment. This will be a bit cumbersome to set up, but I think this is the way to go. We could also handle this in rust or with e.g. an ignore file, but I don't think it's right to introduce code for this. In the future we will notice the problem early and can hopefully get it fixed before we see a lot of spam.

luckysori commented 2 years ago

I suspect we are gonna drown in alerts, as we will be emitting this error log 8000 times per hour. Perhaps there is a gentler way of handling this. Something like an alert that is triggered if 1 or more unique attestations cannot be fetched from the oracle. But if you think this is the way I won't block this PR.

I think adding more elaborate logic will not help much it still results in the same problem. The current situation is only because the problem builds up over time because we try to fetch more and more missing attestations (as seen in the grafana query you linked). We have to detect the problem early, so we can react and don't run into the load.

I would get this into production and ignore alerts for specific even-ids where we know we cannot fetch them at the moment. This will be a bit cumbersome to set up, but I think this is the way to go. We could also handle this in rust or with e.g. an ignore file, but I don't think it's right to introduce code for this. In the future we will notice the problem early and can hopefully get it fixed before we see a lot of spam.

Isn't something like this sufficient?

bonomat commented 1 year ago

This project is unmaintained now.