FerretCode / locomotive


suggestion: error rate / disabling crash on error count #11

Open arnorhs opened 2 months ago

arnorhs commented 2 months ago

So locomotive frequently crashes for me on Railway. It "chugs along" fine for many days, but every now and then there's a series of errors when connecting to the Railway API, with something like this in the log:

err:
["failed to WebSocket dial: expected handshake response status code 101 but got 502"](https://railway.app/project/58cd7410-04d6-4c6e-a74d-fa7c78455231/logs?filter=%40deployment%3A9992f1a0-dbe6-4830-9089-f49030b7eb2d+-%40replica%3A176b49c7-30ab-4dc9-8dfc-a39ba8d54d8f+%40err%3A%22failed+to+WebSocket+dial%3A+expected+handshake+response+status+code+101+but+got+502%22)

I've upped the error limit env variable, but that just delays the problem.

It would be nice if the error rate tracking took time into consideration before crashing. Either that, or disable the feature altogether and just let it error periodically without crashing.

This is not a service error; it's just a network connectivity issue.

brody192 commented 2 months ago

I do see this error on my service too, so I have implemented a different error-handling method that should be far more robust.

I'm testing it in my project until I see some errors, and will open a PR if that goes well.