dashersw / cote

A Node.js library for building zero-configuration microservices.
http://cote.js.org
MIT License

Status Logging #33

Closed by brandonbird 7 years ago

brandonbird commented 8 years ago

I have a pretty simple Requester/Responder setup running on a single dev machine at the moment. There is a set of main app instances, each with a requester, and a set of workers, each with a responder.

1) Is it expected behavior for the Requesters/Responders to go online/offline every few seconds?
2) If you set statusLogsEnabled: false in the options, it only disables logging for the offline status; the online status is still logged (repeatedly).

dashersw commented 8 years ago

Hello! It's certainly not expected. They should stay online for as long as they are running. One possible culprit is this: online/offline messages don't depend on an actual open connection between components, but on health checks and status updates. Every 2 or 3 seconds, each component fires a hello message saying that it is online. If a component doesn't receive a hello message from another component in time, it marks the latter as "offline". When it receives one again, it marks that component as "online".
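
To make the mechanism concrete, here is a minimal sketch of hello-based liveness tracking. It is not cote's internal code: the class name, method names, and the 3000 ms timeout are assumptions chosen for illustration, and timestamps are passed in explicitly so the behavior is easy to follow.

```javascript
// Hypothetical sketch of hello-based liveness tracking (not cote's code).
class LivenessTracker {
  constructor(timeoutMs) {
    this.timeoutMs = timeoutMs; // how long a peer may stay silent
    this.lastSeen = new Map();  // peer id -> time of last hello (ms)
    this.online = new Set();    // peers currently considered online
    this.log = [];              // status transitions, in order
  }

  // Called whenever a hello message arrives from a peer.
  hello(peerId, now) {
    this.lastSeen.set(peerId, now);
    if (!this.online.has(peerId)) {
      this.online.add(peerId);
      this.log.push(`${peerId} online`);
    }
  }

  // Called periodically: any peer whose hello is overdue goes offline.
  check(now) {
    for (const peerId of this.online) {
      if (now - this.lastSeen.get(peerId) > this.timeoutMs) {
        this.online.delete(peerId);
        this.log.push(`${peerId} offline`);
      }
    }
  }
}

const tracker = new LivenessTracker(3000);
tracker.hello('worker-1', 0);    // first hello: marked online
tracker.check(2000);             // still within the timeout
tracker.check(4500);             // hello overdue: marked offline
tracker.hello('worker-1', 5000); // next hello: marked online again
console.log(tracker.log);
```

Note that the status only reflects whether hellos arrived on time. It says nothing about whether requests can still be delivered.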

So it might be that you changed these defaults, or that your network has problems with multicast/broadcast (unlikely). Another possible cause: if you have very high CPU load, or a lot of synchronous operations, you may prevent these hello messages from being sent on time. If you have a synchronous loop that runs for 4 seconds, for example, you will certainly miss a hello message, so it's natural for other components to mark this component as offline.
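
The effect of a long synchronous loop can be sketched with a small deterministic simulation. Everything here is an illustrative assumption (the function name, the 2000 ms hello interval, the 3000 ms timeout), and the model is simplified: a timer that falls due while the event loop is blocked fires once when the blocking work ends, and later timers keep their original schedule.

```javascript
// Hypothetical simulation: how blocked hellos show up as status flapping.
// Returns the status changes an observing component would record.
function simulateHellos({ intervalMs, timeoutMs, blockStart, blockMs, endMs }) {
  const blockEnd = blockStart + blockMs;
  const events = [];
  let lastHello = 0; // assume a hello was sent at t = 0

  for (let due = intervalMs; due <= endMs; due += intervalMs) {
    // A hello due during the synchronous work is delayed until it finishes.
    const sentAt = due >= blockStart && due < blockEnd ? blockEnd : due;
    if (sentAt === lastHello) continue; // delayed hellos collapse into one

    if (sentAt - lastHello > timeoutMs) {
      // The observer gave up waiting, then saw the sender come back.
      events.push({ t: lastHello + timeoutMs, status: 'offline' });
      events.push({ t: sentAt, status: 'online' });
    }
    lastHello = sentAt;
  }
  return events;
}

// A 4-second synchronous loop starting at t = 3000 ms.
const flapping = simulateHellos({
  intervalMs: 2000, timeoutMs: 3000, blockStart: 3000, blockMs: 4000, endMs: 10000,
});
console.log(flapping);
```

With these numbers the hello due at 4000 ms is only sent at 7000 ms, so the observer records an offline transition at 5000 ms and an online transition at 7000 ms. With `blockMs: 0` the list of transitions is empty.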

As for the second point, I didn't know that was the case. I'll look into it and fix it.

brandonbird commented 8 years ago

Interesting. Thanks for the info. Maybe it has something to do with the way the requesters are being initialized in the app (it's built on LoopBack - the requester is set up when a mixin is applied to a model. The responder is launched by PM2 separately). I'll try refactoring some stuff and see if it changes anything. Regardless, the messaging still works: a responder will still pick up a request, even though the statuses are flapping. Maybe that's just luck though.

dashersw commented 8 years ago

Well, cote is built for exactly this purpose: high availability. As long as the requester doesn't die, you won't lose any messages, even if you have unstable responders.
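
The general idea behind that guarantee can be sketched as a requester that holds on to requests until some responder is available. This is not how cote is implemented internally; the class and method names below are hypothetical and only illustrate why a flapping responder does not have to mean lost messages.

```javascript
// Hypothetical sketch of a requester that buffers while no responder is online.
class BufferingRequester {
  constructor() {
    this.queue = [];       // requests waiting for a responder
    this.responder = null; // handler function, or null when none is online
    this.results = [];     // responses received so far
  }

  // Queue the request, then deliver it right away if possible.
  send(request) {
    this.queue.push(request);
    this.flush();
  }

  responderOnline(handler) {
    this.responder = handler;
    this.flush(); // drain anything queued while offline
  }

  responderOffline() {
    this.responder = null;
  }

  flush() {
    while (this.responder && this.queue.length > 0) {
      this.results.push(this.responder(this.queue.shift()));
    }
  }
}

const requester = new BufferingRequester();
const work = (req) => `done ${req.id}`;

requester.send({ id: 1 });       // no responder yet: stays queued
requester.responderOnline(work); // request 1 is delivered
requester.responderOffline();    // responder flaps
requester.send({ id: 2 });       // queued again
requester.responderOnline(work); // request 2 is delivered
console.log(requester.results);  // [ 'done 1', 'done 2' ]
```

The queue lives in the requester's memory, which is why the guarantee only holds as long as the requester process itself stays alive.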