Open amandam11 opened 1 year ago
HI Amanda, questions like these are good because then I can add extra documentation to the "How the SDK works" docs :-)
The docs for repository readiness are here: https://docs.featurehub.io/featurehub/1.7.0/sdks-development.html#_repository_readiness
Because the repository is a cache, which depending on how the connector connects and updates - may be out of date for a period of time, its important that it stays Ready once it has received its first set of features. So once it moves from Not Ready -> Ready, it shouldn't ever (except in one circumstance) move back to Not Ready. It would only change to 'Failed".
The situation whereby it will return to not ready is when you are using a server evaluated key and you change the context and "build" - then the repository does not reflect in any real sense the right feature values, so it swaps back to NotReady.
Its quite important particularly for node servers not to stop being ready otherwise if your FeatureHub install goes down, then all your server apps would go down as well, and if your FeatureHub install is down, its not updating any features so its not really a problem there is no connection.
HI Amanda, questions like these are good because then I can add extra documentation to the "How the SDK works" docs :-)
The docs for repository readiness are here: https://docs.featurehub.io/featurehub/1.7.0/sdks-development.html#_repository_readiness
Because the repository is a cache, which depending on how the connector connects and updates - may be out of date for a period of time, its important that it stays Ready once it has received its first set of features. So once it moves from Not Ready -> Ready, it shouldn't ever (except in one circumstance) move back to Not Ready. It would only change to 'Failed".
The situation whereby it will return to not ready is when you are using a server evaluated key and you change the context and "build" - then the repository does not reflect in any real sense the right feature values, so it swaps back to NotReady.
Its quite important particularly for node servers not to stop being ready otherwise if your FeatureHub install goes down, then all your server apps would go down as well, and if your FeatureHub install is down, its not updating any features so its not really a problem there is no connection.
Thank you for your explanation, and quick response. Sorry I missed that in the docs. I misunderstood from reading the SDK README (https://github.com/featurehub-io/featurehub-javascript-sdk#what-is-the-deal-with-readyness), so a link from the README to the page you suggested above might make it clearer for others to avoid confusion. It would be nice if from the logs we could tell that we were using cached features. For example if there was a network issue that meant that we couldn't reach FeatureHub, but FeatureHub was up, at the moment we wouldn't realise this without having featurehub trace on. Or a way for us to test to see if we have an active connection, that we can then log. So that in case of a network issue we can see that.
If I exposed the Edge connector as a queryable status for a prometheus metric, would that be more useful? Or should we do both? I am thinking of looking for an env var that would trigger logging connection state?
Oh, and by the way, you didn't miss it in the docs :-) I wrote it to clarify how the readiness works essentially as a FAQ and for any new SDKs, we keep adding to the SDKs docs as we build new SDKs.
Would maybe a different FHLog level for Edge connector issues be more suitable then you can control it? Suggestions are welcome!
Oh, and by the way, you didn't miss it in the docs :-) I wrote it to clarify how the readiness works essentially as a FAQ and for any new SDKs, we keep adding to the SDKs docs as we build new SDKs.
Great addition to the docs then -> it's nice and clear!
What we've ended up doing is having to pick out the logs whereby we see "refreshing connection in case of staleness" AND it has error.message - as that seems to be the case whereby we've actually got a problem connecting, compared to the "refreshing connection in case of staleness" which just has {"type":"error"} as we didn't want to also log all the standard refresh connection messages. Then to know we've got it back then we need to find the ack messages, but again to not log all the standard stale connection re-establishment, we couldn't distinguish between re-establishment due to just the standard drop/connect, and a re-establishment based on actually having a problem with the connection. So we ended up having to keep track of whether we'd previously had an error connection, so keeping state in our abstraction layer.
We ended up, deciding logging was enough - so some easier way of picking up the logs would make it easier, so if those were at a new log-level that would make it easier. Ideally if it was easier to find out if we were recovering a previously failed connection, compared to a just refresh from staleness would be great. But I'm not sure if actually there really isn't any difference, as you only notice the failure because you are refreshing the connection - so essentially the logic really is the same in both cases.
Ok, I'll have a think about how I can better surface this as SSE is often quite tricky to ascertain if you got an actual error or the connection dropped because of staleness as you say. Maybe a threshold of failure? like if you get a reconnect failure X times in a row we should start logging it?
Hey, Somewhat related to this, is it possible to identify an issue after connectivity was established? For example, say there's a networking issue to FeatureHub server and not the server itself. (BTW, I'm using the node.js SDK).
I ran a local node.js app that successfully connected to FH server, and then I just disconnected my LAN cable to see if some kind of error will appear, but didn't get anything in the logs nor did the readiness status change.
I think it's important because the FH server might still be up&running allowing users to still change flags but the app itself doesn't get updates.
Yes, it is probably worthwhile to add a callback when there is any kind of connectivity issue. There are some changes coming down the line around how SSE and polling are handled as well that have been trialled in a couple of other SDKs so we will be sure to add this capability in.
Describe the bug If my client code successfully manages to connect to FeatureHub at startup, then readiness is reported as Ready. I have added both a timer to regular check readiness, and also registered a ReadinessListener. However, if I stop featurehub then the client still always report that the readyness is true.
Which area does this issue belong to?
To Reproduce Steps to reproduce the behavior:
The readyness listener has a simple log statement to report the readyness, eg..
console.log('Raw listener readyness is now ' + readiness);
The timer also reports readyness, and simply has:
Whereby the VERBOSE log is from the fhLog trace, and the readynesslistener is reporting the NotReady and Ready, and the timer then goes off and sees ready.
VERBOSE received ack,[object Object] VERBOSE received features,[object Object] LOG Readyness is now Ready
[{"type":"error"}]
[{"type":"error","message":"getaddrinfo EAI_AGAIN featurehub"}]