Yesterday restish failed to refresh an API definition, and it took me an hour to figure out what happened.
I'll go through the debugging story and make a few suggestions on how this could be avoided.
Because these suggestions might be controversial, I've chosen to open this card to discuss, rather than make a PR.
I can make a PR for whatever direction we chose to go (if any).
The issue happened as follows:
I tried to make a restish call with an extra Content-Type header passed in as a CLI flag
Restish checked the API definition cache, noticed it was expired and decided to refresh it by loading the definition from the server
Restish applied the Content-Type header to the call to /openapi.json
My upstream honoured the Content-Type header and sent down the definition in the requested (not json) format
Restish stored the response in its file based CachedTransport
Restish failed to parse the definition
When I ran restish again, without the extra header, it would fail again, because it would serve the response from the CachedTransport (I suppose the headers are not part of the cache invalidation key).
If you run with debug logging on this is invisible: there's no way to tell that restish did not in fact run the request, but straight up loaded the response from the cache.
This lead me to spend a considerable amount of time looking at my upstream server to figure out why content-negation was broken.
I fixed restish by explicitly passing --rsh-no-cache.
Clearly this is an esoteric edgecase with a simple workaround, so I understand if we chose not to fix it.
However, I would suggest the following changes to prevent this and similar issues:
I think we should not apply headers passed in from the CLI to the remote call fetching the definition. We should only use the one's defined in the API config. The headers are almost always going to passed to modify the behaviour of the API call, not the service discovery call. In the best case they are ignored by the upstream, but they can lead to unintended consequences like this.
I don't think we need 2 caches. We already cache the resulting parsed API definition, which eliminates most calls to the upstream to fetch the openapi document. The additional cached transport doesn't add any extra value, but it can break things. Especially if restish failed to parse the spec, it should not reuse the a cached response from the server.
Yesterday restish failed to refresh an API definition, and it took me an hour to figure out what happened.
I'll go through the debugging story and make a few suggestions on how this could be avoided. Because these suggestions might be controversial, I've chosen to open this card to discuss, rather than make a PR. I can make a PR for whatever direction we chose to go (if any).
The issue happened as follows:
Content-Type
header passed in as a CLI flagContent-Type
header to the call to/openapi.json
Content-Type
header and sent down the definition in the requested (not json) formatWhen I ran restish again, without the extra header, it would fail again, because it would serve the response from the
CachedTransport
(I suppose the headers are not part of the cache invalidation key). If you run with debug logging on this is invisible: there's no way to tell that restish did not in fact run the request, but straight up loaded the response from the cache. This lead me to spend a considerable amount of time looking at my upstream server to figure out why content-negation was broken.I fixed restish by explicitly passing
--rsh-no-cache
.Clearly this is an esoteric edgecase with a simple workaround, so I understand if we chose not to fix it.
However, I would suggest the following changes to prevent this and similar issues: