mxcube / mxcubeweb

MXCuBE-Web
http://mxcube.github.io/mxcubeweb/
GNU Lesser General Public License v3.0
21 stars 38 forks source link

Ensure all REST endpoints respond with JSON #1393

Open fabcor-maxiv opened 1 week ago

fabcor-maxiv commented 1 week ago

Follow-up to https://github.com/mxcube/mxcubeweb/pull/1392. In this case we had an endpoint that was not returning JSON. We should investigate if it is possible to make all our REST endpoints return JSON content (if they return content at all), even in case of error (4xx and 5xx).

Unless we have a good reason not to return JSON for some specific cases?

Maybe it can be enforced in a systematic manner, instead of making it for each REST route individually.

And of course we should check that the JavaScript UI frontend expects JSON from the server.

elmjag commented 1 week ago

For 500 Internal Server Error replies, I I don't really see a big point of getting a JSON reply.

Normally 500 reply means that there was some unexpected exception in the back-end python code. In that case you want to see the traceback of that exception. It is very helpful when that traceback is included into the reply body, as text with correct line breaks.

If the traceback, together with request method and URL ended up in the javascript console, that would be very nice, and save time when debugging these issues.

elmjag commented 1 week ago

And of course we should check that the JavaScript UI frontend expects JSON from the server.

Frontend does expect application/json replies: https://github.com/mxcube/mxcubeweb/blob/develop/ui/src/api/addons/safeJson.js#L7

axelboc commented 1 week ago

I would generally think that sending a backend stack trace is a significant security flaw as it reveals the intervals of the application... but since the codebase is public it probably doesn't matter.

Putting this stack trace in a JSON response body still makes sense for consistency. We don't want to have to detect and handle 500 errors specifically in every try/catch block on the front-end. We want our error handling code to be simple and generic. Stringified \n line breaks can easily be dealt with.

fabcor-maxiv commented 1 week ago

For 500 Internal Server Error replies, I I don't really see a big point of getting a JSON reply.

Whenever something goes wrong on the Python backend the JavaScript UI frontend should be able to show meaningful error messages to the user, so the frontend needs to get the error message from the backend, parse it and translate it into a human readable message. That is for the parsing step that JSON makes sense. Typically there would be some JSON like this:

{
  "error": "NothingWorksError",
  "details": "Nothing works, everything is broken",
  "traceback": "..."
}

Indeed it is debatable whether pushing the traceback to the front-end makes sense or not. I would say yes, but indeed there is always a risk of leaking some sensitive info. In the case of MXCuBE I feel like there is not much sensitive info. On the other hand, being able to display the traceback in the browser might help us getting better bug reports from the users: they can send us a copy-paste of the backend traceback.

fabcor-maxiv commented 1 week ago

Right, so, not completely unexpected, there is a bit of a mess attached to this...

Apparently, according to @elmjag, there are many cases where an error happens on the back-end, the trace-back is NOT logged on the back end (!!?!?) but the trace-back is sent to the frontend as 500 (or even 200 in some cases!?!?) as plain text.

  1. So clearly, the logging on the Python back end side is in an awfully bad shape. This needs to be fixed. Exceptions should be logged!
  2. The somewhat "advantage" of exceptions being sent to the frontend as plain text instead of JSON, is that the traceback is at least easy to read in the browser console! Ha! Indeed reading a traceback in a JSON document is way more annoying because the new lines are replaced by literal \n and the whole traceback is on a single line.
axelboc commented 1 week ago

I'm not 100% opposed with having 500 stack traces as text; we can always work with it somehow. If it's JSON, we can register a global listener for unhandled errors to make sure they are always well formatted when logged to the console.

Whatever we decide, it just has to be consistent.

fabcor-maxiv commented 1 week ago

I'm not 100% opposed with having 500 stack traces as text

That is definitely not my favourite solution. But as Elmir pointed out, it has the advantage of being immediately readable in the web developer tools without having to do anything. That is why we mentioned it.

If it's JSON, we can register a global listener for unhandled errors to make sure they are always well formatted when logged to the console.

Yes, that sounds good, that would be my preferred solution. For the JavaScript frontend side, having all the 400s and 500s logged properly in the web developer console (with readable traceback if any) would be great. I guess it is partially the case already, and I assume it should be relatively straightforward to change whatever we want to change in a centralized manner...

... but on the backend, I am a bit worried about the situation, especially the logging of exceptions. We'll see what we can do there.

elmjag commented 1 week ago

Just to clarify the situation on tracebacks.

Currently, the tracebacks in API handlers only go into the response body.

The only practival way to fish it out is by finding the request in the network tab of developer tools. Which I may add is still not trivial, as the status code is still 200. Which means it's not obvius at all which request failed.

Of cource, we should improve the current state of affairs. It shoud be easy to figure out which API request fails. Prefereably you should be able to see the nicely formated traceback somewhere as well.

marcus-oscarsson commented 1 week ago

Yes, It would be great if we could improve the exception handling and it is really not wonderful that some exceptions gets hidden.

I would actually prefer if the exceptions are logged properly with the python logger and not sent to the front end. We have not made any effort in that direction earlier but there is no real reason for sending a stack-trace to the client, other than debugging or possibly being very (too :) ) transparent with the user.

I assume all developers also have access to the application log and look at it while developing/debugging so that would be enough ? I also don't have anything against if we send the stack-trace to the front-end but I don't really see the use of it.