PX4 / flight_review

web application for flight log analysis & review
https://logs.px4.io/
BSD 3-Clause "New" or "Revised" License

Site-wide denial of service in search box on browse page #257

Closed. DataKinds closed this issue 1 year ago

DataKinds commented 1 year ago

Hi all. This issue describes a denial-of-service (DoS) condition that a user can easily trigger by accident on a live Flight Review instance.

High level description

Making a request to /browse_data_retrieval seems to block requests to the rest of the app. With thousands of logs uploaded, a single request can block for several seconds or more. The request is issued on every input to the search box on /browse, so a user can easily bring down the Flight Review server for an extended time by accident, just by typing into the search box and leaving the page open.

This can block long enough to produce 504 timeout errors from Nginx or cause the Bokeh JS wrapper to fail to connect.
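The symptom described above is consistent with a synchronous call running on a single-threaded event loop, although nothing in this thread confirms that as the root cause. The following is a minimal sketch of that mechanism using only the standard library's asyncio, not Flight Review's actual code. slow_query, blocking_handler, offloaded_handler and measure are hypothetical names: slow_query stands in for whatever lookup backs /browse_data_retrieval.

```python
import asyncio
import time


def slow_query(seconds: float) -> str:
    """Hypothetical stand-in for a synchronous database lookup."""
    time.sleep(seconds)
    return "rows"


async def blocking_handler(seconds: float) -> str:
    # Calls the synchronous function directly, so the event loop
    # cannot run any other coroutine until it returns.
    return slow_query(seconds)


async def offloaded_handler(seconds: float) -> str:
    # Runs the same function in the default thread pool, leaving the
    # event loop free to serve other work in the meantime.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, slow_query, seconds)


async def other_request_latency(handler, seconds: float) -> float:
    """Time a trivial 10 ms task scheduled alongside `handler`."""
    start = time.monotonic()
    slow = asyncio.ensure_future(handler(seconds))
    await asyncio.sleep(0)     # yield once so the slow handler starts first
    await asyncio.sleep(0.01)  # the "other request"
    latency = time.monotonic() - start
    await slow
    return latency


def measure(handler, seconds: float = 0.2) -> float:
    """Return how long the unrelated task took while `handler` was running."""
    return asyncio.run(other_request_latency(handler, seconds))


if __name__ == "__main__":
    print("blocking handler :", round(measure(blocking_handler), 3), "s")
    print("offloaded handler:", round(measure(offloaded_handler), 3), "s")
```

With the blocking handler, the unrelated task cannot finish until the simulated query returns; with the offloaded handler it finishes in roughly its own 10 ms. Whether offloading is the right mitigation for Flight Review depends on how the real handler accesses the database.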

Steps to reproduce

  1. Open a log (in the /plot_app endpoint) in a new tab.
  2. Open the /browse endpoint, and open your browser's devtools to the network request tab.
  3. Begin typing in the search box until your devtools are full of network requests to /browse_data_retrieval.
  4. Switch back to the /plot_app tab and refresh.
  5. The refresh will block until the last of the /browse_data_retrieval requests has been served.
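The steps above can be approximated with a small timing script, intended only for a local test instance that you control. This is a sketch under assumptions: timed_get and probe_while_searching are hypothetical helper names, and the issue does not state the exact query strings for either endpoint, so both paths are passed in as parameters rather than guessed.

```python
import http.client
import threading
import time
import urllib.request


def timed_get(url: str, timeout: float = 30.0) -> float:
    """Return the seconds spent on a GET to `url`, whether or not it succeeds."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        pass
    return time.monotonic() - start


def probe_while_searching(base_url: str, search_path: str, probe_path: str,
                          n_searches: int = 20, timeout: float = 30.0) -> float:
    """Fire `n_searches` concurrent search requests, then time one probe request.

    Roughly mirrors steps 3 to 5: many /browse_data_retrieval requests are in
    flight while another page is refreshed.
    """
    threads = [
        threading.Thread(target=timed_get, args=(base_url + search_path, timeout))
        for _ in range(n_searches)
    ]
    for t in threads:
        t.start()
    time.sleep(0.05)  # give the search requests a head start
    latency = timed_get(base_url + probe_path, timeout)
    for t in threads:
        t.join()
    return latency
```

For example, probe_while_searching("http://localhost:5006", "/browse_data_retrieval", "/plot_app") would report how long the probe took, where the host, port and bare paths are placeholders to replace with the values of your own instance. Comparing that figure against timed_get on the same probe URL with no searches in flight gives a rough measure of the blocking.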

(If you're unlucky and your local instance shows the behavior described under Errata below, the first request will block forever and /plot_app will never refresh. The five steps above do seem to reproduce the behavior on the live http://review.px4.io/ instance, though.)

Errata

Sometimes the search on the /browse page fails with an AJAX error. This seems to immediately return the server to a working state (it probably kills the pending network connections; I haven't been able to reproduce it in the last hour, so I can't check).

On our local instance, despite being at parity with PX4/flight_review, the requests to /browse_data_retrieval seem to block forever. I'm not sure whether this comes from a difference in DB setup, browser configuration, or deploy environment configuration, but it has forced us to remote in and hard-reboot the Flight Review server on multiple occasions when we couldn't locate the faulty connected client.

bkueng commented 1 year ago

Hi, thanks for reporting. There is something off indeed. Do you have time to look into this a bit further?

DataKinds commented 1 year ago

I'll likely be looking into this over the next week in order to patch it in our in-house instance. I'd be happy to submit a PR upstream once that work is done.

bkueng commented 1 year ago

Cool, thanks. Changing the tornado version might already help.

bkueng commented 1 year ago

Hi @DataKinds, did you find anything?

bkueng commented 1 year ago

Fixed in https://github.com/PX4/flight_review/commit/5ced4b0848d4661d9b77a87293afebc6d4abc0c0