racket / racket-pkg-website

A frontend for the Racket Package Catalog.
Other
11 stars 16 forks source link

Error when deleting a package #28

Closed SuzanneSoy closed 7 years ago

SuzanneSoy commented 7 years ago

When deleting a package, the deletion seems immediate, but the page takes forever to answer. I tried letting it finish, and got the following error:

Proxy Error

The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /pkgn/package/hyper-literate-broken/edit;(("k" . "(177 3 52879818)")).

Reason: Error reading from remote server

Apache/2.4.7 (Ubuntu) Server at pkgd.racket-lang.org Port 443

This is not a big problem, as the deletion seems to work successfully nevertheless.

tonyg commented 7 years ago

This problem can occur at least in the situation when the package-change-handler thread is busy (perhaps rerendering a large number of packages after an update). The package-deletion page is trying to ensure some kind of "read-your-writes" consistency by waiting for the deletion to definitely have propagated through the system before redirecting the user back to the main page. In the case I'm seeing, I think this leads to a long delay because the package-change-handler thread is too busy with other things. I will see if something similar happens when the package-change-handler thread is idle; if so, then my hypothesis is wrong and I will have to think harder about this.

If the hypothesis is correct, one mitigation might be for the package-change-handler to prioritise incoming small requests over long-running background refreshes somehow. I'm not sure if that'd work well. The design of the program is complicated in that little area.

tonyg commented 7 years ago

Nope, it happens even when package-change-handler is otherwise idle. But it's related to what I thought. Here's a stack trace from deleting a package:

2016-12-21 01:28:29.856860500 *** DAEMON CRASHED: static-renderer ***
2016-12-21 01:28:29.856861500 file-size: file not found
2016-12-21 01:28:29.856861500   path: /var/lib/pkgserver/public_html/pkg-index-static/pkg/digicore
2016-12-21 01:28:29.856862500   context...:
2016-12-21 01:28:29.856862500    /home/pkgserver/racket/collects/racket/file.rkt:723:0: file->bytes119
2016-12-21 01:28:29.856862500    /home/pkgserver/racket-pkg-website/src/static.rkt:213:2: for-loop
2016-12-21 01:28:29.856863500    /home/pkgserver/racket/collects/racket/private/more-scheme.rkt:261:28
2016-12-21 01:28:29.857183500    /home/pkgserver/racket-pkg-website/src/static.rkt:234:0: static-renderer-aws-s3
2016-12-21 01:28:29.857184500    /home/pkgserver/racket-pkg-website/src/static.rkt:120:0: static-renderer-main
2016-12-21 01:28:29.857185500    /home/pkgserver/racket/collects/racket/private/more-scheme.rkt:261:28
2016-12-21 01:28:29.857437500 
2016-12-21 01:28:29.857667500 *** DAEMON CRASHED: package-change-handler ***
2016-12-21 01:28:29.857667500 RPC exception:
2016-12-21 01:28:29.857667500 file-size: file not found
2016-12-21 01:28:29.857668500   path: /var/lib/pkgserver/public_html/pkg-index-static/pkg/digicore
2016-12-21 01:28:29.857668500   context...:
2016-12-21 01:28:29.857669500    /home/pkgserver/racket/collects/racket/file.rkt:723:0: file->bytes119
2016-12-21 01:28:29.857669500    /home/pkgserver/racket-pkg-website/src/static.rkt:213:2: for-loop
2016-12-21 01:28:29.857669500    /home/pkgserver/racket/collects/racket/private/more-scheme.rkt:261:28
2016-12-21 01:28:29.857953500    /home/pkgserver/racket-pkg-website/src/static.rkt:234:0: static-renderer-aws-s3
2016-12-21 01:28:29.857954500    /home/pkgserver/racket-pkg-website/src/static.rkt:120:0: static-renderer-main
2016-12-21 01:28:29.857954500    /home/pkgserver/racket/collects/racket/private/more-scheme.rkt:261:28
2016-12-21 01:28:29.857955500 
2016-12-21 01:28:29.857955500   context...:
2016-12-21 01:28:29.857955500    /home/pkgserver/racket-pkg-website/src/rpc.rkt:31:0: rpc-call
2016-12-21 01:28:29.858243500    /home/pkgserver/racket-pkg-website/src/site.rkt:1531:21
2016-12-21 01:28:29.858244500    /home/pkgserver/racket/collects/racket/private/more-scheme.rkt:261:28
2016-12-21 01:28:29.858430500 
tonyg commented 7 years ago

(Saving a package when the package-change-handler is busy is also problematic. So my original hypothesis stands, kind of. There are then two problems to fix: one re: trying to read deleted files, and one re: blocking urgent work behind a stack of less-urgent stuff.)

tonyg commented 7 years ago

We have two problems:

  1. deleting a package crashes in the UI server
  2. generally, requests to the package-change-handler can get stuck behind low-priority tasks

Here's an analysis of why deleting a package crashes in the UI server.

There's a TOCTTOU bug in that last step. In addition, the assumption in the penultimate step is invalid.

Potential fixes:

  1. make the synchronous RPC block until all of the necessary causal consequences will be visible to callers
  2. special-case deletion of a package to remove the static-path copy separately
  3. add a gratuitous pause in the UI server to let this server "usually" catch up
  4. poll the static-path directory in the UI server for changes
  5. refactor the server so it backgrounds only things without causal consequences

All of these require separate resolution of the TOCTTOU bug, which is independent of the bigger causal stuff.

Fix (1) is no good, because the inter-package interaction refresh might do rather a lot of work.

Fix (2) is no good, because it's a special-casing which works around just this one manifestation of the problem.

Fix (3) is no good, because it's a grody hack that will sometimes fail mysteriously.

Fix (4) is kind of disappointing but basically OK.

Fix (5) would be ideal but it's a long-term idea that I don't have time to pursue.

So I'll go with fix (4), probably.


To address the second, priority-confusion problem, I'll have to refactor package-change-handler in the UI server. I haven't done the necessary planning yet.

tonyg commented 7 years ago

Happily, a form of polling already exists, in that the package database is refetched every few minutes in its entirety. So I just need to fix the TOCTTOU problem to get deletion to stop faulting, I think.

tonyg commented 7 years ago

With those commits, I think this is fixed. @jsmaniac, please feel free to reopen or file new bugs if you notice further problems, and thank you for the report!