ndmitchell / hoogle

Haskell API search engine
http://hoogle.haskell.org/

Fix traversal bug #306

Closed gaverhae closed 5 years ago

gaverhae commented 5 years ago

There is a bug in the current path handling code that allows an attacker with direct access to the hoogle server to force it to return arbitrary files that the server process can read. The attack does not seem to work through an nginx proxy such as the setup on hoogle.haskell.org.

Repro on current master:

$ git clone git@github.com:ndmitchell/hoogle.git
Cloning into 'hoogle'...
[...]
$ cd hoogle
$ git rev-parse HEAD
4fc77219521055ac96eca9908b4fb9a63ec9a0c5
$ stack init
[...]
* Matches nightly-2019-05-27

Selected resolver: nightly-2019-05-27
Initialising configuration using resolver: nightly-2019-05-27
Total number of user packages considered: 1
Writing configuration to file: stack.yaml
All done.
$ stack build
[...]
$ stack exec -- hoogle generate --local
Starting generate
Reading ghc-pkg... 0.12s
[11/143] attoparsec... 0.06s
[16/143] basement... 0.23s
[35/143] cryptonite... 0.10s
[52/143] ghc... 2.40s
[76/143] memory... 0.02s
[116/143] transformers... 0.08s
[141/143] zlib... 0.03s
Packages missing documentation: hoogle rts
Found 134 warnings when processing items

Reordering items... 0.02s
Writing tags... 0.15s
Writing names... 0.14s
Writing types... 0.89s
Took 10.31s
$ echo hi > hello
$ stack exec -- hoogle server --port=8080 --local > log.txt &
[1] 35053
$ curl localhost:8080/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/hello
hi
$

This PR tries to solve that problem by bridging the existing gap between URI parsing for replays (which does attempt to prevent this kind of issue) and URI parsing for the server. For some reason, the two use completely different code paths.
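
For readers unfamiliar with this class of bug, here is a minimal sketch of the general idea: split the request path into segments, percent-decode each one, and reject the request if any decoded segment could step outside the served directory. This is not the code from this PR or from hoogle itself. The names `percentDecode`, `splitSegments` and `safeSegments` are hypothetical and only base is used.

```haskell
import Data.Char (chr, digitToInt, isHexDigit)

-- Hypothetical helper: decode %XX escapes, leaving malformed escapes untouched.
percentDecode :: String -> String
percentDecode ('%':a:b:rest)
  | isHexDigit a && isHexDigit b =
      chr (16 * digitToInt a + digitToInt b) : percentDecode rest
percentDecode (c:rest) = c : percentDecode rest
percentDecode [] = []

-- Hypothetical helper: split a raw URL path on '/', dropping empty segments.
splitSegments :: String -> [String]
splitSegments s = case break (== '/') s of
  (seg, [])     -> [seg | not (null seg)]
  (seg, _:rest) -> [seg | not (null seg)] ++ splitSegments rest

-- Hypothetical check: return the decoded segments, or Nothing if any of them
-- is "." or "..", or contains a path separator after decoding (e.g. from %2f).
safeSegments :: String -> Maybe [String]
safeSegments rawPath
  | any bad segs = Nothing
  | otherwise    = Just segs
  where
    -- Split first, then decode, so an encoded separator cannot create
    -- new segments behind the check's back.
    segs = map percentDecode (splitSegments rawPath)
    bad seg = seg `elem` [".", ".."] || any (`elem` "/\\") seg
```

With this sketch, `safeSegments "/%2e%2e/%2e%2e/hello"` evaluates to `Nothing`, while an ordinary path such as `"/docs/index.html"` yields `Just ["docs","index.html"]`. Validating after decoding is the important design choice: a check that only looks for a literal `..` in the raw URL misses the `%2e%2e` form used in the repro above.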

ndmitchell commented 5 years ago

Thanks for the patch. I'm going to need to study this one in detail so will review Thursday.

ndmitchell commented 5 years ago

Very nice work - thanks a lot!