Open joonas-fi opened 4 years ago
My intentional goal with this code was that any GUN peer could act as a CDN for delivering GUN & related modules, and that radata
file system could be remotely read for possible performance improvements and/or remote storage adapters.
But your https://github.com/amark/gun/issues/880 is accidental (community projects copy pasted code before your security vulnerability), so I will retire those files & contact owners of the other repos. Tho to be perfectly honest, I don't know how many other projects might have copies.
Ideally, I'd like to keep the GUN repo/root as a public/CDN, in the same way it is Open Source here.
Most developers know that data in GUN is public unless they encrypt it with https://gun.eco/docs/SEA , in which case the data is end-to-end encrypted so even relay peers can't decrypt the data in the radata
files.
However, most developers probably do not know how CDN-ish the server behaves, they probably only think/aware the examples/some-files are available, and not lead that to the conclusion of everything is CDNed.
I will update the docs next couple weeks, not sure how much that will improve awareness tho.
Unimportant last remarks: Most peers are deployed-via-GitHub/etc., or similar, so most devs are not gonna be running a forked version of GUN with secret files - if they do, they'd have GUN as an NPM dependency where they have their own app root. I've also started forcing people to use environment variables for any secrets rather than changing source code to add them or hardcoding them.
Proposed Solution: I retire https://github.com/amark/gun/issues/880 files, put a heavy warning in README + docs that servers are open / CDN-ify everything. (It is an important goal of mine that there is no "secret server side security" that ever happens, I'm a strong believer in peer-to-peer end-to-end encryption, never trust another machine to do security for you, and for me trying to keep everything open is a forcing hand, including being able to theoretically audit the code a CDN peer is running [tho yeah, they could lie about it].)
@joonas-fi ^ let me know your thoughts, please be harsh. Thanks!
This behaviour is very unconventional and surprising - it came as a very big surprise to me. Usually there's a subdirectory named static/
or public/
that is exposed to the internet.
This looked like a bug to me also because if this "serve everything" was by design then I would have expected it to handle error conditions as well (which are common to happen if you're serving a large amount of files). F.ex. requesting a file that doesn't exist doesn't give any response: http://gunjs.herokuapp.com/gun/thisShouldBe404 Which Heroku proxies just eventually terminate as timeouts:
But unconventional by itself is not bad, if you just make it un-surprising by making sure all the users are informed ("put a heavy warning in README + docs that servers are open / CDN-ify everything") then this is not un-reasonable from user security standpoint.
It is unconventional for centralized systems (most systems)! But not for stuff like BitTorrent, etc.
GUN is a P2P/decentralized database, anybody/anyone can replicate the data, intentionally, security is enforced through encryption & real cryptography with SEA.
For example, I'm hoping most people know that with Bitcoin all data there is stored publicly, so if one wants to add a private message to a transaction, it must be independently encrypted.
Same is true here.
So I strongly do not want people thinking "hiding files behind a server" is somehow an acceptable security model, it isn't, so I don't do it. I also just don't like the idea of closed-source systems that cannot be inspected (or audited!) from an external view.
I'll add a notice to the README that it CDN-ifies everything.
For clarity to others and as a future reference, it does NOT happen upon:
require
ing gun.Gun(opt)
initializing gun.Gun({web: server
even passing a server to gun.It DOES happen if you use the examples, npm start
, auto-deploy, etc. or if you pass Gun.serve
/ Gun.serve(__dirname)
explicitly to http.createServer
or some Express/etc. equivalent.
Errors serving handled here:
https://github.com/amark/gun/blob/50ee5c766f686da6ea2a67b5edb0e4ae15ba311c/lib/serve.js#L10-L14
Tho looks like I need to add that redirect to the try/catch
code further on below in the file (thanks for pointing that out!).
README updated!
https://github.com/amark/gun/commit/9dbbf856d33025e94760eb251705081b1ba990b3
Move to close: I know many may disagree with me on CDN-ification (I think that is fair), it is easy for them to avoid it by requiring & calling Gun(
. The biggest problem has been lacking documentation, which has been fixed.
Happy to leave this open for more exposure, but I also tweeted about it, so will close in the future unless people object.
I feel the wording is still not clear enough:
CDN-ify all GUN files, modules, & storage.
I'd explicitly mention "all the database content, files and subdirectories from the project/repo's root directory are remotely readable over HTTP"
I'd like to hear more of your thoughts on some of these subjects:
To improve on my point, I can re-arrange my suggestion: "all the files, subdirectories from the project/repo's root directory and raw database content are remotely readable over HTTP"
My main point was not the database content. An example of what I worry about:
Procfile
file format, which there is an encouraging example file for: https://github.com/amark/gun/blob/master/ProcfileI want to use S3 as the storage backend, since the repo provided an example procfile (as an encouragement to use one) and procfiles are often used to relay ENV vars to the app-to-be-started, I'll modify the procfile to have:
web: AWS_ACCESS_KEY_ID="foo.." AWS_ACCESS_KEY_SECRET="bar.." node --optimize_for_size --gc_interval=100 examples/http.js
Now a malicious user can fetch my procfile and steal my secrets. GUN's surprising security model betrayed my expectations as a user.
Does the user really understand that if the user edits the example Procfile in its example location, it's dangerous to add secrets to it?
Please don't take my procfile example as the only example where a user gets in danger. Adding "don't add secrets here" -comment to the procfile is not the solution. I'll give a credentials.env
example later (think: any file the user puts in the root directory).
All because I still didn't understand from..
CDN-ify all GUN files, modules, & storage.
.. that it means that all the directory's files are remotely readable. Do you really expect all your users to understand what "CDN-ify" means? Even if I know what a CDN means, I still feel it's vague - isn't "remotely readable over HTTP" simpler and easier to understand?
Let's dissect what you disclose as being readable:
GUN files
- what files are "GUN files"? All source code? Source code including readmes etc.?
credentials.env
which I'll source (think Bash $ source ./credentials.env
) from my process startup script will become remotely readablemodules
- this probably means node_modules/storage
- this maybe means database content, but still it's less clear way than just saying "raw database content"GUN works in an unconventional way (users don't expect project's root directory to be remotely readable) which is dangerous for the user if the user does not understand the implications. Therefore, this needs to be communicated in as clear way as possible. Currently we disagree on how to communicate this clearly.
https://github.com/amark/gun/blob/50ee5c766f686da6ea2a67b5edb0e4ae15ba311c/lib/serve.js#L29
(Sidenote: also the code under that looks vulnerable to path traversal, but it isn't because
CDN()
is called before which mutates the request object with sanitized URL - this is very unconventional practice and makes following/auditing/maintaining code hard)If request begins with
/gun/
, the rest of URL is served from GUN's root directory (same as this repo's root).Is this something that the users are aware of? Can you say for certain that users won't store config files/secrets under that directory? To me this is unconventional behaviour, and each user would need to be aware that everything under (including subdirectories) in this dir are world-readable.
I was able to read the
stats.radata
file that is written to the app directory: http://gunjs.herokuapp.com/gun/stats.radataSame goes for
Procfile
etc etc (I could very well see myself thinking this is a file in which I would think saving secrets in is ok): http://gunjs.herokuapp.com/gun/ProcfileAlso, my own test instance with default config writes database files to
<root>/radata/
. One of the files in there is!
(the next is%1C
- I don't know the naming behaviour) - I was able to read this with$ curl -v --path-as-is http://localhost:8765/gun/radata/!
. Is it intended behaviour that the whole database on-disk format is world-readable?