HDFGroup / h5serv

Reference service implementation of the HDF5 REST API

Unauthorised domain access/creation #115

Open swingingsimian opened 7 years ago

swingingsimian commented 7 years ago

Hi John

I have started a new thread, as this is a bit tangential to the site-wide/default root ACL. I managed to set a default read-only root ACL for the entire base domain, as detailed here: https://github.com/HDFGroup/h5serv/issues/105#issuecomment-310061548

I am attempting to test this by submitting a userless (unauthenticated) request to create a new file in an existing 'private' domain/folder:

>>> import h5pyd as h5py
>>> hfile = h5py.File('test_unauth.private.cegx.co.uk', 'w')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/site-packages/h5pyd-0.1.0-py3.5.egg/h5pyd/_hl/files.py", line 185, in __init__
    raise IOError(rsp.status_code, rsp.reason)
OSError: [Errno 500] Internal Server Error
>>> hfile = h5py.File('test1_unauth.private.cegx.co.uk', 'w')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/site-packages/h5pyd-0.1.0-py3.5.egg/h5pyd/_hl/files.py", line 185, in __init__
    raise IOError(rsp.status_code, rsp.reason)
OSError: [Errno 500] Internal Server Error

I had to fix up tocUtil.py a little to get this far (I can send a patch/pull request), but I still get this response:

HTTPServerRequest(protocol='http', host='test1_unauth.private.cegx.co.uk', method='PUT', uri='/', version='HTTP/1.1', remote_ip='127.0.0.1', headers={'Host': 'test1_unauth.private.cegx.co.uk', 'Accept': '*/*', 'Content-Length': '4', 'User-Agent': 'python-requests/2.18.1', 'Connection': 'keep-alive', 'Accept-Encoding': 'gzip, deflate'})
Traceback (most recent call last):
  File "server/app.py", line 3000, in put
  File "/usr/local/src/h5serv/server/tocUtil.py", line 128, in addTocEntry
    raise e
  File "/usr/local/src/h5serv/server/tocUtil.py", line 108, in addTocEntry
    raise IOError(errno.EACCES)  # unauthorized
OSError: 13

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/tornado/web.py", line 1509, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "server/app.py", line 3002, in put
TypeError: Can't convert 'NoneType' object to str implicitly
ERROR:tornado.access:500 PUT / (127.0.0.1) 25.90ms
INFO:h5watchdog.py:27::H5EventHandler -- Created file: ../data/private/test1_unauth.h5
INFO:h5watchdog.py:45::H5EventHandler -- Modified directory: ../data/private
INFO:h5watchdog.py:45::H5EventHandler -- Modified file: ../data/private/test1_unauth.h5
INFO:h5watchdog.py:45::H5EventHandler -- Modified file: ../data/private/test1_unauth.h5
INFO:app.py:3233::process_queue, got: ../data/private/test1_unauth.h5
INFO:app.py:3198::updateToc(../data/private/test1_unauth.h5)
INFO:app.py:3211::base domain: test1_unauth.private.cegx.co.uk
INFO:tocUtil.py:85::addTocEntry - domain: test1_unauth.private.cegx.co.uk filePath: ../data/private/test1_unauth.h5
INFO:tocUtil.py:91::tocFile: ../data/.toc.h5
INFO:hdf5db.py:163::init -- filePath: ../data/.toc.h5 mode: r+
INFO:hdf5db.py:194::Hdf5db __enter
INFO:hdf5db.py:713::getUUIDByPath: [/]
INFO:hdf5db.py:3047::db.getLinkItemByUuid(e545b29e-5680-11e7-a95c-0242ac110002, [private])
INFO:hdf5db.py:769::getGroupObjByUuid(e545b29e-5680-11e7-a95c-0242ac110002)
linkName: test1_unauth
INFO:tocUtil.py:109::createExternalLink -- uuid e5482664-5680-11e7-a95c-0242ac110002, domain: test1_unauth.private.cegx.co.uk, linkName: test1_unauth
INFO:hdf5db.py:769::getGroupObjByUuid(e5482664-5680-11e7-a95c-0242ac110002)
INFO:hdf5db.py:198::Hdf5db __exit

The file is actually created; it just appears that the toc entry failed. So it appears that the root ACL is not being applied to existing domains, nor is it restricting the creation of new domains.

Am I misunderstanding how the default root ACL works, or how non-public domains work? Is it possible to lock this down so only authenticated users can read/write/create new domains/files?

From doing an initial dive into the code I see that the RootHandler does verifyAcl for 'put' via getRootResponse, but this is hardcoded for 'read' permission. I will try and patch this up to pass the perm name through to getRootResponse.
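The change described above might look roughly like the sketch below. It is not h5serv's code: `verify_acl`, `get_root_response`, the `Forbidden` exception, and the ACL dictionary layout are all hypothetical stand-ins, used only to show a handler passing the required permission down instead of hardcoding 'read'.

```python
# Hypothetical stand-ins, not h5serv's actual functions or data layout.
class Forbidden(Exception):
    """Raised when the ACL does not grant the requested action."""


def verify_acl(acl, user, action):
    # Fall back to the 'default' entry when the user has no ACL of their own.
    perms = acl.get(user, acl.get("default", {}))
    if not perms.get(action, False):
        raise Forbidden("%s lacks '%s' permission" % (user, action))


def get_root_response(acl, user, action="read"):
    # The caller names the permission it needs; nothing is hardcoded here.
    verify_acl(acl, user, action)
    return {"user": user, "action": action}


# A read-only default ACL (assumed layout, for illustration only).
READ_ONLY_ACL = {"default": {"read": True, "create": False}}

# A GET-style call passes with the read-only default ACL.
get_root_response(READ_ONLY_ACL, "anonymous", action="read")

# A PUT-style call asks for 'create' and is rejected.
try:
    get_root_response(READ_ONLY_ACL, "anonymous", action="create")
    put_allowed = True
except Forbidden:
    put_allowed = False
```

With the permission passed through, the same read-only ACL lets a GET succeed while a PUT that needs 'create' is refused before any file is written.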

Please holler if this sounds wrong.

Thanks

jreadey commented 7 years ago

The ACL in the .toc.h5 is controlling who can update the toc, but doesn't have a bearing on the file in your "private" domain.

Longer term I'd suggest moving to the HSDS approach as outlined here: https://github.com/HDFGroup/h5serv/issues/105.

If the immediate concern is to prevent anonymous users from creating new files on the server, I can implement a config option to prevent that. Would that suffice as a near-term fix?

swingingsimian commented 7 years ago

Hi John

I would be interested in seeing the proposed config option. This may well be all we need. Re HSDS, we are still discussing this internally, but we have a few issues with the state of the project at present. Lack of compression would be a killer for us, so we'd need to wait for that at least.

Thanks

jreadey commented 7 years ago

Ok, I'll take a crack at the config option today. Compression for HSDS will be coming soon, I'll keep you posted.

jreadey commented 7 years ago

I've checked in an update to the develop branch that adds a new config option: "new_domain_policy".

This can be one of three values:

ANON: anonymous users can create domains (files)
AUTH: only authenticated users can create domains
NEVER: new domains can never be created via the REST API

The default is ANON, but you can modify config.py in the server directory or override this on the command line when you start h5serv. E.g.:

$ python app.py --new_domain_policy=NEVER

Let me know if this works for you.
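For readers following along, the three policy values can be read as the decision function sketched below. This is only an illustration of the semantics described above; `may_create_domain` and `VALID_POLICIES` are hypothetical names, and h5serv's real check may be structured differently.

```python
# Hypothetical sketch of the new_domain_policy semantics; not h5serv's code.
VALID_POLICIES = ("ANON", "AUTH", "NEVER")


def may_create_domain(policy, username=None):
    """Return True if a request may create a domain.

    `username` is None for an anonymous (unauthenticated) request.
    """
    if policy not in VALID_POLICIES:
        raise ValueError("unknown new_domain_policy: %r" % (policy,))
    if policy == "NEVER":
        return False  # creation via the REST API is disabled for everyone
    if policy == "AUTH":
        return username is not None  # only authenticated users
    return True  # ANON: anyone may create domains
```

Under this reading, the anonymous `PUT /` from the original report would be allowed with ANON and refused with AUTH or NEVER.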

swingingsimian commented 7 years ago

Hi John

Great, I tested this with --new_domain_policy=AUTH, and it seemed to work perfectly. Thanks very much.

As I mentioned previously, all this digging is due to a requirement to lock down access to the server. The docs suggest that if there is an ACL, then authentication is required: http://h5serv.readthedocs.io/en/latest/Authorization.html?highlight=authentication

But it seems this is not quite true, as anonymous/unauthenticated users seem to get assigned the default ACL. So the only way to get around this at present is to use the 'default' user to set a 'no perms' ACL, and then specifically add a read-only ACL for every sub domain and for every user. This seems like the wrong way to do it.

Am I missing something here? Should I start a new issue?

Thanks

jreadey commented 7 years ago

You are correct, anonymous requests were accepted unless the default ACL of the file prohibited it.

I've just added another config option, "allow_noauth", that you can set to False to return a 401 for any anonymous request.

So starting the server in "lock down mode" would be:

$ python app.py --new_domain_policy=AUTH --allow_noauth=False

Does this work for you?
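As a rough illustration of what "allow_noauth" is described as doing, the sketch below gates a request before any ACL lookup. The function name `early_status` and the convention of returning None to mean "carry on to the normal ACL checks" are assumptions made for this example, not h5serv's implementation.

```python
# Hypothetical sketch of the allow_noauth gate; not h5serv's code.
def early_status(allow_noauth, username=None):
    """Return 401 if the request must be rejected outright, else None.

    `username` is None for an anonymous request. None as a return value
    means the request proceeds to the usual per-domain ACL checks.
    """
    if username is None and not allow_noauth:
        return 401  # anonymous requests are refused in lock-down mode
    return None
```

With `allow_noauth` left enabled, anonymous requests fall through to the default ACL as before; with it set to False they are refused regardless of what the ACL says.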