HDFGroup / h5serv

Reference service implementation of the HDF5 REST API

Site-wide ACL #105

Closed ghost closed 7 years ago

ghost commented 7 years ago

I have a use case where I need the ability to manage ACLs that get applied to all domains. @jreadey, based on your comment, could the server fall back to the .toc.h5 file ACL whenever the domain-specific ACL passes?

swingingsimian commented 7 years ago

Hi John, was any progress made on this? I see that there are references to a 'root' ACL in the docs; however, the setacl.py script does not support this, and setting it through the REST API does not seem to work.

Via telnet:

PUT /acls/default HTTP/1.1
Content-Type: application/json
{"read": "True", "create": "True", "update": "True", "delete": "True", "readACL": "True", "updateACL": "True"}
HTTP/1.1 500 Internal Server Error
Content-Length: 1019
Date: Mon, 19 Jun 2017 10:05:15 GMT
Access-Control-Allow-Origin: *
Content-Type: text/plain
Server: TornadoServer/4.3

Traceback (most recent call last):
  File "server/app.py", line 966, in put
  File "/usr/local/lib/python3.5/site-packages/tornado/escape.py", line 93, in json_decode
    return json.loads(to_basestring(value))
  File "/usr/local/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/tornado/web.py", line 1443, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "server/app.py", line 968, in put
AttributeError: 'JSONDecodeError' object has no attribute 'message'
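A note on the two tracebacks above, offered as a sketch rather than a diagnosis of the server code. The first error, "Expecting value: line 1 column 1 (char 0)", is what Python's json module reports for an empty body, which would be consistent with a hand-typed request that carried no Content-Length header. The second error arises because Python 3 exceptions have no `.message` attribute; `str(err)` is the portable way to get the text. The helper names `describe_json_error` and `build_put_request` below are hypothetical and are not part of h5serv.

```python
import json


def describe_json_error(body):
    """Return a readable message for a body that fails to parse as JSON."""
    try:
        json.loads(body)
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        # Python 3 exceptions have no `.message`; str(err) works everywhere.
        return str(err)
    return "ok"


def build_put_request(host, path, payload):
    """Build raw HTTP/1.1 PUT bytes that include Content-Length and the
    blank line separating headers from the body."""
    body = json.dumps(payload).encode("utf-8")
    head = (
        "PUT {} HTTP/1.1\r\n"
        "Host: {}\r\n"
        "Content-Type: application/json\r\n"
        "Content-Length: {}\r\n"
        "\r\n"
    ).format(path, host, len(body)).encode("ascii")
    return head + body


# An empty body reproduces the message seen in the first traceback.
print(describe_json_error(""))
print(build_put_request("kermit:5000", "/acls/default", {"read": True}).decode())
```

Bytes built this way could be pasted into a telnet session or written to a socket, so the server sees a body of the declared length.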

Or via Postman (Chrome app), with auth set for the 'admin' user, sent directly to the root acls endpoint:

http://kermit:5000/acls
Content-Type:application/json
Authorization:Basic YWRtaW46YWRtaW4=
{"read": "True", "create": "True", "update": "True", "delete": "True", "readACL": "True", "updateACL": "True"}

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/tornado/web.py", line 1443, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "server/app.py", line 945, in put
  File "/usr/local/lib/python3.5/site-packages/tornado/escape.py", line 170, in url_unescape
    return unquote(to_basestring(value), encoding=encoding)
  File "/usr/local/lib/python3.5/urllib/parse.py", line 638, in unquote_plus
    string = string.replace('+', ' ')
AttributeError: 'NoneType' object has no attribute 'replace'

Or root group acls endpoint:

http://kermit:5000/groups/acfa8ae6-529c-11e7-b015-0242ac110002/acls
Content-Type:application/json
Authorization:Basic YWRtaW46YWRtaW4=
{"read": "True", "create": "True", "update": "True", "delete": "True", "readACL": "True", "updateACL": "True"}
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/tornado/web.py", line 1443, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "server/app.py", line 945, in put
  File "/usr/local/lib/python3.5/site-packages/tornado/escape.py", line 170, in url_unescape
    return unquote(to_basestring(value), encoding=encoding)
  File "/usr/local/lib/python3.5/urllib/parse.py", line 638, in unquote_plus
    string = string.replace('+', ' ')
AttributeError: 'NoneType' object has no attribute 'replace'

Thanks

jreadey commented 7 years ago

Regarding the issue with setting the root ACL, it looks like you have a JSON encoding error. Take a look at h5serv/test/integ/acltest.py. The setupAcls function is using the same permission string you have in your example and the request seems to work fine.

Regarding an ACL that manages all domains, I've taken a different approach with the HDF Server follow-on project, HSDS. I'd be interested to see if there is support for backporting it to h5serv.

Here's how HSDS deals with it:

There is a small set of "top-level domains" that are set up by the service administrator (they need access to the service instance to do this). For example: "/home". The top-level domain ACL will also be defined by the administrator. Anyone with "create" permission for the top-level ACL can create a sub-domain, e.g. "/home/mydata". By default the ACL for mydata is a copy of the parent ACL. Once created, the owner of the sub-domain can modify the ACL as they see fit (say, remove public read permission). Similarly, sub-sub-domains ("/home/mydata/foo") will inherit the permissions of their parent ("/home/mydata").
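The inheritance scheme described above can be illustrated with a toy model. This is a minimal sketch and not HSDS code: the `DomainTree` class, its method names, and the use of a "default" entry for public permissions are all assumptions made for illustration, and ownership checks are not modelled.

```python
import copy


class DomainTree:
    """Toy model of per-domain ACLs (hypothetical, not HSDS code)."""

    def __init__(self):
        self.acls = {}  # domain path -> {user name: {permission: bool}}

    def add_top_level(self, path, acl):
        # In the scheme described, only the service administrator does this.
        self.acls[path] = copy.deepcopy(acl)

    def create_subdomain(self, path, user):
        parent = path.rsplit("/", 1)[0]
        if parent not in self.acls:
            raise KeyError("parent domain does not exist: " + repr(parent))
        parent_acl = self.acls[parent]
        # Use the "default" entry when the user has no entry of their own.
        perms = parent_acl.get(user, parent_acl.get("default", {}))
        if not perms.get("create", False):
            raise PermissionError(user + " may not create under " + parent)
        # The new domain starts out with a copy of the parent's ACL.
        self.acls[path] = copy.deepcopy(parent_acl)


tree = DomainTree()
tree.add_top_level("/home", {"default": {"read": True, "create": True}})
tree.create_subdomain("/home/mydata", "alice")
# Remove public read permission on the sub-domain only.
tree.acls["/home/mydata"]["default"]["read"] = False
tree.create_subdomain("/home/mydata/foo", "alice")
```

Because each new domain gets a deep copy, later changes to "/home/mydata" leave "/home" untouched, while "/home/mydata/foo" starts from whatever its parent had at creation time.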

Also, in HSDS I've done away with .toc and instead have the concept of "folders". Folders are just domains without a root group. Using the REST API, you can list the contents of a folder (which will be a collection of folders and/or domain objects).

It would be great to keep the REST API compatible between h5serv and HSDS, but this would be a pretty big change for h5serv clients, so any feedback would be welcome.

Brief background on HSDS: It's designed as a scalable HDF Server (server can run as a cluster) and uses AWS S3 for persistent storage. As such h5serv will still be useful for those who wish to have a simple HDF server that can easily be run on a desktop while HSDS is more suited to cloud deployments.

swingingsimian commented 7 years ago

Hi John, thanks for getting back to me.

I think the json encoding error is a red herring related to telnet, as it validated through postman just fine.

I was already looking at the acltest.py script to try and get a handle on what the problem is. I was having problems with this, but can now get this to run successfully by omitting the --datapath option to server/app.py. (Hence the comment edit).

Can you just clarify something for me? In h5serv, is it possible to set a root ACL on the base domain (i.e. 'hdfgroup.org')? The scripts seem to be file focussed, as does acltest.py. What I want to do is set a read-only root ACL, to prevent anyone from being able to access the server without user authentication.

Thanks

swingingsimian commented 7 years ago

I see the acltest.py script uses the following URI for setting the default root ACL: req = self.endpoint + "/acls/default". I also tried this with my REST client, having converted the string boolean values to valid JSON bareword booleans. This seems to have got past the apparent JSON encoding issues, but I still get a server error in the response:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/tornado/web.py", line 1509, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "server/app.py", line 982, in put
tornado.web.HTTPError: HTTP 400: Bad Request: 'perm' not found in request body

and the log:

INFO:app.py:286::tocFilePath: ../data/.toc.h5
INFO:app.py:321::verifyFile: ../data/.toc.h5
INFO:fileUtil.py:266::verifyFile('../data/.toc.h5', False)
INFO:app.py:188::baseHandler, href: http://kermit:5000
INFO:app.py:194::REQ PUT http://kermit:5000/acls/default {remote_ip: 10.0.2.49}
INFO:app.py:981::Bad Request: 'perm' not found in request body
WARNING:tornado.access:400 PUT /acls/default (10.0.2.49) 4.96ms

On closer inspection of the acltest.py script, I see the JSON is nested in a hash with a single 'perm' key. This should probably be mentioned here:

http://h5serv.readthedocs.io/en/latest/AclOps/PUT_ACL.html

And the docs here about accessing the root acl are a bit ambiguous wrt actually including 'default' in the url:

http://h5serv.readthedocs.io/en/latest/AclOps/index.html#root-acl-inheritance

So to summarise, for those reading who might have the same problem, setting the default root ACL looks like this:

PUT /acls/default HTTP/1.1
Host: kermit:5000
Content-Type: application/json
Cache-Control: no-cache
Postman-Token: aa2a10c2-6216-84cd-5702-3bcbf434159a

{"perm": {"read": true, "create": false, "update": false, "delete": false, "readACL": false, "updateACL": false}}
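For anyone scripting this rather than using a REST client, the same request can be assembled with the Python standard library. This is a minimal sketch under stated assumptions: the function name `make_acl_request` and its `wrap_in_perm` flag are hypothetical (the flag is there for servers that expect the bare permission object), authentication is omitted, and nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request


def make_acl_request(endpoint, perms, wrap_in_perm=True):
    """Build (but do not send) a PUT request for the default root ACL."""
    payload = {"perm": perms} if wrap_in_perm else perms
    return urllib.request.Request(
        endpoint + "/acls/default",
        # json.dumps turns Python True/False into bareword true/false.
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


read_only = {
    "read": True, "create": False, "update": False,
    "delete": False, "readACL": False, "updateACL": False,
}
req = make_acl_request("http://kermit:5000", read_only)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```

Using Python booleans rather than the strings "True" and "False" matters here: a non-empty string such as "False" is truthy in Python and is not a JSON boolean.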

Thanks

jreadey commented 7 years ago

Are you using the master branch or develop? I made a change to the interface last February that removed the "perm" wrapper: https://github.com/HDFGroup/h5serv/commit/0552275a34b8e4d8c76f6ac254af35696779196f.

Anyway, I'd recommend using the dev branch at least until I can get a more recent merge to master out.

For the site-wide acl, would it be adequate to have a server-wide setting for default access?

Currently the behavior for any file that doesn't have a default ACL is defined here: https://github.com/HDFGroup/hdf5-json/blob/develop/h5json/hdf5db.py. I.e. the default is that anyone can do any action.

It should be easy enough to have a config value that defines what permissions should be applied in this case.
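A rough sketch of what such a fallback might look like is given below. It is not taken from h5serv or hdf5-json: the config key `default_acl` and the function `effective_default_acl` are hypothetical names, and the only behavior borrowed from the thread is that a file without a default ACL currently allows every action.

```python
PERMISSIONS = ("read", "create", "update", "delete", "readACL", "updateACL")

# Current behavior as described above: with no default ACL, anyone can do anything.
OPEN_ACL = {name: True for name in PERMISSIONS}


def effective_default_acl(file_acl, config):
    """Pick the ACL to apply when a file may or may not define a default ACL.

    file_acl: the file's own default ACL, or None if it has none.
    config:   server configuration dict; "default_acl" is a hypothetical key.
    """
    if file_acl is not None:
        return dict(file_acl)
    return dict(config.get("default_acl", OPEN_ACL))


# Example: a site-wide read-only default.
read_only = {name: name == "read" for name in PERMISSIONS}
print(effective_default_acl(None, {"default_acl": read_only}))
```

With no config value set, the function returns the open ACL, so existing deployments would keep their current behavior.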

jreadey commented 7 years ago

Nathan,

The “activate” is part of the Anaconda package, as described in the install doc: http://h5serv.readthedocs.io/en/latest/Installation/ServerSetup.html#installing-on-linux-mac-os-x.

There are other ways to manage Python packages, but Anaconda seems the most popular and easiest to deal with.  Using “activate” you can switch between different Python versions and different sets of dependent packages that each project may need.

The “acltest” is included in testall.py in the develop branch. Can you try again with the develop branch (and Anaconda if you are not using that), so we can get on the same page? All tests should be passing. If not, it’s likely due to some variation in the dependencies that we’ll want to examine.

John

From: Nathan Johnson notifications@github.com
Reply-To: HDFGroup/h5serv reply@reply.github.com
Date: Wednesday, June 21, 2017 at 2:12 AM
To: HDFGroup/h5serv h5serv@noreply.github.com
Cc: John Readey jreadey@hdfgroup.org, Mention mention@noreply.github.com
Subject: Re: [HDFGroup/h5serv] Site-wide ACL (#105)

I was already looking at the acltest.py script to try and get a handle on what the problem is, but alas it fails for me, so there is something fundamentally wrong with my setup. I stopped using the prebuilt Docker image as it was using v0.1.0 and rebuilt v0.2.0 from the Dockerfile in the git repo. I then ran the test suite, although I'm a bit puzzled by this line in the docs:

source activate h5serv (just: activate h5serv on Windows)

The only 'activate' script I can find in the container is:

/usr/local/lib/python3.5/venv/scripts/posix/activate

These are the errors I get from acltest.py and then testall.py (as this doesn't seem to run acltest.py):

python integ/acltest.py
FFFFFFFFF
FAIL: testAttributes (__main__.AclTest)

Traceback (most recent call last):
  File "integ/acltest.py", line 155, in testAttributes
    self.setupAcls()
  File "integ/acltest.py", line 55, in setupAcls
    self.assertTrue(helper.validateId(rootUUID))
AssertionError: False is not true

======================================================================
FAIL: testDataset (__main__.AclTest)

Traceback (most recent call last):
  File "integ/acltest.py", line 202, in testDataset
    self.setupAcls()
  File "integ/acltest.py", line 55, in setupAcls
    self.assertTrue(helper.validateId(rootUUID))
AssertionError: False is not true

======================================================================
FAIL: testDatatypes (__main__.AclTest)

Traceback (most recent call last):
  File "integ/acltest.py", line 311, in testDatatypes
    self.setupAcls()
  File "integ/acltest.py", line 55, in setupAcls
    self.assertTrue(helper.validateId(rootUUID))
AssertionError: False is not true

======================================================================
FAIL: testGetDomainAcls (__main__.AclTest)

Traceback (most recent call last):
  File "integ/acltest.py", line 100, in testGetDomainAcls
    self.setupAcls()
  File "integ/acltest.py", line 55, in setupAcls
    self.assertTrue(helper.validateId(rootUUID))
AssertionError: False is not true

======================================================================
FAIL: testGetDomainDefaultAcls (__main__.AclTest)

Traceback (most recent call last):
  File "integ/acltest.py", line 93, in testGetDomainDefaultAcls
    self.assertEqual(rsp.status_code, 200)
AssertionError: 403 != 200

======================================================================
FAIL: testGroups (__main__.AclTest)

Traceback (most recent call last):
  File "integ/acltest.py", line 370, in testGroups
    self.setupAcls()
  File "integ/acltest.py", line 55, in setupAcls
    self.assertTrue(helper.validateId(rootUUID))
AssertionError: False is not true

======================================================================
FAIL: testPutDomain (__main__.AclTest)

Traceback (most recent call last):
  File "integ/acltest.py", line 149, in testPutDomain
    self.assertEqual(rsp.status_code, 201)
AssertionError: 403 != 201

======================================================================
FAIL: testRoot (__main__.AclTest)

Traceback (most recent call last):
  File "integ/acltest.py", line 468, in testRoot
    self.setupAcls()
  File "integ/acltest.py", line 55, in setupAcls
    self.assertTrue(helper.validateId(rootUUID))
AssertionError: False is not true

======================================================================
FAIL: testValue (__main__.AclTest)

Traceback (most recent call last):
  File "integ/acltest.py", line 262, in testValue
    self.setupAcls()
  File "integ/acltest.py", line 55, in setupAcls
    self.assertTrue(helper.validateId(rootUUID))
AssertionError: False is not true

Ran 9 tests in 0.123s

FAILED (failures=9)

python testall.py
...
FAIL: testDelete (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 167, in testDelete
    self.assertEqual(rsp.status_code, 200)
AssertionError: 403 != 200

======================================================================
FAIL: testDeleteNotFound (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 182, in testDeleteNotFound
    self.assertEqual(rsp.status_code, 404)
AssertionError: 403 != 404

======================================================================
FAIL: testDeleteSubSubdomain (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 189, in testDeleteSubSubdomain
    self.assertEqual(rsp.status_code, 200)
AssertionError: 403 != 200

======================================================================
FAIL: testDomainWithSpaces (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 132, in testDomainWithSpaces
    self.assertEqual(rsp.status_code, 200)
AssertionError: 403 != 200

======================================================================
FAIL: testGetDomain (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 38, in testGetDomain
    self.assertEqual(rsp.status_code, 200)
AssertionError: 403 != 200

======================================================================
FAIL: testGetDomainWithDot (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 211, in testGetDomainWithDot
    self.assertEqual(rsp.status_code, 200)
AssertionError: 403 != 200

======================================================================
FAIL: testGetNotFound (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 79, in testGetNotFound
    self.assertEqual(rsp.status_code, 404)
AssertionError: 403 != 404

======================================================================
FAIL: testGetReadOnly (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 58, in testGetReadOnly
    self.assertEqual(rsp.status_code, 200)
AssertionError: 403 != 200

======================================================================
FAIL: testGetSubdomain (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 140, in testGetSubdomain
    self.assertEqual(rsp.status_code, 200)
AssertionError: 403 != 200

======================================================================
FAIL: testGetToc (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 69, in testGetToc
    self.assertEqual(rsp.status_code, 200)
AssertionError: 403 != 200

======================================================================
FAIL: testInvalidDomain (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 99, in testInvalidDomain
    self.assertEqual(rsp.status_code, 400)  # 400 == bad syntax
AssertionError: 403 != 400

======================================================================
FAIL: testPut (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 197, in testPut
    self.assertEqual(rsp.status_code, 201)
AssertionError: 403 != 201

======================================================================
FAIL: testPutNameWithDot (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 231, in testPutNameWithDot
    self.assertEqual(rsp.status_code, 201)
AssertionError: 403 != 201

======================================================================
FAIL: testPutSubSubdomain (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 156, in testPutSubSubdomain
    self.assertEqual(rsp.status_code, 201)
AssertionError: 403 != 201

======================================================================
FAIL: testPutSubdomain (__main__.RootTest)

Traceback (most recent call last):
  File "roottest.py", line 148, in testPutSubdomain
    self.assertEqual(rsp.status_code, 201)
AssertionError: 403 != 201

Ran 18 tests in 0.100s

FAILED (failures=15)


swingingsimian commented 7 years ago

Hi John

Originally I picked up the Docker image from Docker Hub, which is still on v0.1.0. Assuming an official release tag was the right thing to use, I rebuilt it from the GitHub Dockerfile at v0.2.0. I think I was suffering from a mismatch between the version in readthedocs and what I built. This is also why I am not using conda, as it is not present in the Dockerfile.

The test issue turned out to be due to a custom password_uri config. I think testall.py assumes the default password_uri is still being used. This caused main.RootTest to fail and it bailed out before it got to AclTest. Reverting back to the default password_uri resolved this.

I rebuilt using 'develop', which removes the 'perm' key requirement.

Thanks for the clarification

Nathan

jreadey commented 7 years ago

Ok, great. When I get time I'll do a merge to master and push out a new docker image.

Correct, testall.py assumes the default password. It would be useful to have a subset of tests that rely on just read access to verify deployments, but it would be a bit of work.
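As an illustration of the read-only subset idea above, a deployment check could probe a few GET endpoints and report anything that does not answer 200. This is only a sketch: `read_only_smoke_check` and the path list are assumptions, not part of the h5serv test suite, and the status lookup is injected as a callable so the logic can be exercised without a running server.

```python
# Hypothetical list of endpoints that should be readable on a healthy deployment.
READ_ONLY_PATHS = ("/", "/groups", "/datasets", "/datatypes")


def read_only_smoke_check(get_status, paths=READ_ONLY_PATHS):
    """Return {path: status} for every path that did not answer 200.

    get_status: callable taking a path and returning an HTTP status code.
    """
    failures = {}
    for path in paths:
        status = get_status(path)
        if status != 200:
            failures[path] = status
    return failures


# Stand-in for a real server: one endpoint is forbidden.
fake_server = {"/": 200, "/groups": 200, "/datasets": 403, "/datatypes": 200}
print(read_only_smoke_check(fake_server.get))
```

Against a live server, `get_status` would wrap an authenticated GET request and return its status code; because nothing is written, such a check would not depend on the default password or on create permissions.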

I'm closing this issue now. Let's use #115 to track the issue of blocking unauthorized domain creation.