HDFGroup / h5serv

Reference service implementation of the HDF5 REST API

Docs - How do domains map to actual HDF5 files should be addressed. #94

Closed sgpinkus closed 8 years ago

sgpinkus commented 8 years ago

Hi, I've run through the installation and got the server running. My datapath is set to ../data/, and in there I can see:

$ tree -L 2 data/
data/
├── home
│   ├── test_user1
│   └── test_user2
├── public
│   └── tall.h5
├── readme.txt
└── test
    ├── array_attr.h5
     ...
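
For context, a minimal sketch of where these settings live, assuming a default h5serv install; the key names and values below are assumptions taken from a typical server/config.py and may not match every version:

# Sketch of the h5serv settings referenced above (typically server/config.py).
# Key names and defaults here are assumptions from a default install and may
# differ between versions.
cfg = {
    'port': 5000,              # port the REST server listens on
    'datapath': '../data/',    # directory scanned for HDF5 files
    'hdf5_ext': '.h5',         # extension used to recognize HDF5 files
    'domain': 'hdfgroup.org',  # base DNS domain appended to host names
}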

So now I'm trying to figure out which "domain" will get me to "tall.h5". I put some dots together and tried http://10.1.2.37:5000/?host=tall.data.hdfgroup.org but that didn't work (404).

The mapping between the REST API and actual HDF5 files is not explained in the docs. It's pretty clear the authors are trying to avoid tying the API semantics to the HDF5 file format, but besides causing issues like the above, it makes things confusing for people who already know HDF5.

Aside: it's interesting that "domains" was chosen as the abstraction for representing what amounts to different HDF5 files (unless I'm mistaken). Initially the group jumps out as the obvious abstraction, so all resources would just fall under one virtual group. Then you have issues like finding the boundary that corresponds to an actual HDF5 file, I guess, but you could flag that with a virtual attribute or some such.

Cheers,

Sam.

sgpinkus commented 8 years ago

I figured out that if I copy files directly into the data dir, h5serv will detect them and they become available at host=<h5name>.<domain>. It would still be great to document this, since I'm still figuring out things like whether h5serv can reach into those subdirs..
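
To make that mapping concrete, here is a minimal sketch, assuming h5serv is listening on 127.0.0.1:5000 with the default base domain hdfgroup.org and a hypothetical file copied to the top of the data dir as mydata.h5 (so its host name would be mydata.hdfgroup.org):

import requests

# Assumptions: h5serv on 127.0.0.1:5000, base domain 'hdfgroup.org', and a
# hypothetical file copied to <datapath>/mydata.h5.
base_url = "http://127.0.0.1:5000"
domain = "mydata.hdfgroup.org"  # <h5name>.<base domain>

# Fetch the domain root; a 200 response whose JSON includes the root group
# UUID (the "root" field in the HDF5 REST API) means h5serv picked up the file.
resp = requests.get(base_url + "/", params={"host": domain})
resp.raise_for_status()
print(resp.json().get("root"))

Files in subdirectories seem to pick up extra domain components in reverse path order (e.g. data/test/array_attr.h5 would be served as array_attr.test.hdfgroup.org), but treat that as an assumption and check the docs referenced in the next comment.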

jreadey commented 8 years ago

I've updated the docs to make this clearer. Take a look at the docs here:

You are correct that we didn't want to tie the API to the specifics of how files are organized on disk. Future implementations of the HDF REST API may use alternative storage mechanisms such as object-based storage, distributed file servers, or database-backed systems.

It's an interesting idea to consider all the data hosted by the service as one large virtual file, but there are a few cases where this breaks down. For example, external links include both a domain and an HDF5 path. Also, the current implementation makes it trivial to import additional files into the service (just copy the file into the data directory).
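
As a hedged illustration of the external-link point: the sketch below lists a group's links over the REST API and prints the target domain and path of any external links. The host name, the group UUID placeholder, and the JSON field names are assumptions based on my reading of the HDF5 REST API spec, not anything confirmed in this thread:

import requests

# Assumptions: server on 127.0.0.1:5000, data/public/tall.h5 served as
# tall.public.hdfgroup.org, and link objects carrying "class", "h5domain",
# and "h5path" fields for external links.
base_url = "http://127.0.0.1:5000"
domain = "tall.public.hdfgroup.org"
group_id = "<root-group-uuid>"  # e.g. the "root" value returned by GET /?host=...

resp = requests.get(base_url + "/groups/" + group_id + "/links",
                    params={"host": domain})
resp.raise_for_status()
for link in resp.json().get("links", []):
    if link.get("class") == "H5L_TYPE_EXTERNAL":
        # An external link crosses a file (domain) boundary, so it has to name
        # both the target domain and the HDF5 path inside that domain; this is
        # why a single "one big virtual file" view breaks down here.
        print(link.get("h5domain"), link.get("h5path"))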

sgpinkus commented 8 years ago

Hi, yes the doc updates help sort it out. Thanks.