HDFGroup / h5pyd

h5py distributed - Python client library for HDF Rest API

handle 413 errors in point selection #48

Open mikejiang opened 6 years ago

mikejiang commented 6 years ago
coords[1:3]
Out[100]: [(441, 82852), (441, 88209)]
len(coords)
Out[101]: 2500

data = ds_remote[coords]
Traceback (most recent call last):
  File "/home/wjiang2/.local/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-99-6ac77acf88d2>", line 1, in <module>
    data = ds_remote[coords]
  File "/home/wjiang2/.local/lib/python3.6/site-packages/h5pyd/_hl/dataset.py", line 848, in __getitem__
    rsp = self.POST(req, body=body)
  File "/home/wjiang2/.local/lib/python3.6/site-packages/h5pyd/_hl/base.py", line 477, in POST
    raise IOError(rsp.reason)
OSError: Request Entity Too Large
mikejiang commented 6 years ago

Mike Jiang
[8:38] 
`coords` is a list of tuples for 2.5k points; this occurred while I was trying to implement my own `fancy slicing` (`ds[idx1, idx2]`) by translating it into coordinate-based `point selection`
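(For context, a minimal sketch of the translation described above. The Cartesian-product interpretation of `ds[idx1, idx2]` is inferred from the sample `coords` shown earlier; the file path, dataset name, and index values are placeholders, not taken from the issue.)

```python
import itertools

import h5pyd

# Hypothetical index lists along each axis of a 2-D dataset.
idx1 = [441, 450, 460]       # row indices
idx2 = [82852, 88209]        # column indices

f = h5pyd.File("/shared/sample.h5", "r")   # placeholder domain path
ds_remote = f["data"]                      # placeholder dataset name

# Translate the fancy slice ds[idx1, idx2] into an explicit point
# selection: one (row, col) tuple per point of the Cartesian product.
coords = list(itertools.product(idx1, idx2))

data = ds_remote[coords]   # fails with 413 once the selection grows too large
```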

jreadey [8:43 PM] 
Right, the server gives a 413 error when a selection hits too many chunks.

[8:44] 
Basically the server wants to ensure that it can respond to the request within a reasonable time.

[8:45] 
Would it be possible to restructure your query so that it consists of multiple requests where each request hits fewer chunks?
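(A minimal sketch of that workaround, assuming the client splits the point list itself. The helper name and `batch_size` value are illustrative choices, and batching by point count is only a rough proxy for how many chunks a request touches.)

```python
import numpy as np

def read_points_batched(ds, coords, batch_size=500):
    """Issue a point selection as several smaller requests.

    batch_size is a tuning knob, not a server-defined limit; shrink it
    until the server stops answering with 413.
    """
    pieces = []
    for start in range(0, len(coords), batch_size):
        # Each slice of coords becomes its own request to the server.
        pieces.append(ds[coords[start:start + batch_size]])
    return np.concatenate(pieces)

# e.g. the 2,500-point coords above would become five 500-point requests:
# data = read_points_batched(ds_remote, coords)
```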

[8:46] 
In h5pyd I do this automatically for hyperslab selections.  See: https://github.com/HDFGroup/h5pyd/blob/master/h5pyd/_hl/dataset.py lines 718-800.
[8:53] 
Shouldn't be a problem to add the same kind of logic in h5pyd for point selection as exists now for hyperslab selection.
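(A rough sketch of what that per-chunk splitting could look like for point selections, mirroring the hyperslab logic linked above. This is not h5pyd's actual implementation: the function name is made up, and `chunk_shape` is passed in explicitly here, whereas inside the library it would come from the dataset's chunk layout.)

```python
from collections import defaultdict

import numpy as np

def read_points_by_chunk(ds, coords, chunk_shape):
    """Group points by the chunk they fall in and fetch one group per request."""
    groups = defaultdict(list)
    for i, pt in enumerate(coords):
        # Integer-divide each coordinate by the chunk extent to find
        # which chunk this point lands in.
        key = tuple(p // c for p, c in zip(pt, chunk_shape))
        groups[key].append(i)

    out = np.empty(len(coords), dtype=ds.dtype)
    for indices in groups.values():
        batch = [coords[i] for i in indices]
        out[indices] = ds[batch]  # one request per chunk's worth of points
    return out
```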