kanglcn / moraine

Modern Radar Interferometry Environment; A simple, stupid InSAR postprocessing tool in big data era
https://kanglcn.github.io/moraine/
Other

GPUdirect #7

Open kanglcn opened 1 year ago

kanglcn commented 1 year ago

Use https://github.com/rapidsai/kvikio when it is mature.

beckernick commented 1 year ago

Hi @kanglcn ! I came across this issue due to the rapidsai/kvikio reference. I work on the RAPIDS team at NVIDIA.

If you're willing to share, I'd love to learn more about what potential KvikIO features and functionality would be important for your use cases.

kanglcn commented 1 year ago

Hi @beckernick. It looks like I can't install kvikio with Python 3.10.

mamba install -c rapidsai kvikio=23.02 

Looking for: ['kvikio=23.02']

conda-forge/linux-64                                        Using cache
conda-forge/noarch                                          Using cache
pkgs/main/linux-64                                            No change
pkgs/main/noarch                                              No change
pkgs/r/linux-64                                               No change
rapidsai/linux-64                                             No change
rapidsai/noarch                                               No change
pkgs/r/noarch                                                 No change

Pinned packages:
  - python 3.10.*

Could not solve for environment specs
The following packages are incompatible
└─ kvikio 23.02**  is installable with the potential options
   ├─ kvikio 23.02.00 would require
   │  └─ python >=3.8,<3.9.0a0 , which can be installed;
   └─ kvikio 23.02.00 would require
      └─ python >=3.9,<3.10.0a0 , which can be installed.
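
The solver output above lists two kvikio 23.02 builds, one for Python 3.8 and one for Python 3.9, so a `python 3.10.*` pin cannot be satisfied. A minimal sketch of that check follows; the helper names are hypothetical, and the pre-release upper bounds `3.9.0a0` and `3.10.0a0` are simplified to `3.9` and `3.10`. It does not reproduce how the conda/mamba solver actually works.

```python
# Python ranges copied from the solver output for kvikio 23.02.
# Assumption: the pre-release upper bounds (3.9.0a0, 3.10.0a0) are
# simplified to plain (major, minor) tuples.
KVIKIO_23_02_BUILDS = [
    ((3, 8), (3, 9)),   # python >=3.8,<3.9.0a0
    ((3, 9), (3, 10)),  # python >=3.9,<3.10.0a0
]


def parse_version(text):
    """Turn '3.10' or '3.9.16' into a (major, minor) tuple."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts[:2])


def is_installable(python_version, builds=KVIKIO_23_02_BUILDS):
    """Return True if any build's half-open range contains the version."""
    version = parse_version(python_version)
    return any(low <= version < high for low, high in builds)


if __name__ == "__main__":
    for candidate in ("3.8", "3.9", "3.10"):
        print(candidate, is_installable(candidate))
```

Comparing tuples rather than strings matters here: as strings, `"3.10"` sorts before `"3.9"`.
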
beckernick commented 1 year ago

We've just released v23.04, which includes Python 3.10 support. Would you be able to give that a test?

kanglcn commented 1 year ago

Thank you @beckernick ! I have successfully installed it.

I am working on satellite image processing. My current workflow is:

Data reading and writing are too slow, which is why I am looking into kvikio.

After installing it, I ran into a problem reading Zarr:

import zarr

rslc_path = '../../data/rslc.zarr'
rslc_zarr = zarr.open(rslc_path, mode='r')
rslc_cpu = rslc_zarr[:]  # read the whole array into host memory

The data is successfully loaded into memory. But when I use kvikio:

from kvikio.zarr import GDSStore

rslc_zarr = zarr.open(store=GDSStore('./rslc_gpu.zarr'), mode='r')
rslc_gpu = rslc_zarr[:]

I got an error:

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[37], line 1
----> 1 rslc_gpu = rslc_zarr[:]

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:821, in Array.__getitem__(self, selection)
    819     result = self.vindex[selection]
    820 else:
--> 821     result = self.get_basic_selection(pure_selection, fields=fields)
    822 return result

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:947, in Array.get_basic_selection(self, selection, out, fields)
    944     return self._get_basic_selection_zd(selection=selection, out=out,
    945                                         fields=fields)
    946 else:
--> 947     return self._get_basic_selection_nd(selection=selection, out=out,
    948                                         fields=fields)

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:990, in Array._get_basic_selection_nd(self, selection, out, fields)
    984 def _get_basic_selection_nd(self, selection, out=None, fields=None):
    985     # implementation of basic selection for array with at least one dimension
    986
    987     # setup indexer
    988     indexer = BasicIndexer(selection, self)
--> 990     return self._get_selection(indexer=indexer, out=out, fields=fields)

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:1285, in Array._get_selection(self, indexer, out, fields)
   1275 if (
   1276     not hasattr(self.chunk_store, "getitems") and not (
   1277         hasattr(self.chunk_store, "get_partial_values") and
   (...)
   1280 ) or any(map(lambda x: x == 0, self.shape)):
   1281     # sequentially get one key at a time from storage
   1282     for chunk_coords, chunk_selection, out_selection in indexer:
   1283
   1284         # load chunk selection into output array
-> 1285         self._chunk_getitem(chunk_coords, chunk_selection, out, out_selection,
   1286                             drop_axes=indexer.drop_axes, fields=fields)
   1287 else:
   1288     # allow storage to get multiple items at once
   1289     lchunk_coords, lchunk_selection, lout_selection = zip(*indexer)

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:2006, in Array._chunk_getitem(self, chunk_coords, chunk_selection, out, out_selection, drop_axes, fields)
   2003     out[out_selection] = fill_value
   2005 else:
-> 2006     self._process_chunk(out, cdata, chunk_selection, drop_axes,
   2007                         out_is_ndarray, fields, out_selection)

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:1959, in Array._process_chunk(self, out, cdata, chunk_selection, drop_axes, out_is_ndarray, fields, out_selection, partial_read_decode)
   1956     tmp = np.squeeze(tmp, axis=drop_axes)
   1958 # store selected data in output
-> 1959 out[out_selection] = tmp

File cupy/_core/core.pyx:1473, in cupy._core.core._ndarray_base.__array__()

TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly.
```

Can you please help me find out how to correctly use kvikio?

madsbk commented 1 year ago

Hi @kanglcn, the GPU array support in Zarr is still in development. We just merged the final piece, which will be included in the next Zarr release, v2.15.

We need to make KvikIO use this new Zarr feature, and then everything should hopefully just work :)

Are you using compression? Zarr only comes with CPU compressors, but we plan to implement GPU compression using nvCOMP.

kanglcn commented 1 year ago

Thanks @madsbk for letting me know! I am not using any compression at the moment, but I will definitely try it if it can help speed up the IO.

Would you consider adding support for dask, i.e. dask.array.to_zarr and dask.array.from_zarr? I am scaling my code with dask, so if that is planned, it would be very helpful to me!

madsbk commented 1 year ago

Yes, the plan is to support dask.