trhallam / segysak

SEGY Swiss Army Knife for Seismic Data
https://trhallam.github.io/segysak/
GNU General Public License v3.0

Segysak 0.5 cannot find engine with xr.open_dataset #130

Closed khaled1240274 closed 4 months ago

khaled1240274 commented 4 months ago

I am facing the following error:

ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4', 'h5netcdf', 'scipy', 'pydap', 'sgy_engine', 'zarr']. Consider explicitly selecting one of the installed engines via the engine parameter, or installing additional IO dependencies, see: https://docs.xarray.dev/en/stable/getting-started-guide/installing.html https://docs.xarray.dev/en/stable/user-guide/io.html

When running the following code:

    import pathlib
    import xarray as xr

    segy_file = pathlib.Path("NORMAL_POLARITY.segy")
    ds = xr.open_dataset(
        segy_file,
        dim_byte_fields={'iline': 189, 'xline': 193},
        extra_byte_fields={'cdp_x': 181, 'cdp_y': 185},
    )

I am using segysak 0.5 and xarray 2024.3.0

trhallam commented 4 months ago

Hi, xarray chooses a backend by looking at the file extension of the input. SEGY-SAK recognises .sgy and .segy. Can you please check that the segy_file argument to open_dataset has one of these extensions? If it does not, you must provide the engine manually:

    xr.open_dataset(
        segy_file,
        engine='sgy_engine',
        dim_byte_fields={'iline': 189, 'xline': 193},
        extra_byte_fields={'cdp_x': 181, 'cdp_y': 185},
    )

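
The advice above can be wrapped in a small helper so the engine is never left to automatic detection. This is only a sketch: `has_segy_suffix` and `open_segy` are hypothetical names, not part of SEGY-SAK or xarray, and the case-sensitive suffix comparison is an assumption, since the thread does not say how upper-case extensions are treated.

```python
from pathlib import Path

# Extensions named in the reply above; exact-case matching is an assumption.
SEGY_SUFFIXES = {".sgy", ".segy"}


def has_segy_suffix(path):
    """Return True if the file name ends in one of the listed extensions."""
    return Path(path).suffix in SEGY_SUFFIXES


def open_segy(path, **kwargs):
    """Open a SEG-Y file with the engine selected explicitly.

    Hypothetical wrapper: it forwards keyword arguments such as
    dim_byte_fields and extra_byte_fields unchanged to xr.open_dataset.
    """
    import xarray as xr  # imported here so the suffix check works without xarray

    return xr.open_dataset(Path(path), engine="sgy_engine", **kwargs)
```

For example, `open_segy("NORMAL_POLARITY.segy", dim_byte_fields={'iline': 189, 'xline': 193})` would make the same call as the snippet above without relying on backend guessing.
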
khaled1240274 commented 4 months ago

Thanks a lot. After explicitly selecting engine='sgy_engine', everything worked fine.

khaled1240274 commented 4 months ago

The new version 0.5 reads the cube much faster than 0.4. For the same cube, version 0.4 used to take about 17 minutes, while version 0.5 takes less than 30 seconds.

trhallam commented 4 months ago

> The new version 0.5 reads the cube much faster than 0.4. For the same cube, version 0.4 used to take about 17 minutes, while version 0.5 takes less than 30 seconds.

This is because the new engine no longer loads the seismic data greedily, and the header scan has been optimised. You may still see a delay if you try to access the entire volume at once, because that is when it is loaded into memory.
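
One way to benefit from the lazy behaviour described here is to select a subset before loading it. The sketch below rests on assumptions: the variable name `data` is a guess at how the dataset stores the traces, `load_inline` is a hypothetical helper, and `iline` is the dimension name taken from the dim_byte_fields used earlier in the thread.

```python
def load_inline(ds, index, var="data"):
    """Load a single inline of `var` into memory.

    `ds` is expected to be a dataset opened as shown earlier in the thread.
    isel() selects by integer position along the assumed "iline" dimension,
    and load() then reads only that selection into memory.
    """
    return ds[var].isel(iline=index).load()
```

Calling `load_inline(ds, 0)` should then read only the first inline, rather than triggering a read of the whole cube as `ds.load()` would.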