sixty-north / segpy

A Python package for reading and writing SEG Y files.

Cannot read Seisware SEG-Y #53

Open rob-smallshire opened 6 years ago

rob-smallshire commented 6 years ago

This issue received by email from a geophysicist in Calgary:

I am a geophysicist out of Calgary and I use SeisWare. For some reason, I cannot seem to load seisware segy. I was hoping you might be able to point me in the right direction.

image

rob-smallshire commented 6 years ago

Be aware that the example programs aren't ported to, or tested on, Windows, which is why you get two errors: one caused by the SeisWare file problem and another because the program expects to be running on Linux or macOS. The latter is trivial to fix: just don't use the os.EX_* constants. See https://github.com/sixty-north/segpy/issues/17
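As a rough illustration of avoiding a hard dependency on the os.EX_* constants, here is a minimal sketch. The helper name exit_code_for is hypothetical and is not part of Segpy or its example programs; the numeric fallbacks are the conventional sysexits.h values.

```python
import os

# The os.EX_* constants exist only on Unix-like platforms, so fall back to
# the conventional sysexits.h values when they are missing (e.g. on Windows).
EX_OK = getattr(os, "EX_OK", 0)
EX_DATAERR = getattr(os, "EX_DATAERR", 65)
EX_NOINPUT = getattr(os, "EX_NOINPUT", 66)


def exit_code_for(error):
    """Map an exception (or None for success) to a process exit code.

    Hypothetical helper for illustration only.
    """
    if error is None:
        return EX_OK
    if isinstance(error, FileNotFoundError):
        return EX_NOINPUT
    return EX_DATAERR
```

A script would then finish with something like sys.exit(exit_code_for(err)) instead of referring to os.EX_DATAERR directly.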

rob-smallshire commented 6 years ago

From the user:

I attached 2 files, one is in seisware segy and the other I exported from seisware as generic segy.

seisware_segy.zip

rob-smallshire commented 6 years ago

I took a look at your files.

First of all, they appear to be in little-endian format (though I might be wrong about this). The example code you were using is hard-wired to big-endian (the SEG Y standard) so you need to pass the correct endian parameter to the call to create_reader().

Second, the reason Segpy complains about unknown SEG Y Revision 29559 is that this is the revision recorded in your file.

According to the SEG Y standard, bytes 3501-3502 contain the revision number (note these byte offsets are using Fortran-style 1-based indexing).

screen shot 2017-08-02 at 12 33 24

In the file you sent me (different to the one above), the bytes at offset 3500-3501 (using 0-based indexing) contain 0x7465 in hexadecimal which is either 29797 in big-endian decimal or 25972 in little-endian decimal:

screen shot 2017-08-02 at 12 35 50

Neither of these is a legitimate SEG Y revision number. As you can see from the standard, legitimate values are 0x0100 for SEG Y Revision 1 or 0x0000 for SEG Y revision 0. I can't make sense of the 0x7465 bytes by interpreting them as EBCDIC or ASCII either, so I'm not sure what Seisware is using this mandatory field for. A question for them, perhaps?
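The arithmetic above can be checked with Python's standard struct module. This is only a sketch of the byte-order interpretation and does not use Segpy at all:

```python
import struct

# The two bytes found in the revision field of the file.
raw = bytes([0x74, 0x65])

# ">H" is a big-endian unsigned 16-bit integer, "<H" is little-endian.
(big,) = struct.unpack(">H", raw)
(little,) = struct.unpack("<H", raw)

# For comparison, the value the standard allows for SEG Y Revision 1.
(rev1,) = struct.unpack(">H", bytes([0x01, 0x00]))

print(big, little, rev1)  # 29797 25972 256
```

Whichever byte order is assumed, the value from the file is nowhere near 0x0000 or 0x0100.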

rob-smallshire commented 6 years ago

From the user:

I am not sure why they made the changes to the SEGY standard but it is what users of Seisware are stuck with.

By endian do you mean IBM vs IEEE format? Can you give me an example of how to add this to the create_reader() call?

Is the 3501 reference to the file or trace header position?

I know they also put one line of ascii to the bottom of the ebcdic header, again don't know why.

rob-smallshire commented 6 years ago

Endian refers to the byte order within multi-byte integers: whether the most-significant byte or the least-significant byte comes first (see https://en.wikipedia.org/wiki/Endianness). The endianness of SEG-Y files is supposed to be big-endian (most significant first), at least in SEG Y revisions 0 and 1. Endianness is a separate issue from the floating-point format, which is IBM or IEEE.

You can modify the example to expect little-endian data, like this:

segy_reader = create_reader(in_file, endian='<')

but be aware that it probably still won't work because of the odd value in the SEG-Y revision number field of the binary header.

The 3501 is the byte offset in the file. As you can see in the attached cheat-sheet, this offset falls squarely in the middle of the binary file header: https://dl.dropboxusercontent.com/u/14965965/SEGY_revisions.pdf
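To inspect that field in a file without going through Segpy, something like the following sketch could be used. The function name read_revision_field is hypothetical; it relies only on the offsets discussed in this thread (a 3200-byte textual header followed by the binary header, with the revision field at 0-based file offset 3500).

```python
import struct

TEXTUAL_HEADER_BYTES = 3200  # the binary file header starts after this
REVISION_OFFSET = 3500       # 0-based; byte 3501 in the standard's 1-based numbering


def read_revision_field(path):
    """Return (raw bytes, big-endian value, little-endian value).

    Hypothetical helper for illustration; not part of Segpy.
    """
    with open(path, "rb") as f:
        f.seek(REVISION_OFFSET)
        raw = f.read(2)
    if len(raw) != 2:
        raise ValueError("file is too short to contain a binary file header")
    (big,) = struct.unpack(">H", raw)
    (little,) = struct.unpack("<H", raw)
    return raw, big, little
```

The field sits 300 bytes into the binary header (3500 - 3200), which is why it is a file offset rather than a trace header position.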

I recommend you ask the Seisware folks to explain why their files are the way they are.

Segpy is engineered very much as a kit of parts, and although it tries to do the right thing by default, this doesn't work when vendors stray too far from the standard, as appears to have happened here. It will be possible to use some of Segpy's lower-level components to construct a reader which works with Seisware data, and this is something we could undertake on a contract basis if you don't have the capabilities in-house to work with the open source code.

rob-smallshire commented 6 years ago

One potential solution to this is to allow the client to specify a custom binary file header format to the create_reader() call, in much the same way as we allow them to specify a custom trace header format. Essentially this would entail adding a binary_file_header_format argument to create_reader().

Another approach would be to just make a create_seisware_reader() function which utilises the lower-level toolkit functions.

Either way, it's a non-trivial amount of investigative and/or engineering work to build this capability to support Seisware SEG-Y.

rob-smallshire commented 6 years ago

Intriguingly, the bytes starting at 0-based offset 3500, where you would expect to find the SEG-Y version number, are:

74 65 73 74 53 57

which decode in ASCII as:

 testSW

which I presume means "test software".
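The decoding is easy to reproduce with plain Python. As a side observation (my own arithmetic, not something confirmed by Seisware), the 29559 value reported for the first file corresponds to the bytes 0x73 0x77, which also read as ASCII text:

```python
# The six bytes found starting at 0-based offset 3500.
raw = bytes.fromhex("74 65 73 74 53 57")
text = raw.decode("ascii")

# The "revision" 29559 reported for the other file, written back out as
# two big-endian bytes and decoded the same way.
other = (29559).to_bytes(2, "big").decode("ascii")

print(text, other)  # testSW sw
```

So in both files the revision field appears to hold ASCII characters rather than a number.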

abingham commented 6 years ago

which I presume means "test software".

Or "test seisware". This is quite a rabbit hole!