equinor / segyio

Fast Python library for SEGY files.

trace header Field.buf always big endian? #559

Closed anthonytorlucci closed 1 year ago

anthonytorlucci commented 1 year ago

Is the raw byte buffer, buf, always big endian regardless of the byte order of the input SEG-Y data?

I've created an example segy_header_buf to illustrate the issue.

The example below works as expected when the input is big endian byteorder.

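import struct

import segyio

# sgy_file_out: path to the SEG-Y file being read (defined elsewhere in the example)

# --- segyio read back
with segyio.open(sgy_file_out, mode='r', ignore_geometry=True, endian='big') as segy_handle: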

    # Memory map file for faster reading (especially if file is big...)
    # segy_handle.mmap()

    start = 0
    stop = 5
    block_headers = [segy_handle.header[trc_idx] for trc_idx in range(start, stop)]
    for hdr in block_headers[:3]:
        print(hdr[5], hdr[9])

    tmp = block_headers[0]
    print(type(tmp))  # <class 'segyio.field.Field'>
    # print(tmp)
    tmp_buf = tmp.buf  # 240 bytes trace header
    print(len(tmp_buf))
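    # Byte offset 4 covers header bytes 5-8, the same field as hdr[5].
    print(struct.unpack_from('>i', tmp_buf, offset=4))
    # signed=True matches the signed '>i' format used above.
    print(int.from_bytes(tmp_buf[4:8], byteorder='big', signed=True))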

However, when the input SEG-Y file is little endian (and the byte-order specifiers in the unpacking are switched from '>' and 'big' to '<' and 'little'), the output is not correct.
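A minimal sketch of what such a mismatch looks like, assuming (as the issue title suggests) that the buffer holds big-endian bytes even when the file itself is little endian. The 4-byte value is synthetic and is not read from a real file.

```python
import struct

# Synthetic 4-byte field holding the value 1 in big-endian order.
raw = struct.pack('>i', 1)

# Decoding with the matching byte order recovers the value.
as_big = struct.unpack('>i', raw)[0]

# Decoding the same bytes as little endian reverses their significance.
as_little = struct.unpack('<i', raw)[0]

print(as_big)     # 1
print(as_little)  # 16777216
```

A small header value turning into a very large one like this is a typical sign of a byte-order mismatch.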

anthonytorlucci commented 1 year ago

Excerpt from discussion on Slack

segyio always exposes the buffer as big-endian, so consumers can read it without checking whether they need to swap. It's on purpose, to support custom header parsing, but I think it's a bad design, because if you do custom parsing it is likely your fields do not align with the spec.
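Taking the excerpt at face value, a custom parser can decode buf as big-endian no matter which endian value was passed to segyio.open. A minimal sketch follows. The helper read_int32 and the synthetic 240-byte buffer are hypothetical stand-ins for illustration and are not part of segyio.

```python
import struct

TRACE_HEADER_SIZE = 240  # bytes in a SEG-Y trace header


def read_int32(buf, byte_pos):
    """Decode a signed 32-bit field at a 1-based SEG-Y byte position.

    Assumes the buffer is big-endian, as described in the Slack excerpt.
    """
    if not 1 <= byte_pos <= len(buf) - 3:
        raise ValueError(f"byte position {byte_pos} is out of range")
    return struct.unpack_from('>i', buf, byte_pos - 1)[0]


# Synthetic stand-in for a trace header buffer (not read from a file).
fake_buf = bytearray(TRACE_HEADER_SIZE)
struct.pack_into('>i', fake_buf, 4, 7)    # header bytes 5-8
struct.pack_into('>i', fake_buf, 8, -3)   # header bytes 9-12

print(read_int32(fake_buf, 5), read_int32(fake_buf, 9))
```

With a real file, fake_buf would be replaced by the buf of a header taken from segy_handle.header, as in the snippet in the first comment. The same decoding would then apply to files opened with either endian='big' or endian='little', provided the behaviour described in the excerpt holds.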