equinor / segyio

Fast Python library for SEGY files.

trace header Field.buf always big endian? #559

Closed anthonytorlucci closed 11 months ago

anthonytorlucci commented 11 months ago

Is the raw bytes buffer (Field.buf) always big endian, regardless of the byte order of the input SEG-Y data?

I've created an example segy_header_buf to illustrate the issue.

The example below works as expected when the input file is big endian.

import struct

import segyio

# --- segyio read back
# sgy_file_out is the path of the SEG-Y file written earlier in the example
with segyio.open(sgy_file_out, mode='r', ignore_geometry=True, endian='big') as segy_handle:

    # Memory map file for faster reading (especially if file is big...)
    # segy_handle.mmap()

    start = 0
    stop = 5
    block_headers = [segy_handle.header[trc_idx] for trc_idx in range(start, stop)]
    for hdr in block_headers[:3]:
        # header words at byte positions 5 and 9 (1-based)
        print(hdr[5], hdr[9])

    tmp = block_headers[0]
    print(type(tmp))  # <class 'segyio.field.Field'>
    # print(tmp)
    tmp_buf = tmp.buf  # 240 bytes trace header
    print(len(tmp_buf))
    # byte position 5 (1-based) is offset 4 (0-based) in the buffer
    print(struct.unpack_from('>i', tmp_buf, offset=4))
    # signed=True so that this matches the signed '>i' format above
    print(int.from_bytes(tmp_buf[4:8], byteorder='big', signed=True))

However, when the input SEG-Y file is little endian (opened with endian='little', and with the byte order in the unpacking switched from big to little), the output is not correct.

anthonytorlucci commented 11 months ago

Excerpt from discussion on Slack

segyio always exposes the buffer as big endian, so consumers can read it without checking whether they need to swap. It's on purpose, to support custom header parsing, but I think it's a bad design, because if you do custom parsing it is likely your fields do not align with the spec.
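
To illustrate what the excerpt implies for custom parsing, here is a minimal sketch that does not use segyio at all. It assumes, as the excerpt states, that the buffer is big endian whatever the byte order of the file on disk. The 240-byte header_buf, the values 7 and 42, and the read_int32 helper are made up for illustration and are not part of the segyio API.

```python
import struct

# Hypothetical stand-in for Field.buf: a 240-byte trace header laid out
# big endian, which is how the excerpt says segyio always exposes it.
header_buf = bytearray(240)
struct.pack_into('>i', header_buf, 4, 7)   # byte positions 5-8 (1-based)
struct.pack_into('>i', header_buf, 8, 42)  # byte positions 9-12 (1-based)


def read_int32(buf, byte_pos):
    """Read a signed 32-bit field at a 1-based byte position, as big endian."""
    return struct.unpack_from('>i', buf, byte_pos - 1)[0]


# Always unpack with '>' because the buffer itself is big endian,
# even if the file was opened with endian='little'.
trace_seq = read_int32(header_buf, 5)
field_record = read_int32(header_buf, 9)

# Unpacking the same bytes as little endian reverses them and gives
# a different, wrong value.
wrong = struct.unpack_from('<i', header_buf, 4)[0]

print(trace_seq, field_record, wrong)
```

Under that assumption, the unpacking code from the big-endian example should work unchanged for a little-endian file: only the endian argument passed to segyio.open differs, not the format character used on buf.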