WIP: Fix bugs in saving new PDS3 file

percurnicus commented 6 years ago

Fixes #71

Arrays are not saved correctly, so the data is not what is expected when a newly saved image is reopened. This PR aims to fix that bug.

percurnicus commented 6 years ago

(Not sure why those commits carried over, but they can be disregarded.) See the note on tofile; I think it explains why we are having a problem.
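For illustration, here is a minimal sketch of the behavior that note describes. It is not code from this PR; the array values and the file name are made up. tofile writes the raw bytes of the array in whatever byte order the array uses and records nothing about it, so a reader that assumes a different byte order gets different numbers.

```python
import os
import tempfile

import numpy as np

# Made-up sample data stored as little-endian unsigned 16-bit integers
data = np.array([1, 256, 65535], dtype='<u2')

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'raw.img')  # hypothetical file name
    # tofile dumps the raw bytes only; dtype and byte order are not stored
    data.tofile(path)
    # Reading with the matching dtype recovers the original values
    same = np.fromfile(path, dtype='<u2')
    # Reading with the opposite byte order silently yields other values
    swapped = np.fromfile(path, dtype='>u2')

print(same.tolist())
print(swapped.tolist())
```

This is why the label has to describe the byte order the array actually uses when it is written.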

percurnicus commented 5 years ago

The following code should fix the problem:

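# Module-level imports this snippet relies on
import gzip
import os
import shutil
import sys
import tempfile

import numpy as np
import pvl

# Set label information about the image so it can be opened properly.
# The label tells other software where the label ends, where the actual
# image starts, and the data type of the image.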

# We need the byte order the image is stored in
# (little-endian vs. big-endian)
byteorder = self.image.dtype.byteorder
if byteorder == '=':
    if sys.byteorder == 'little':
        byteorder = '<'
    else:
        byteorder = '>'

# See planetaryimage/pds3image.py for the DTYPES dictionary.
# SAMPLE_TYPE will be 'MSB_INTEGER' if the byte order is '>'
# (big-endian) or 'LSB_INTEGER' if the byte order is '<'
# (little-endian).
# NOTE: the 'i' kind is hard-coded below, so this assumes signed
# integer data.
self.label['IMAGE']['SAMPLE_TYPE'] = self.DTYPES[byteorder + 'i']

# TODO: Determine SAMPLE_BITS.
# Probably just self.image.dtype.itemsize * 8

# Determine label padding. See https://go.nasa.gov/2MSA4ae (Chapter 5
# of the PDS3 Standards Reference) for a more in depth explanation of
# these fields.

# Get the total bytes of the label which is needed to determine the
# total number of records the label needs
label_bytes = len(pvl.dumps(self.label))

# Use the record bytes from the label - that should remain the same
# "The RECORD_BYTES data element identifies the number of bytes in each
# physical record in the data product file"
record_bytes = self.label['RECORD_BYTES']

# "The LABEL_RECORDS data element identifies the number of physical
# records that make up the PDS product label"
# We take the ceiling of label bytes / record bytes so that the label
# occupies a whole number of records.
label_records = int(np.ceil(label_bytes / float(record_bytes)))
self.label['LABEL_RECORDS'] = label_records

# ^IMAGE is a pointer that states which record number the actual image
# starts. "Data object pointers in attached labels take one of two
# forms:  ^<object_identifier> = nnn where nnn represents the starting
# record number within the file (first record is numbered 1), or
# ^<object_identifier> = nnn <BYTES> where nnn represents the starting
# byte location within the file (first byte is numbered 1)". Here we
# use record numbers
self.label['^IMAGE'] = label_records + 1

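# Changing the above information can change the total number of bytes
# of the label (the new values may serialize to a different number of
# characters), so we have to get the label bytes again.
# TODO: Handle the edge case where the updated label no longer fits in
# label_records records; label_padding would then be negative and the
# image would not start where ^IMAGE says.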
label_bytes = len(pvl.dumps(self.label))

# We need to pad the label with newlines so the image will actually
# start at the record given in the label above
label_padding = int(label_records * record_bytes - label_bytes)

# Save the image with the attached label.
# For a non-gzipped image we:
#     1) Write the label out using pvl.dump
#     2) Pad the label so the image starts where the label says
#     3) Write the image data out
# For gzipped images we:
#     1) Perform the steps for a non-gzipped image on a temporary file
#     2) Gzip compress that file to the given output
#     3) Remove the temporary file
# Ideally we would not create a temporary file. My attempts to write to
# an opened gzipped file directly have failed.

outpath = os.path.join(outdir_name, outfile)

if nogzip:
    with open(outpath, 'wb') as stream:
        pvl.dump(self.label, stream)
        stream.write(b'\n' * label_padding)
        self.image.tofile(stream)
else:
    temp = None
    try:
        # mkstemp returns an open file descriptor; close it so it is
        # not leaked, then reopen the file by name
        fd, temp = tempfile.mkstemp(suffix='.img')
        os.close(fd)
        with open(temp, 'wb') as f:
            pvl.dump(self.label, f)
            f.write(b'\n' * label_padding)
            self.image.tofile(f)
        with open(temp, 'rb') as f:
            with gzip.open(outpath, 'wb') as g:
                shutil.copyfileobj(f, g)
    finally:
        # Always remove the temporary file, even if writing failed
        if temp is not None and os.path.exists(temp):
            os.remove(temp)
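
The snippet above leaves two loose ends: the TODO about the label size changing after LABEL_RECORDS and ^IMAGE are set, and the temporary file needed for gzip output. Below is a rough sketch of one way both might be handled. It is an assumption on my part, not what this PR implements. render_label is a hypothetical stand-in for pvl.dumps(self.label), and layout_label and write_product are made-up helper names. The idea is to re-render the label until the record count stops changing, and to write the image with tobytes() instead of tofile(), since tobytes() returns plain bytes that any binary stream can accept, including one opened through gzip.

```python
import gzip
import io

import numpy as np


def render_label(record_bytes, label_records):
    # Hypothetical stand-in for pvl.dumps(self.label): returns the label
    # as bytes for a given LABEL_RECORDS value.
    text = (
        "RECORD_BYTES = {0}\r\n"
        "LABEL_RECORDS = {1}\r\n"
        "^IMAGE = {2}\r\n"
        "END\r\n"
    ).format(record_bytes, label_records, label_records + 1)
    return text.encode("ascii")


def layout_label(record_bytes, max_passes=10):
    """Return (label, padding) so that the label fills whole records."""
    label_records = 1
    for _ in range(max_passes):
        label = render_label(record_bytes, label_records)
        needed = -(-len(label) // record_bytes)  # ceiling division
        if needed == label_records:
            # The record count written in the label matches its real size
            return label, label_records * record_bytes - len(label)
        label_records = needed
    raise ValueError("label size did not stabilise")


def write_product(stream, record_bytes, image):
    """Write label, padding, and image; return the image byte offset."""
    label, padding = layout_label(record_bytes)
    stream.write(label)
    stream.write(b"\n" * padding)
    # tobytes() keeps the array's own byte order and needs no real file
    stream.write(image.tobytes())
    return len(label) + padding


# Usage: write straight into a gzip stream, then read the image back
image = np.arange(6, dtype=">i2").reshape(2, 3)
buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb") as g:
    offset = write_product(g, 32, image)

raw = gzip.decompress(buffer.getvalue())
restored = np.frombuffer(raw[offset:], dtype=">i2").reshape(2, 3)
```

Because the padding is computed only after the record count has settled, it can never be negative, and the image always starts on a record boundary.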

planetarypy / planetaryimage

WIP: Fix bugs in saving new PDS3 file #74