Open percurnicus opened 6 years ago
(Not sure why those commits carried over, but they can be disregarded.) See the note on tofile; I think it explains why we are having a problem.
The following code should fix the problem:
# Set label information about the image so it can be properly opened
# the label information will inform other software about where the
# label is and where the actual image starts, and the data type of the
# image
# We need the byte order the image is stored in
# (little-endian vs big-endian)
byteorder = self.image.dtype.byteorder
if byteorder == '=':
    if sys.byteorder == 'little':
        byteorder = '<'
    else:
        byteorder = '>'
# See planetaryimage/pds3image.py for the DTYPES dictionary
# SAMPLE_TYPE will be 'MSB_INTEGER' if the byte order is '>'
# (big-endian) or 'LSB_INTEGER' if the byte order is '<'
# (little-endian)
self.label['IMAGE']['SAMPLE_TYPE'] = self.DTYPES[byteorder + 'i']
# TODO: Figure out how to determine SAMPLE_BITS
# Probably just self.image.dtype.itemsize * 8
# Determine label padding. See https://go.nasa.gov/2MSA4ae (Chapter 5
# of the PDS3 Standards Reference) for a more in depth explanation of
# these fields.
# Get the total bytes of the label which is needed to determine the
# total number of records the label needs
label_bytes = len(pvl.dumps(self.label))
# Use the record bytes from the label - that should remain the same
# "The RECORD_BYTES data element identifies the number of bytes in each
# physical record in the data product file"
record_bytes = self.label['RECORD_BYTES']
# "The LABEL_RECORDS data element identifies the number of physical
# records that make up the PDS product label"
# We take the ceiling of label bytes / record bytes so that the label
# occupies a whole number of records.
label_records = int(np.ceil(label_bytes / float(record_bytes)))
self.label['LABEL_RECORDS'] = label_records
# ^IMAGE is a pointer that gives the record number at which the actual
# image starts. "Data object pointers in attached labels take one of two
# forms: ^<object_identifier> = nnn where nnn represents the starting
# record number within the file (first record is numbered 1), or
# ^<object_identifier> = nnn <BYTES> where nnn represents the starting
# byte location within the file (first byte is numbered 1)". Here we
# use record numbers
self.label['^IMAGE'] = label_records + 1
# Changing the above information can change the total number of bytes
# of the label so we have to get the label bytes again
# TODO: Figure out why and figure out how this could be an edge case
# that breaks saving to a file
label_bytes = len(pvl.dumps(self.label))
# We need to pad the label with newlines so the image actually
# starts at the record stated in the label above
label_padding = int((label_records) * (record_bytes) - label_bytes)
# Save the image with the attached label.
# For a non-gzipped image we:
# 1) Write the label out using pvl.dump
# 2) Pad the label so the image starts where the label says
# 3) Write the image data out
# For gzipped images we:
# 1) Perform the steps for a non-gzipped image on a temporary file
# 2) Gzip compress that file to the given output
# 3) Remove the temporary image
# Ideally we don't create a temporary image. My attempts to write to
# an opened gzipped file directly have failed.
outpath = os.path.join(outdir_name, outfile)
if nogzip:
    with open(outpath, 'wb') as stream:
        pvl.dump(self.label, stream)
        stream.write(b'\n' * label_padding)
        self.image.tofile(stream)
else:
    temp = None
    try:
        # mkstemp returns an open OS-level file descriptor along with the
        # path; close it right away so the descriptor is not leaked
        fd, temp = tempfile.mkstemp(suffix='.img')
        os.close(fd)
        with open(temp, 'wb') as f:
            pvl.dump(self.label, f)
            f.write(b"\n" * label_padding)
            self.image.tofile(f)
        with open(temp, 'rb') as f:
            with gzip.open(outpath, 'wb') as g:
                shutil.copyfileobj(f, g)
    finally:
        # Always remove the temporary image, even if writing failed
        if temp is not None and os.path.exists(temp):
            os.remove(temp)
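On the TODO about the label size changing after LABEL_RECORDS and ^IMAGE are set: one possible way to handle that edge case is to recompute the record count until it stops changing, so the padding can never come out negative. The sketch below is only an illustration, not what this PR does. `layout_label` and the `render` callable are hypothetical names; `render` stands in for something like `pvl.dumps(self.label)` and is assumed to return bytes.

```python
import math


def layout_label(render, record_bytes, max_passes=10):
    """Return (label_records, image_pointer, padding) once the size is stable.

    `render(label_records, image_pointer)` is a hypothetical callable that
    returns the serialized label as bytes for the given field values.
    """
    label_records = 1
    for _ in range(max_passes):
        image_pointer = label_records + 1  # records are numbered from 1
        label_bytes = len(render(label_records, image_pointer))
        needed = max(1, math.ceil(label_bytes / record_bytes))
        if needed == label_records:
            # The label fits in the records it claims, so padding is >= 0
            padding = label_records * record_bytes - label_bytes
            return label_records, image_pointer, padding
        label_records = needed
    raise RuntimeError("label size did not stabilize")


def render(label_records, image_pointer):
    # Toy stand-in for a real PVL serializer (assumption for this example)
    text = (
        "RECORD_BYTES = 64\r\n"
        f"LABEL_RECORDS = {label_records}\r\n"
        f"^IMAGE = {image_pointer}\r\n"
        "END\r\n"
    )
    return text.encode("ascii")


records, pointer, padding = layout_label(render, record_bytes=64)
```

With this approach the padded label is always an exact multiple of `record_bytes`, which is what makes the ^IMAGE record pointer valid.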
Fixes #71
Arrays are not saved correctly, so the data is not what is expected when a newly saved image is reopened. This PR aims to fix that bug.
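For context on the tofile note: the NumPy documentation says `ndarray.tofile` writes directly to the underlying file and bypasses the file object's `write` method, so it cannot be used with compressed file objects such as `GzipFile`. A possible way to avoid the temporary file would be to write raw bytes through the gzip stream instead. The sketch below is untested against planetaryimage itself; `write_gzipped` is a hypothetical helper, and `pixels` is a stand-in for what `self.image.tobytes()` would return.

```python
import gzip
import os
import tempfile


def write_gzipped(outpath, label_bytes, padding, image_bytes):
    """Write label, newline padding, and raw image bytes into one gzip stream."""
    with gzip.open(outpath, "wb") as stream:
        stream.write(label_bytes)
        stream.write(b"\n" * padding)
        stream.write(image_bytes)


label = b"LABEL_RECORDS = 1\r\nEND\r\n"
pixels = bytes(range(16))  # stand-in for self.image.tobytes() (assumption)
padding = 32 - len(label)  # pad the label out to one 32-byte record

with tempfile.TemporaryDirectory() as outdir:
    path = os.path.join(outdir, "example.img.gz")
    write_gzipped(path, label, padding, pixels)
    # Reopen the file to confirm the image bytes start where expected
    with gzip.open(path, "rb") as stream:
        restored = stream.read()
```

The trade-off is that `tobytes()` makes an in-memory copy of the array, which may matter for very large images; the temporary-file approach in this PR avoids that copy.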