Small-Bodies-Node / pds4_tools

Python package to read and display NASA PDS4 data.

Remove use of integer overflow when applying scaling/offset #98

Closed LevN0 closed 1 month ago

LevN0 commented 3 months ago

In pds4_tools.reader.data_types.adjust_array_data_type, an integer overflow can occur on scaling / offset application when the scaled data is smaller than the initial data. The array is converted to the new data type prior to the scaling, at which point the values do not yet fit into that type. As far as I can see, the only use of this function is in pds4_tools.reader.data_types.apply_scaling_and_value_offset, which then applies an operation that cancels out the overflow. It therefore did not actually cause invalid values (at least in the test suite), but I am not absolutely sure that this is true in general.
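The effect can be reproduced outside the package with plain NumPy. The snippet below is only an illustrative sketch, not the pds4_tools code: the sample values, the offset of -32768, and the choice of uint16 input with an int16 target are assumptions made for the demonstration.

```python
import numpy as np

# Assumed sample data: unsigned 16-bit values whose offset-corrected
# result fits into a signed 16-bit integer.
raw = np.array([0, 40000, 65535], dtype=np.uint16)
value_offset = -32768

# Reference result, computed in a type wide enough to never overflow.
expected = raw.astype(np.int64) + value_offset

# Casting to the target type *before* applying the offset wraps the
# values that do not fit into int16 yet.
cast_first = raw.astype(np.int16)
print(cast_first.tolist())  # [0, -25536, -1]

# Adding the offset in int16 arithmetic wraps a second time, which
# happens to cancel the first wrap-around for this data.
cancelled = cast_first + np.int16(value_offset)
print(cancelled.tolist())
print(np.array_equal(cancelled, expected))
```

The intermediate array holds values that are wrong on their own; the final result is only correct because both wrap-arounds happen modulo 2**16 in the same integer type.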

This fix corrects that bug and also goes beyond the protection the prior code was supposed to provide: pds4_tools.reader.data_types.get_scaled_numpy_type is modified to handle the include_unscaled argument more explicitly, so that integers are not cast down if the initial data does not fit.
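To illustrate the idea behind include_unscaled, here is a minimal sketch of a type-selection helper. The function name pick_integer_dtype, its signature, and the candidate list are hypothetical and do not reflect the actual get_scaled_numpy_type implementation; it also assumes an integer scaling factor and offset.

```python
import numpy as np

# Candidate types, ordered from smallest to largest (an assumption of
# this sketch, not the package's own ordering).
_CANDIDATES = (np.int8, np.uint8, np.int16, np.uint16,
               np.int32, np.uint32, np.int64, np.uint64)


def pick_integer_dtype(raw, scaling_factor=1, value_offset=0,
                       include_unscaled=True):
    """Hypothetical helper: smallest integer dtype that holds the scaled
    data and, if include_unscaled is set, the unscaled data as well."""
    lo, hi = int(raw.min()), int(raw.max())

    # Python integers are arbitrary precision, so these cannot overflow.
    bounds = [lo * scaling_factor + value_offset,
              hi * scaling_factor + value_offset]
    if include_unscaled:
        bounds += [lo, hi]

    need_min, need_max = min(bounds), max(bounds)
    for candidate in _CANDIDATES:
        info = np.iinfo(candidate)
        if info.min <= need_min and need_max <= info.max:
            return np.dtype(candidate)
    raise OverflowError("no candidate integer type can hold the data")


raw = np.array([0, 40000, 65535], dtype=np.uint16)
safe = pick_integer_dtype(raw, value_offset=-32768)
tight = pick_integer_dtype(raw, value_offset=-32768, include_unscaled=False)
print(safe, tight)  # int32 int16
```

With include_unscaled enabled, the array can be converted before the scaling is applied without any wrap-around, at the price of a wider type.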

Additionally, I improve memory efficiency at the cost of CPU / run time by casting integers down after the scaling / offset is applied.
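The trade-off can be sketched as follows. This is an assumed illustration rather than the code in this PR: scale_then_downcast is a hypothetical helper, it uses int64 as the working type, and it only handles an integer scaling factor and offset.

```python
import numpy as np


def scale_then_downcast(raw, scaling_factor=1, value_offset=0):
    """Hypothetical helper: scale in a wide type, then shrink the result."""
    # Work in int64 so the intermediate values cannot wrap for the
    # ranges used here (assumption: int64 is wide enough).
    wide = raw.astype(np.int64) * scaling_factor + value_offset

    # The extra min/max pass and the second cast are the CPU cost paid
    # for the smaller final array.
    lo, hi = int(wide.min()), int(wide.max())
    for candidate in (np.int8, np.int16, np.int32):
        info = np.iinfo(candidate)
        if info.min <= lo and hi <= info.max:
            return wide.astype(candidate)
    return wide


raw = np.array([100000, 100100, 132767], dtype=np.int32)
scaled = scale_then_downcast(raw, value_offset=-100000)
print(scaled.dtype, scaled.nbytes, raw.nbytes)  # int16 6 12
```

Because the range check runs on the already scaled values, the cast down can never wrap, regardless of how the unscaled data relates to the final type.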