Closed erblanm closed 1 week ago
For changing the dtype, you can look into it, but this is not mandatory if too difficult. As the ouput is a binary mask, I can see several possibilities:
So I am testing the rastertools hillshade on DEM which contains np.NaN values and it seems rastertools is already handling NaN data values as a 0 value so it won't be a problem to set dtype to a different type than Float type. I tried setting binary type but as we are working with numpy array, the lowest type would be np.int8 or np.uint8.
So I initialized data type hillshade output to np.int8 and it worked properly with a gain of ~0.25MB compared to Float dtype
Isn't dtype=bool working?
@queyruto, we are almost there, I would like to have your opinion about this change.
This looks good to me, but let's wait a bit for @queyruto if he can.
During this time, please make sure tests are OK with those changes as we have no CI currently.
Sorry for my late answer. Not sure I exactly get the difference between processing_dtype
and dtype
.
If dtype
is the dtype of the output date and processing_dtype
is the dtype of the processed input data, why not naming them "out_dtype" and "in_dtype" ? I agree that it is not a major point ;)
Everything else sounds perfect!
@queyruto, this would probably be clearer, but we didn't want to rename the dtype
attribute which was already used. So I propose to use in_dtype
and dtype
if you prefer, to minimize changes and potential issues or unexpected errors.
But if you think we should modify dtype
and rename it as out_dtype
, we can also do it.
@queyruto I'm merging because we need to validate this for the next upcoming Let It snow version. Feel free to open an issue if you'd like to modify some changes!
Add the lwz compression to hillshade final output :
For the given DEM 25WFS with customized parameters, rastertools output size without compression is 151MB rastertools output size with lzw compression is 1.1MB --> Results look exactly the same, huge memory space is freed. --> No negative impact known.
I tests a dtype = np.uint8 (instead of default np.float32), rastertools output size with lzw compression is 0.77MB --> Results look exactly the same, a bit more space could be freed. --> But as np.nan - which is typed Float - can be encountered in our data, it requires further research to implement a change in dtype.