TUW-GEO / ascat

Read and visualize data from the Advanced Scatterometer (ASCAT) on-board the series of Metop satellites
https://ascat.readthedocs.io/
MIT License
23 stars 16 forks source link

Fix dtype nan irregularities in EPS/BUFR readers #66

Closed claytharrison closed 1 month ago

claytharrison commented 2 months ago

Should fix #65, but not sure if these changes are just fixing a mistake or if the current status is intentional and these changes would disrupt something else down the line? @sebhahn

sebhahn commented 2 months ago

@claytharrison int8 would be the better datatype in this case (since it is not a bitfield flag), would it be possible to stick with it and fix the casting/nan? It seems to work with other variables stored as int8 dtype

sebhahn commented 2 months ago

A boolean datatype is interpreted as a unsigned int https://github.com/TUW-GEO/ascat/blob/master/src/ascat/read_native/eps_native.py#L727

but later we treat it as (or cast it to) signed int

Edit: perhaps try adding a cast here "swath_indicator": ("swath_indicator", np.int8, int_nan),

claytharrison commented 2 months ago

If I understand the code correctly, this cast happens after

Edit: perhaps try adding a cast here

If I understand the code correctly, if I add this cast, it would occur after the errors mentioned in the issue. In that case I believe making both changes, as well as adding another cast here would produce the desired result?

claytharrison commented 2 months ago

Also this code seems strange to me: https://github.com/TUW-GEO/ascat/blob/e5c02d4ccc08c59ca9057bfc148a40f0a098f792/src/ascat/read_native/eps_native.py#L834-L837

We specify a new name, a new dtype, and the current nan value for each generic field. Then we pop the old name from data, convert it to the new dtype and assign it to the new name. Then, if the current nan value is not None, we find all current nans in the array and set them to float32_nan? Shouldn't the new nan value depend on the new dtype?

sebhahn commented 2 months ago

you are right, can you please try if this fixes the issue?

claytharrison commented 1 month ago

Working on it - out of the following integer vars, which should end up as uint8 and which should be int8?

swath_indicator

f_kp

f_usable

flagfield_gen2

claytharrison commented 1 month ago

OK, I think I deduced the answer to the last question (f_usable and swath_indicator as int8, f_kp as uint8, and flagfield_gen2 doesn't end up in final output anyway). I think I've sorted everything now assuming that's correct.

claytharrison commented 1 month ago

OK I'm actually done with this now from my end unless there are other comments :+1:

sebhahn commented 1 month ago

thanks!