actris-cloudnet / cloudnetpy

Python package for Cloudnet data processing
MIT License

Extend sources serial number functionality for different instruments and/or via meta dict #91

Open spirrobe opened 1 year ago

spirrobe commented 1 year ago

general idea

When several devices operate at the same site, it is useful to know which Cloudnet file (especially the cat/class/+ files) contains data from which device. (We operate 3 radars at one campaign site, and technically also 2 MWRs and 2 lidars, each of the same type.)

serial number from raw files

So far, some Cloudnet processing steps (e.g. ceilo2nc) store the serial number of the device for certain instrument types (CHM15k, for example), taken from the netCDF files. The same would be possible for the RPG HATPRO (MWR) netCDF files, which contain the attribute "Serial_Number" (the string value has a leading space that needs to be removed). Where available, this can be included (availability might also depend on the device and the manufacturer's software version). Notably, this information is not present in the RPG HATPRO binary files.

Where not available, this information could/should be passed via the site_meta dict, with the dict taking precedence over the ncfile attributes.
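The precedence rule above could be sketched roughly as follows. This is only an illustration: resolve_serial_number is a hypothetical helper, not an existing cloudnetpy function, and it assumes the site_meta key is 'serial_number' and that the file attributes have already been read into a plain dict.

```python
from typing import Mapping, Optional


def resolve_serial_number(
    site_meta: Mapping[str, object],
    nc_attrs: Mapping[str, object],
    nc_key: str = "Serial_Number",
) -> Optional[str]:
    """Return a serial number, preferring site_meta over file attributes.

    Hypothetical helper for illustration; not part of cloudnetpy.
    """
    # Candidates in order of precedence: site_meta first, raw file second.
    candidates = (site_meta.get("serial_number"), nc_attrs.get(nc_key))
    for candidate in candidates:
        if candidate is None:
            continue
        # Strip whitespace, e.g. the leading space in HATPRO netCDF files.
        cleaned = str(candidate).strip()
        if cleaned:
            return cleaned
    # Neither source provided a usable value.
    return None
```

With this sketch, an empty or missing site_meta entry falls back to the file attribute, and a missing value in both places yields None so the caller can decide whether to omit the attribute.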

sources attribute

The cat/class/+ files usually (at least as far as I've seen) contain a global source attribute that lists the device combination (4 sources in general). This can/should be extended to the individual variables, i.e. to their attributes in the files. After all, Z in the cat file comes from the radar, whereas the global source includes all instruments. Some variables already contain the source attribute in the current version. The global source attribute can therefore be added partially to each variable, depending on the field type. Together with #90, this means a plot will contain exactly the information that led to the creation of the specific variable. For class files, the global and variable-specific source would then be equal. This approach creates some redundancy, but the variables become more atomic and stand-alone.
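One way to derive the per-variable source from the per-instrument sources is a small lookup table, sketched below. The mapping VARIABLE_INSTRUMENTS and the function variable_sources are assumptions made up for this example; the real assignment of variables to instruments would have to come from cloudnetpy itself.

```python
from typing import Dict, List, Mapping

# Hypothetical mapping from variable name to the instrument types behind it.
VARIABLE_INSTRUMENTS: Dict[str, List[str]] = {
    "Z": ["radar"],
    "v": ["radar"],
    "beta": ["lidar"],
    "lwp": ["mwr"],
}


def variable_sources(instrument_sources: Mapping[str, str]) -> Dict[str, str]:
    """Build a per-variable source string from per-instrument source strings."""
    result: Dict[str, str] = {}
    for name, instruments in VARIABLE_INSTRUMENTS.items():
        # Keep only instruments for which a source string is known.
        known = [instrument_sources[i] for i in instruments if i in instrument_sources]
        if known:
            # Several contributing instruments are listed one per line.
            result[name] = "\n".join(known)
    return result
```

Variables whose instruments are unknown are simply skipped, so existing source attributes would not be overwritten with empty strings.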

serial numbers attribute and expected addition

Similarly, the serial numbers should be available at the global level as well as for each variable. Again, this creates redundancy but makes it clearer which instrument each variable originated from, e.g. in the ncdump -h categorize_file output:

Z:source_serial_number = "RADARSN" ;
Z:source = "RPG-Radiometer Physics RPG-FMCW-94" ;
v:source_serial_number = "RADARSN" ;
v:source = "RPG-Radiometer Physics RPG-FMCW-94" ;
sldr:source_serial_number = "RADARSN" ;
sldr:source = "RPG-Radiometer Physics RPG-FMCW-94" ;
beta:source_serial_number = "LIDARSN" ;
beta:source = "Lufft CHM15k" ;
lwp:source_serial_number = "MWRSN" ;
lwp:source = "RPG-Radiometer Physics HATPRO" ;

e.g. ncdump -h drizzle_file

        float mu(time, height) ;
                mu:_FillValue = 9.96921e+36f ;
                mu:units = "1" ;
                mu:source = "RPG-Radiometer Physics RPG-FMCW-94\n",
                        "Lufft CHM15k" ;
                mu:source_serial_number = "RADARSN\n",
                        "LIDARSN" ;
....

                :source_serial_numbers = "RADARSN\n",
                        "LIDARSN\n",
                        "MWRSN\n",
                        "" ;
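The multi-line attribute values in the dumps above could be produced by joining the serial numbers in the same order as the sources, as in this minimal sketch. join_serial_numbers is a hypothetical name, and the instrument keys ("radar", "lidar", "mwr", "model") are placeholders chosen for the example.

```python
from typing import Mapping, Sequence


def join_serial_numbers(instruments: Sequence[str], serials: Mapping[str, str]) -> str:
    """Join serial numbers in the same order as the sources, one per line."""
    # Unknown serial numbers (e.g. for the model) become empty entries, so
    # line i of the serial numbers still corresponds to line i of the sources.
    return "\n".join(serials.get(name, "") for name in instruments)


serials = {"radar": "RADARSN", "lidar": "LIDARSN", "mwr": "MWRSN"}

# Global attribute: all instruments plus the model, which has no serial number.
global_value = join_serial_numbers(["radar", "lidar", "mwr", "model"], serials)

# Per-variable attribute for a variable derived from radar and lidar only.
mu_value = join_serial_numbers(["radar", "lidar"], serials)
```

Keeping the empty entry for the model preserves the one-to-one correspondence between the lines of the source and serial number attributes.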

looking for input/comments

roadmap / tasks (draft, unordered)

related

Plotting as of v1.55.0 / #90 supports both global and per-variable sources and serial numbers for figures (and checks for them).

The gist [cloudnet_add_serial_number_2_cloudnetfiles.py](https://gist.github.com/spirrobe/dde782662bda45feeaeacd15526f062f) is an example of how to postprocess some Cloudnet nc files, adding global serial numbers, per-variable serial numbers and, where applicable, per-variable sources.

siiptuo commented 12 months ago

> site_meta key would be 'serial_number'

Sounds good!

> Would this create too much redundancy? In terms of file size, I think the few attrs are negligible

I think it's worth having rich and unambiguous metadata even if it adds some redundancy.

> Serial number of model -> This would actually be the version of the model (or at least that is the closest information that would make sense) but I'm not sure this is readily available. So far I added an empty serial number for the model.

We have thought about ways of identifying models in more detail, but for now it can be left empty.