Additional information in cloudnet plots/figures

actris-cloudnet / cloudnetpy

Python package for Cloudnet data processing

MIT License

Additional information in cloudnet plots/figures #88

Closed spirrobe closed 1 year ago

spirrobe commented 1 year ago

Hi, it's me, again. :-)

Due to the creation of many cloudnet plots for side-by-side comparison with different radars, I started adding some extra information:

- a copyright notice in the bottom left, with the creation time of the figure
- a small info box stating which device(s) the data in the plot come from, determined by looking at the source attribute in the netCDF files; when none is available, the data are assumed to come from all of them, i.e. the global source attribute. I would love to have the serial number of the device in there too, but it is not readily available from for example the RPG FMCWs and would require more changes

A caveat is the rather arbitrary hardcoded positioning, which results from the changes tight_layout makes to matplotlib figures; any better approach would be appreciated. Would you be interested in a PR adding this functionality? Both would be off by default in cloudnetpy/plotting/plotting.py to not change the current functionality.

Some example plots are attached for illustration.

[Attached figures: 20230228_lwp, 20230228_target_classification_detection_status, 20230228_Z_lwp_beta, 20230228_beta]

siiptuo commented 1 year ago

> Hi, it's me, again. :-)

Ideas and feedback are always welcome!

> I would love to have the serial number of the device in there too, but it is not readily available from for example the RPG FMCWs and would require more changes

CloudnetPy tries to write the serial_number global attribute where possible. It should be available for CL61, CHM15k and PollyXT.

In the Cloudnet data portal we identify instruments with instrument PIDs such as https://hdl.handle.net/21.12132/3.90b1e5245b11487d. This is handled outside of CloudnetPy, but the instrument_pid global attribute can be found in files downloaded from the data portal.

> Would you be interested in a PR adding this functionality? Both would be off by default in cloudnetpy/plotting/plotting.py to not change the current functionality.

Yes, these sound like useful additions! I agree these changes should be off by default as we're using the plotting routines in production.

spirrobe commented 1 year ago

Great, I'll start the PR tomorrow. I see the serial_number in the chm15k cloudnet file that we create, but it does not get passed on to the cat/class cloudnet files and so gets "lost" along the way. I think it would be great to be able to add this via the meta dict for those devices where the raw file does not have this info (or potentially to overwrite it) and then have it either

- added to the source attr of the relevant timeseries of the cat/class file, e.g. chm15k SN to beta, mwr SN to lwp, in the form " (SN: XXXX)". This would tie in nicely with this extension of plotting and add that information automatically without a further plotting-related PR
- added as a separate attr, which would keep type of device and SN separate and more atomic, but would mean that I would add this again to the plotting when the SN is implemented. (my pref despite more work)
- Other preference?

Would you approve of this in general, and which path would you prefer?

siiptuo commented 1 year ago

Yes, passing the instrument serial numbers in the metadata would be useful.

> added as a separate attr, which would keep type of device and SN separate and more atomic, but would mean that I would add this again to the plotting when the SN is implemented. (my pref despite more work)

I think this would be the best way. It's a good idea to keep this machine readable. Storing this in NetCDF-4 Classic is a bit awkward, but something like this could work:

:source = "RPG-Radiometer Physics RPG-FMCW-94\n",
    "Lufft CHM15k\n",
    "RPG-Radiometer Physics HATPRO\n",
    "ECMWF Integrated Forecast System (IFS)" ;
:source_serial_numbers = "123123\n",
    "129391293\n",
    "\n",
    "" ;
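The parallel, newline-delimited layout above can be read back by splitting both attributes and pairing the entries by position. The following is only a minimal sketch of that idea: the helper name `pair_sources_with_serials` is hypothetical and not part of CloudnetPy, and it assumes that an empty entry means "no serial number known".

```python
from typing import List, Optional, Tuple


def pair_sources_with_serials(
    source: str, serials: str
) -> List[Tuple[str, Optional[str]]]:
    """Pair each source line with the serial number at the same position.

    Hypothetical helper: assumes both attributes are newline-delimited and
    that an empty serial entry means the serial number is unknown.
    """
    names = source.split("\n")
    numbers = serials.split("\n")
    if len(names) != len(numbers):
        raise ValueError("source and serial number lists differ in length")
    # Empty strings become None so callers can tell "unknown" apart easily.
    return [(name.strip(), number.strip() or None)
            for name, number in zip(names, numbers)]


# The attribute values from the example above, as they would be read back
# from the file (the CDL string pieces are concatenated into one string).
example_source = (
    "RPG-Radiometer Physics RPG-FMCW-94\n"
    "Lufft CHM15k\n"
    "RPG-Radiometer Physics HATPRO\n"
    "ECMWF Integrated Forecast System (IFS)"
)
example_serials = "123123\n129391293\n\n"
example_pairs = pair_sources_with_serials(example_source, example_serials)
```

Keeping the two attributes aligned by position is what makes the empty entries necessary: dropping them would shift later serial numbers onto the wrong instrument.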

spirrobe commented 1 year ago

Follow-up to be consistent before PR with naming and for the planned possibility to pass in SN via the meta-dict:

Do you want source_serial_number**s** or source_serial_number? Looking at the cat files in particular, they contain a source attr (usually one device) for many variables, while most other files only keep global ones. So far I have based the adjustment to the plots on the following:

- if the variable has a source attr, take it, then the variable's source_serial_number; if the source_serial_number is not available, do not look up the global one.
- if not, take the global source attr, then look at the global source_serial_numbers

Personally, I think source_serial_number is better for the single variable and source_serial_numbers for the global one, but one name would be simpler for others I presume.
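The lookup order described above can be sketched with plain dictionaries standing in for the variable and global netCDF attributes. This is an illustration only, not the code from the PR: the function name `resolve_source_info` and the dict-based interface are assumptions.

```python
from typing import Mapping, Optional, Tuple


def resolve_source_info(
    var_attrs: Mapping[str, str], global_attrs: Mapping[str, str]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (source, serial) following the variable-first lookup order.

    Hypothetical helper: attributes are passed as plain mappings rather
    than read from a netCDF file.
    """
    source = var_attrs.get("source")
    if source:
        # Variable-level source found: use only the variable-level serial
        # number and deliberately do not fall back to the global one.
        return source, var_attrs.get("source_serial_number")
    # No variable-level source: use the global pair instead.
    return global_attrs.get("source"), global_attrs.get("source_serial_numbers")


example_globals = {
    "source": "Lufft CHM15k\nRPG-Radiometer Physics HATPRO",
    "source_serial_numbers": "129391293\n",
}
with_variable_source = resolve_source_info({"source": "Lufft CHM15k"}, example_globals)
without_variable_source = resolve_source_info({}, example_globals)
```

Skipping the global fallback in the first branch avoids attaching a multi-instrument serial list to a variable that names a single device.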

siiptuo commented 1 year ago

> a source attr (usually one device) for many variables

I forgot that source can also be a variable attribute.

> Personally, I think source_serial_number is better for the single variable and source_serial_numbers for the global one.

Sounds good, as it matches the file_uuid and source_file_uuids attributes.

spirrobe commented 1 year ago

I took the liberty and extended this PR (#90) a bit to include

- sources and serial number, if existing (I'll add a gist later of a script that I used for testing to process existing files, i.e. add ncattrs)
- fine-tuning via keyword for whether it should show only the sources or also the serial
- option to add a grid (major x, minor x and major y, similar to minor x)
- watermark option with text settable via keyword, plus a keyword-settable creation time
- changed the ticklabels (_get_standard_time_ticks) to also return 00:00 and 24:00 (including the test for it)
- added minor ticks in addition to the major ticks every 4 hours

All except the last two are set by keywords. The grid could/should allow for custom settings via a dict at some point in the future, but for now it is set to be visible but very non-intrusive (which required adding the zorder kw to all ax.plot or similar calls, hence the many changed lines). As the last two are changes that would also affect the production pipeline as it is now, let me know whether I should roll those back for the PR. Below are two plots for a direct comparison of the two (one with all options, one without, i.e. only minor ticks and all xticklabels). Disregard the shown data, as they are nonsense due to some processing issues for that day.

[Two attached plots]
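To make the tick change concrete, here is a rough sketch of tick positions for a 0 to 24 hour axis with labelled major ticks every 4 hours, including both 00:00 and 24:00, and unlabelled minor ticks in between. It is not the `_get_standard_time_ticks` implementation from CloudnetPy; the function name `daily_time_ticks` and the hourly minor spacing are assumptions made for illustration.

```python
from typing import List, Tuple


def daily_time_ticks(
    major_step: int = 4, minor_step: int = 1
) -> Tuple[List[int], List[str], List[int]]:
    """Return (major positions, major labels, minor positions) in hours.

    Hypothetical helper: the minor tick spacing of one hour is an
    assumption, the thread only fixes the 4-hour major spacing.
    """
    if major_step <= 0 or minor_step <= 0:
        raise ValueError("tick steps must be positive")
    # range(0, 25, ...) includes hour 24 when the step divides 24 evenly,
    # so both 00:00 and 24:00 receive a label.
    major = list(range(0, 25, major_step))
    labels = [f"{hour:02d}:00" for hour in major]
    # Minor ticks fill the gaps and never coincide with a major tick.
    minor = [hour for hour in range(0, 25, minor_step) if hour not in major]
    return major, labels, minor


major_ticks, major_labels, minor_ticks = daily_time_ticks()
```

With matplotlib, these lists could be applied through `ax.set_xticks(major_ticks)`, `ax.set_xticklabels(major_labels)` and `ax.set_xticks(minor_ticks, minor=True)`.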

spirrobe commented 1 year ago

Closed via PR #90; I will open a separate issue for serial numbers to allow a focused discussion.