Collection of issues, ideas and suggestions for netCDF files

ZPYin commented 3 years ago

As an end user, you probably have a lot of experience working with the netCDF files output by Pollynet_Processing_Chain. Here, we want to collect your ~~complaints~~ thoughts on the products. We want to know what you think should be added to facilitate your data analysis, e.g., uncertainties, molecular backscatter, etc.

We will try our best to implement your ideas in Picasso v3.0, which will probably be released next year. But if you think your request needs to be resolved quickly, please submit it as a new issue.

HolgerPollyNet commented 3 years ago

I think all calibration constants used should be stored somewhere in the netCDF file, either as attributes or as variables.

Furthermore, I think it would be great if all the information used from the config file could be transferred to the output.
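
One way to picture the config-to-output idea is to flatten the configuration into a flat name/value mapping that could then be written as global attributes. The following is only a minimal sketch: the function `config_to_attributes`, the `config_` prefix and all keys in `example_config` are hypothetical and not taken from the Picasso code.

```python
import json


def config_to_attributes(config, prefix="config"):
    """Flatten a nested config dict into a flat {attribute_name: value} dict."""
    attrs = {}
    for key, value in config.items():
        name = f"{prefix}_{key}"
        if isinstance(value, dict):
            # Nested sections become part of the attribute name.
            attrs.update(config_to_attributes(value, prefix=name))
        elif isinstance(value, (bool, list, tuple)) or value is None:
            # Booleans, None and lists are stored as JSON text for simplicity.
            attrs[name] = json.dumps(value)
        else:
            # Plain numbers and strings are kept as they are.
            attrs[name] = value
    return attrs


# Hypothetical config content, for illustration only.
example_config = {
    "overlap_height_m": 800.0,
    "depol_cal": {"constant": 0.05, "method": "delta90"},
    "flag_use_default": True,
}
attributes = config_to_attributes(example_config)
```

The resulting dictionary could be handed to whatever routine writes the global attributes, so that every file documents the settings it was produced with.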

HolgerPollyNet commented 3 years ago

For 3.0, one could also think of one dedicated module producing EARLINET netCDF in ADDITION to our Picasso netCDF.

HolgerPollyNet commented 3 years ago

Concerning my previous comment, the EARLINET netCDF product should be a product for lidar dummies ;-) (let's say, users who just use the profile without needing to know anything about Polly, e.g., satellite people or the modelling community).

This means only valid data is displayed. Thus, the height region of incomplete overlap is excluded, and the lowest cloud top in the averaging period minus, let's say, 500 m is used as the top of the profile. Of course, if the target cat is working well and the EARLINET netCDF format allows it, we can also flag the height regions with clouds as clouds. But this needs to be investigated. If the error analysis covers the overlap region reliably, this region could also stay in the profile, IF the errors are realistic.
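
The height selection described in this comment can be sketched as follows. This is an illustration under assumptions, not the Picasso implementation: the function names, the argument names and the example numbers are hypothetical, and the 500 m margin is the "let's say" value from the comment.

```python
def valid_height_range(full_overlap_height_m, cloud_top_heights_m,
                       profile_top_m, margin_m=500.0):
    """Return (bottom, top) of the height range to keep, or None if empty."""
    bottom = full_overlap_height_m  # exclude the incomplete-overlap region
    top = profile_top_m
    if cloud_top_heights_m:
        # Lowest cloud top in the averaging period minus a safety margin.
        top = min(top, min(cloud_top_heights_m) - margin_m)
    if top <= bottom:
        return None
    return (bottom, top)


def mask_profile(heights_m, values, height_range):
    """Replace values outside the valid height range with None."""
    if height_range is None:
        return [None] * len(values)
    bottom, top = height_range
    return [v if bottom <= h <= top else None
            for h, v in zip(heights_m, values)]


# Illustrative numbers only.
heights = [300.0, 900.0, 1500.0, 2100.0, 2700.0]
backscatter = [1.0, 2.0, 3.0, 4.0, 5.0]
height_range = valid_height_range(800.0, [2500.0, 4000.0], 10000.0)
masked = mask_profile(heights, backscatter, height_range)
```

Flagging cloudy regions instead of cutting the profile would need an additional flag variable, which is the part that still needs to be investigated.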

Moritz-TROPOS commented 3 years ago

It would be great if the calibration constant V used for depolarization calibration could be stored in the netCDF file. As an optional addition, the date and time when it was derived could be stored. Furthermore, the transmission ratios for each of the depolarization channels (total and cross), or maybe already the GHK parameters, could be stored as attributes. If we are working with changing transmission ratios, I would propose not to use V as the calibration constant, since it already contains the transmission ratios, but to use "eta", which is just the geometric mean (square root of the product) of the +45° and -45° calibrations.
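
The geometric mean mentioned here is simple to write down. The sketch below assumes that the two inputs are the calibration values obtained at the +45° and -45° positions; the function name, the attribute names and the numbers are hypothetical.

```python
import math


def eta_from_pm45(cal_plus45, cal_minus45):
    """Geometric mean of the +45 degree and -45 degree calibration values."""
    if cal_plus45 <= 0 or cal_minus45 <= 0:
        raise ValueError("calibration values must be positive")
    return math.sqrt(cal_plus45 * cal_minus45)


# Illustrative values only.
eta = eta_from_pm45(0.04, 0.09)

# Hypothetical attribute names for storing the result with its time stamp.
depol_cal_attributes = {
    "depol_cal_eta": eta,
    "depol_cal_time": "2021-01-01T00:00:00Z",  # example time stamp
}
```

Because eta does not contain the transmission ratios, those would have to be stored separately, as proposed in the comment.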

Moritz-TROPOS commented 3 years ago

Hello, I have checked the recent netcdf output file carefully and came up with a list of some (minor) improvements, mostly typos and clarifications:

In the long name of the depol, it should be stated "particle linear depolarization ratio" and "volume linear depolarization ratio" to be clear (linear)
In the comment of the depol and depol_uncertainty: Raman method / Klett method with capital R / K
LR_aeronet_xxx: long name should be: "aerosol lidar ratio at xxx nm retrieved with constrained-AOD method"
Typo in "constrained" in long name of aerBsc_aeronet_xxx and comment of LR_aeronet_xxx
In the comments of the quantities from the previous point: "The result is ..." or "The results are ..."
Concerning the reference height, it should be mentioned somewhere, that it is an interval (2 values for upper and lower boundary). Maybe in the long name: Reference height interval at xxx nm or in the comment: "The reference height interval is searched ..."
In the long name of parDepol_... and uncertaintyparDepol... Klett and Raman method should be mentioned to not have the same long name twice.

Maybe I can change these points in the code myself.

Moritz-TROPOS commented 3 years ago

And now some ideas, to be discussed...

1) For the Aeronet method, could we provide the AOD and the time stamp of the closest Aeronet observation?
2) Should we provide the lidar ratio (Raman method) with the effective resolution? That would mean using different smoothing lengths for the extinction and backscatter coefficients.
3) Error estimates for the derived products would be cool, but require much more work.
4) I've just checked the global attributes of the netCDF files from the lidar products of the Max Planck Institute. There, a copyright statement is included: "The copyright for these data is with the Max-Planck-Institut fuer Meteorologie, Hamburg. Any use is subject to written consent of the copyright owner." Do we need anything similar? Furthermore, I found DetectionMode = "Photoncounting". All Pollies work in photon-counting mode, but maybe it could be mentioned.
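
For the first idea (AOD and time stamp of the closest Aeronet observation), the selection step could look roughly like this. It is a sketch under assumptions: the observations are taken to be already available as (time, AOD) pairs, and the function name, the attribute names, the one-hour limit and the example values are all hypothetical.

```python
from datetime import datetime, timedelta


def closest_observation(target_time, observations, max_gap=timedelta(hours=1)):
    """Pick the (time, aod) pair closest to target_time, or None if too far."""
    if not observations:
        return None
    obs_time, aod = min(observations, key=lambda obs: abs(obs[0] - target_time))
    gap = abs(obs_time - target_time)
    if gap > max_gap:
        return None
    # Hypothetical attribute names for the output file.
    return {
        "aeronet_time": obs_time.isoformat(),
        "aeronet_aod": aod,
        "aeronet_time_lag_s": gap.total_seconds(),
    }


# Illustrative observations only.
observations = [
    (datetime(2021, 3, 1, 11, 40), 0.21),
    (datetime(2021, 3, 1, 12, 15), 0.25),
    (datetime(2021, 3, 1, 14, 0), 0.30),
]
match = closest_observation(datetime(2021, 3, 1, 12, 0), observations)
```

Storing the time lag next to the AOD lets a user judge how representative the Aeronet value is for the lidar profile.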

HolgerPollyNet commented 3 years ago

Thank you, Moritz, for your ideas.

1) In general I agree: all information which is read from the config file should be transferred to the output netCDF. I also think it is a great idea to store the date of the calibration constants used (lidar constant, water vapour, depol) and the information from other auxiliary data, such as the Aeronet site, AOD and time of observations. If you like, you can try to work on it. Zhenping might help with a brief introduction on which branch and which file to use.

2) would be another issue, and I think it is currently out of scope for the Picasso 3.0 release. But we can think about it later.

3) is in progress: https://github.com/PollyNET/Pollynet_Processing_Chain/issues/54

4) We now have a TROPOS data policy which puts all data under CC-BY-SA. Nevertheless, as other institutes are also involved, we need to think about something like this with care. I will open another issue.

HolgerPollyNet commented 3 years ago

@Moritz-TROPOS
Concerning your first comment, please also keep in mind that we have the following issue: https://github.com/PollyNET/Pollynet_Processing_Chain/issues/79

Given your experience you could also start working on that! We welcome every contribution.

HolgerPollyNet commented 3 years ago

From Patric:

1) The Picasso output files (att_bsc, vol_depol) do not contain any information about the zenith angle. It would be good to have it included. It would also be required in case someone wants to calculate optical properties from the AttBSC files (to account for range effects). Specifically, CloudnetPy wants to have both range and height in the calibrated files. Sure, one can assume an angle of 5°, but it would be better to take that value dynamically from the input file.

2) The attribute names are not consistent. Some start with an uppercase letter, some with a lowercase letter. This makes automatic processing of the files somewhat inconvenient, and it also implies that the attribute names are still ‘under construction’ and subject to change.
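
The relation between range and height that makes the zenith angle necessary is plain geometry. The sketch below assumes a straight beam and a flat geometry; the function name and the optional station altitude argument are hypothetical, and 5° is the fallback value mentioned in the comment.

```python
import math


def range_to_height(range_m, zenith_angle_deg, station_altitude_m=0.0):
    """Convert range along the beam to height, given the zenith angle."""
    return station_altitude_m + range_m * math.cos(math.radians(zenith_angle_deg))


ranges = [1000.0, 5000.0, 10000.0]
heights = [range_to_height(r, 5.0) for r in ranges]
```

At 5° the difference between range and height is about 0.4 %, i.e. roughly 38 m at 10 km range, which is why storing the actual angle is preferable to assuming it.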

ZPYin commented 3 years ago

These are very good suggestions.

Regarding saving the zenith angle, it can be accomplished by editing each data-saving script (these are under the io folder).

Regarding attribute names, I tend to use lowercase. But I need to point out that the EARLINET database requires some global attributes with an uppercase leading character, e.g., Conventions, Data_Originator, etc. That's why we have some attributes starting in uppercase. As a solution, we can consistently start attribute names in lowercase to satisfy Cloudnet processing, and change the polly2earlinetDB scripts to deal with the lowercase/uppercase issue, so that the EARLINET format is still ensured.
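
The proposed split (lowercase names in the Picasso files, EARLINET capitalization restored on export) could be sketched like this. It is not the polly2earlinetDB code: the helper names are hypothetical, the set of EARLINET names contains only the two examples from this comment, and the attribute values are placeholders.

```python
# Only the two names mentioned in this thread; the real list would be longer.
EARLINET_UPPERCASE_NAMES = {"Conventions", "Data_Originator"}


def lower_first(name):
    """Lowercase only the leading character of an attribute name."""
    return name[:1].lower() + name[1:]


def upper_first(name):
    """Uppercase only the leading character of an attribute name."""
    return name[:1].upper() + name[1:]


def to_internal(attrs):
    """Attribute names as they would appear in the Picasso output."""
    return {lower_first(name): value for name, value in attrs.items()}


def to_earlinet(attrs):
    """Restore the leading uppercase letter where EARLINET expects it."""
    converted = {}
    for name, value in attrs.items():
        candidate = upper_first(name)
        key = candidate if candidate in EARLINET_UPPERCASE_NAMES else name
        converted[key] = value
    return converted


# Placeholder values, for illustration only.
original = {"Conventions": "CF-1.8", "Data_Originator": "TROPOS", "location": "Leipzig"}
internal = to_internal(original)
exported = to_earlinet(internal)
```

Keeping the mapping in one explicit table means the export step is the only place that needs to know about EARLINET's naming.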

PollyNET / Pollynet_Processing_Chain

Collection of issues, ideas and suggestions for netCDF files #81