enram / vpts-csv

Data exchange format for biological signals detected by weather radars
https://aloftdata.eu/vpts-csv/
MIT License

Comments by @baptischmi #34

Closed peterdesmet closed 2 years ago

peterdesmet commented 2 years ago

Copied and adapted from Slack

Thank you and well done! I especially like the addition of data quality, in terms of the number of data points used for the main values of flight, dbd, and sd_vvp, and the 'vcp'. I have a few minor comments; I hope some are helpful:

peterdesmet commented 2 years ago

@baptischmi what do you mean with:

define the value below/above the minimum/maximum allowed, e.g. 'u', 'v', 'dens'.

A minimum and maximum are defined for those three variables.

adokter commented 2 years ago
peterdesmet commented 2 years ago

any thoughts on whether duplication of data is something we want to avoid at all cost?

I don't think we have to avoid it at all costs. If many people use ff and dd, it's a welcome convenience. We could opt to make u and v the required terms. We could opt to remove ff and dd (but in that case we should list how to calculate ff and dd).
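If ff and dd were ever dropped, the calculation to document could look like the sketch below. It is not taken from the format definition. It assumes u is the west-to-east and v the south-to-north ground speed component in m/s, and that dd is the direction of movement in degrees clockwise from north. The function name `speed_and_direction` is made up for illustration.

```python
import math


def speed_and_direction(u, v):
    """Derive ground speed ff (m/s) and direction dd (degrees) from u and v.

    Assumptions (not confirmed in this thread): u points west to east,
    v points south to north, and dd is measured clockwise from north.
    """
    ff = math.hypot(u, v)  # sqrt(u**2 + v**2)
    # atan2(u, v) measures the angle from the north axis towards the east.
    dd = math.degrees(math.atan2(u, v)) % 360.0
    return ff, dd


ff, dd = speed_and_direction(3.0, 4.0)
print(f"ff = {ff:.1f} m/s, dd = {dd:.1f} degrees")
```

Note that dd has no meaningful value when both u and v are 0; the sketch returns 0 in that case.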

Created a separate issue for the metadata #35.

adokter commented 2 years ago

I prefer to keep ff and dd, as they are easier to interpret

baptischmi commented 2 years ago

@baptischmi what do you mean with:

define the value below/above the minimum/maximum allowed, e.g. 'u', 'v', 'dens'.

A minimum and maximum are defined for those three variables.

@peterdesmet For instance, 'ff' speeds above 100 m/s are already far from reliable, but which value is set if, because of a processing artefact, you get speeds higher than 100 m/s? Would you set the value to NA, or to the maximum value allowed (i.e. 100 m/s)?

PS: This comment is not valid for 'dens' (as stated in my original message), for which the maximum is 'Inf', but only for 'u', 'v', 'w', 'ff'.

bart1 commented 2 years ago

I just scanned the format definition. The goal is stated quite generally: "VPTS is a community developed data exchange format for biological signals detected by weather radars". So it seems it should also be able to describe results from other identification algorithms. If that is not really the goal, this comment can be ignored; otherwise I feel some of the required fields are quite specific to vol2bird, for example sd_vvp and possibly dd and ff. It is not a given that every algorithm produces those metrics.

adokter commented 2 years ago

The goal should be for it to be general, and I think all the required fields will be output by any profiling algorithm: some measure of speed (ff), direction (dd) and density (dens or dbz or eta) for each of the altitude layers. The only required field whose requirement is somewhat debatable is speed variability sd_vvp, but the profiling algorithms I'm aware of all provide it.

bart1 commented 2 years ago

The goal should be for it to be general, and I think all the required fields will be output by any profiling algorithm: some measure of speed (ff), direction (dd) and density (dens or dbz or eta) for each of the altitude layers. The only required field whose requirement is somewhat debatable is speed variability sd_vvp, but the profiling algorithms I'm aware of all provide it.

I guess most algorithms can indeed calculate proxies for these numbers. If you know that most have something implemented, I guess that is good.

peterdesmet commented 2 years ago

For instance, 'ff' speeds above 100 m/s are already far from reliable, but which value is set if, because of a processing artefact, you get speeds higher than 100 m/s? Would you set the value to NA, or to the maximum value allowed (i.e. 100 m/s)?

@adokter do you have any advice on what values to use when artefacts result in higher-than-maximum values?

adokter commented 2 years ago

I would set it as the max allowed (100 m/s) - but the min/max thresholds are chosen such that adjustments like that shouldn't be necessary, or at least very rare. I checked a few years of data for several radars in the US, and I haven't seen ff go above 60 m/s.
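For data producers, the two options discussed here (capping at the maximum versus writing a missing value) could be sketched as follows. This is only an illustration: the 100 m/s maximum comes from this thread, the minimum of 0 is an assumption, and the function names `clamp_ff` and `ff_or_missing` are hypothetical.

```python
import math

FF_MAX = 100.0  # m/s, the maximum discussed in this thread
FF_MIN = 0.0    # m/s, assumed lower bound for a speed


def clamp_ff(ff, ff_min=FF_MIN, ff_max=FF_MAX):
    """Option 1: cap out-of-range speeds at the nearest allowed bound."""
    if ff is None or math.isnan(ff):
        return ff  # leave missing values untouched
    return min(max(ff, ff_min), ff_max)


def ff_or_missing(ff, ff_min=FF_MIN, ff_max=FF_MAX):
    """Option 2: replace out-of-range speeds with a missing value (None)."""
    if ff is None or math.isnan(ff):
        return None
    return ff if ff_min <= ff <= ff_max else None


print(clamp_ff(123.4))       # capped at the maximum
print(ff_or_missing(123.4))  # written as missing instead
```

Capping keeps the record valid against the min/max constraints, while the missing-value variant avoids reporting a speed that was never measured.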

peterdesmet commented 2 years ago

Do you mean by "unknown" radar wavelength that vol2bird used a default setting, which is 5.3 and 10.6 for C- and S-band, respectively?

@baptischmi: No, I mean that if you do not know the wavelength, but know it is a C-band radar, you should set radar_wavelength to 5.3. @adokter are wavelength values used by vol2bird when processing? Would they then always be known in the produced files?

adokter commented 2 years ago

Yes, the wavelength is needed to calculate eta from reflectivity (dBZ). vol2bird reads it from the raw file if available as metadata. If not, it assumes a C-band radar with a wavelength of 5.3 cm.
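To show why the wavelength matters, here is a minimal sketch of the commonly used conversion from reflectivity factor (dBZ) to reflectivity eta (cm^2/km^3). It is not vol2bird's actual code. It assumes the standard relation eta = (1000 * pi^5 / wavelength^4) * |K|^2 * Z with the wavelength in cm and |K|^2 = 0.93 (the dielectric factor of liquid water), and it mirrors the C-band fallback of 5.3 cm described above. The function and constant names are illustrative.

```python
import math

DEFAULT_WAVELENGTH_CM = 5.3  # C-band fallback mentioned in this thread
K_SQUARED = 0.93             # assumed dielectric factor |K|^2 for liquid water


def dbz_to_eta(dbz, wavelength_cm=None):
    """Convert reflectivity factor (dBZ) to reflectivity eta (cm^2/km^3).

    Falls back to the C-band default when no wavelength is known.
    """
    if wavelength_cm is None:
        wavelength_cm = DEFAULT_WAVELENGTH_CM
    z_linear = 10.0 ** (dbz / 10.0)  # dBZ to linear units (mm^6/m^3)
    return 1000.0 * math.pi ** 5 / wavelength_cm ** 4 * K_SQUARED * z_linear


# The same dBZ gives a different eta for C-band and S-band radars.
print(dbz_to_eta(0.0, 5.3))
print(dbz_to_eta(0.0, 10.6))
```

Because eta scales with the inverse fourth power of the wavelength, a wrong band assumption changes the result by a factor of 16 between 5.3 cm and 10.6 cm.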

peterdesmet commented 2 years ago

@adokter any way you would rephrase this then (without making it too complicated)?

Wavelength of the radar in cm. If unknown, use 5.3 cm for C-band and 10.6 cm for S-band radar.

I propose:

Wavelength of the radar in cm. 5.3 cm is typically assumed for C-band radar and 10.6 cm for S-band radar.

bart1 commented 2 years ago

@peterdesmet I was just thinking: should the documentation indicate which columns are typically constant (within a radar/scan)? For example, radar latitude does not really change within a radar. It is kind of obvious, but otherwise you might produce valid files that cannot really be handled or processed further, at least with bioRad.

peterdesmet commented 2 years ago

@bart1 👍 updated in 1c9da40.

Constant for all records from the same radar.
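A data producer could check this before publishing a file. The sketch below is one possible way to do so, not part of the format or of bioRad. It assumes records are dictionaries (for example from `csv.DictReader`) and that the columns are named `radar`, `radar_latitude`, `radar_longitude`, `radar_height` and `radar_wavelength`; adjust the names if the definition differs.

```python
from collections import defaultdict

# Assumed names of columns that should not vary within one radar.
RADAR_CONSTANT_COLUMNS = (
    "radar_latitude",
    "radar_longitude",
    "radar_height",
    "radar_wavelength",
)


def find_varying_columns(records, radar_column="radar",
                         columns=RADAR_CONSTANT_COLUMNS):
    """Return {radar: [columns with more than one distinct value]}."""
    distinct_values = defaultdict(set)
    for record in records:
        for column in columns:
            if column in record:
                distinct_values[(record[radar_column], column)].add(record[column])

    varying = defaultdict(list)
    for (radar, column), values in distinct_values.items():
        if len(values) > 1:
            varying[radar].append(column)
    return {radar: sorted(cols) for radar, cols in varying.items()}
```

An empty result means every checked column is constant within each radar; any radar that appears in the result has at least one column that changes between records.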

adokter commented 2 years ago

@adokter any way you would rephrase this then (without making it too complicated)?

@peterdesmet slight tweak:

Wavelength of the radar in cm. Most C-band radars operate at approximately 5.3 cm wavelength, and most S-band radars at approximately 10.6 cm.

peterdesmet commented 2 years ago

Thanks, updated. And with that we have addressed all comments by @baptischmi listed at the start of this issue.