Closed niconoe closed 2 years ago
`height` even though altitude might be more "correct" @adokter @CeciliaNilsson709?

A few more clarifications and questions:
`vph5-to_vpts` will have to refuse to work sometimes (because the source vp data is not consistent in terms of height). The advantage is simplicity for consumers (CROW, for example, would become much more complex if, when reading the data, it cannot guess which heights will be available for the next timestamps). There's some nuance here: a simple occasional height gap is easier to deal with than totally random heights for every timestamp.

`[0, 200, 400, 500, 600, 800, ...]` (note the odd `500`). Then at least a consumer knows beforehand what heights to expect or select. The data could have 1 or more of those heights for a timestamp. Would that help e.g. CROW?

@peterdesmet: yep, all that seems reasonable to me!
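
To make the idea of a height list known beforehand concrete, here is a minimal R sketch of how a consumer might flag heights it was not told to expect. The helper `undeclared_heights()` and the `observed` values are hypothetical, and the declared list only reuses the values visible in the example above (the elided tail is left out).

```r
# Heights a consumer has been told to expect (only the values shown above;
# the real list continues beyond 800).
declared_heights <- c(0, 200, 400, 500, 600, 800)

# Hypothetical helper: heights present in the data that were never declared.
undeclared_heights <- function(heights, declared) {
  sort(setdiff(heights, declared))
}

# Illustrative heights gathered over two timestamps; 700 is not declared.
observed <- c(0, 200, 400, 600, 0, 200, 500, 700)
undeclared_heights(observed, declared_heights)
```

An empty result would mean every timestamp only uses declared heights, even if some timestamps have fewer of them.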
https://github.com/enram/vpts/issues/11#issuecomment-859398426:
Thanks @adokter!
For this standard, what is your suggestion for the agreement on the height (bottom or middle of the bin)? For the first version, I suggest aligning the standard as much as reasonably possible with vol2bird (for the same reason, I suggest documenting the fact that the height is actually above mean sea level).
Difficult one to call: I see the practical advantage of sticking to the current vol2bird output (bottom of bin), but from an analysis point of view the center of the bin is more informative/intuitive.
I have been following the conversation silently. I would be careful about changing standards, but clear documentation should help. A problem may arise if people compare previously and recently processed data, something to keep in mind with other repositories like the one at UvA, or people that previously downloaded the ENRAM repository. Another option is to provide a conversion table between bottom and center of the bin. Regarding altitude or height, altitude is the correct term even if height is more commonly used.
I agree with Judy: I think we should stick to what we have been doing (bottom of bin, and height rather than altitude) to avoid a confusing mismatch with already existing data/processing. It is, after all, quite simple for users to change to mid-bin etc. themselves afterwards if they want. It also makes sense to me to stick to "height", as that is what is in the meteorological data, and renaming it might give the impression it has been changed.
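
As a rough illustration of how small that user-side change is, here is a minimal R sketch that derives a mid-bin height from a bottom-of-bin `height` column. The data frame, the `height_mid` column name and the constant 200 m bin thickness are all assumptions made for the example, not part of the format.

```r
# Hypothetical VPTS-like table: `height` is the bottom of each bin (metres).
vpts <- data.frame(
  radar = "frnim",
  datetime = "2017-11-24T02:15:00Z",
  height = c(0, 200, 400, 600)
)

# Assumed constant bin thickness in metres (illustrative value).
bin_thickness <- 200

# Shift from bottom-of-bin to center-of-bin in a new column,
# leaving the original `height` untouched.
vpts$height_mid <- vpts$height + bin_thickness / 2

vpts$height_mid
```

This only works as written when the bin thickness is constant and known; data with varying intervals would need the thickness per row.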
My experience is that both the data and the use cases can vary quite widely, especially in terms of height coverage, time intervals etc., so I would opt for keeping it flexible where possible and for making sure it's all very well documented, of course.
Update: I think we can summarise the consensus like this (shout if you don't agree!):
I'll make sure this is all clearly reflected in the documentation/specifications then I'll close this issue.
Based on the last comment, I have reflected this in the format as:
To be explicit, should it be mentioned that they also have the same values (not only that the number of values should match)? Also note that it is not uncommon to encounter data where the height has been calculated up to either 4 or 5 km, and I have encountered mixing issues there.

> To be explicit, should it be mentioned that they also have the same values?

Ok, will do that.

> Also note that it is not uncommon to encounter data where the height has been calculated up to either 4 or 5 km, and I have encountered mixing issues there.

Can you clarify with an example?
Here is a small example of heights going to 4000 and 5000 meters:
```r
require(bioRad)
#> Loading required package: bioRad
#> Welcome to bioRad version 0.6.0
#> Docker daemon running, Docker functionality enabled (vol2bird version 0.5.0)

# Download a vp file to a temporary .h5 file and read it
f <- function(x) {
  tmp <- tempfile(fileext = ".h5")
  download.file(x, tmp, mode = "wb") # binary mode, needed on Windows
  read_vpfiles(tmp)
}
h <- c(
  "https://lw-enram.s3-eu-west-1.amazonaws.com/fr/nim/2017/11/24/02/frnim_vp_20171124T0215Z_0x7.h5",
  "https://lw-enram.s3-eu-west-1.amazonaws.com/fr/nim/2016/09/21/02/frnim_vp_20160921T0215Z.h5"
)
f(h[1])$attributes$where$maxheight
#> [1] 5000
f(h[2])$attributes$where$maxheight
#> [1] 4000
```
Created on 2022-05-10 by the reprex package (v2.0.1)
@bart1 given that some radars have different max height over time, should we still add the requirement that the heights are constant across time?
As an aside: currently bioRad can't handle vpts objects that have different maximum heights or height intervals. But we could decide that we want to support that in the future. It will be quite a bit of work to implement.
Currently we state:
> Data SHOULD have the same `height`s for all `datetime`s of a `radar`.

SHOULD = This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
I think that covers the use case we want to cover now.
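
For anyone who wants to check their own data against that recommendation, here is a minimal R sketch. The helper `has_constant_heights()` is hypothetical (it is not a bioRad function), and the example tables are made up; the only assumption about the data is that it has `radar`, `datetime` and `height` columns.

```r
# Hypothetical helper: TRUE when, for every radar, all datetimes report
# exactly the same set of heights.
has_constant_heights <- function(vpts) {
  per_radar <- split(vpts, vpts$radar)
  all(vapply(per_radar, function(d) {
    # One "0,200,400"-style signature per datetime.
    signatures <- tapply(
      d$height, as.character(d$datetime),
      function(h) paste(sort(unique(h)), collapse = ",")
    )
    length(unique(signatures)) == 1
  }, logical(1)))
}

# Made-up data: two timestamps with identical heights.
consistent <- data.frame(
  radar = "frnim",
  datetime = rep(c("2017-11-24T02:15:00Z", "2017-11-24T02:30:00Z"), each = 3),
  height = rep(c(0, 200, 400), times = 2)
)

# Same data plus a third timestamp that reaches one bin higher.
mixed <- rbind(
  consistent,
  data.frame(
    radar = "frnim",
    datetime = "2017-11-24T02:45:00Z",
    height = c(0, 200, 400, 600)
  )
)

has_constant_heights(consistent)
has_constant_heights(mixed)
```

The second table mimics the 4 km versus 5 km situation on a small scale: the rule is a SHOULD, so such data would still be valid, but a consumer could use a check like this to decide how to handle it.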
1) Do we guarantee to consumers that the same heights will be available for each timestamp?
2) Do we document the available heights in the metadata file? (The alternative is to let the reader infer them from the content of the CSV file.)
3) Is it "good enough" if the standard states heights are always expressed as meters above sea level (data type: positive integer)?
4) Is it better to call the column `height` or `altitude`?