adokter / bioRad

R package for analysis and visualisation of biological signals in weather radar data
http://adokter.github.io/bioRad

Return quantities in as.data.frame() in same order as listed in get_quantity() #382

Open peterdesmet opened 4 years ago

peterdesmet commented 4 years ago

The column order returned by as.data.frame() currently differs from the order listed in get_quantity(), and it also differs between as.data.frame(vp) and as.data.frame(vpts). Correct order:

radar
datetime

height
u
v
w
ff
dd
sd_vvp
gap
dbz
eta
dens
DBZH
n
n_all
n_dbz
n_dbz_all

lat
lon
height_antenna

day
sunrise
sunset

Update test to verify:

https://github.com/adokter/bioRad/blob/d5f19bfe851101d82355320851b5c206f6fbb136/tests/testthat/test-as.data.frame.R#L22-L29

adokter commented 4 years ago

Just a heads up that this function, and vpts objects in general, should be able to deal with additional user-added profile quantities. I already use this feature to add wind information (u_wind, v_wind) to the profile. One function already has hidden functionality to calculate airspeed and heading from these wind quantities (by subtracting the wind components from the ground speed components), see: https://github.com/adokter/bioRad/blob/5ee47fe2058d15e640aa28d80fb83491cde6165e/R/integrate_profile.R#L215-L222
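For readers unfamiliar with that calculation, here is a minimal sketch of the vector subtraction in base R. It is not the code at the link above: the helper name `airspeed_heading()` is hypothetical, and it assumes u and u_wind are eastward components, v and v_wind are northward components, and heading is expressed in degrees clockwise from north.

```r
# Hypothetical helper, not part of bioRad.
# Assumes u/u_wind point east, v/v_wind point north, all in m/s.
airspeed_heading <- function(u, v, u_wind, v_wind) {
  # Subtract the wind vector from the ground speed vector, per component
  u_air <- u - u_wind
  v_air <- v - v_wind
  # Magnitude of the air speed vector
  airspeed <- sqrt(u_air^2 + v_air^2)
  # Direction of the air speed vector, degrees clockwise from north, in [0, 360)
  heading <- (atan2(u_air, v_air) * 180 / pi) %% 360
  data.frame(airspeed = airspeed, heading = heading)
}

# Two altitude layers: eastward flight with a tailwind,
# and northward flight that is slower than the northward wind
res <- airspeed_heading(u = c(10, 0), v = c(0, 5),
                        u_wind = c(4, 0), v_wind = c(0, 8))
```

Because the function is vectorised, it can be applied to whole columns of a profile data frame at once.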

adokter commented 3 years ago

The orders are indeed different (surprise!). I think it's related to whether the file was read from .h5 or from flat text with read_vpts(). We don't have control over the order in which columns are loaded from .h5 files (it may differ from the order in which they were saved, a peculiarity of .h5). In the example below, the order of example_vp is mixed up, while example_vpts has the default order.

> names(as.data.frame(example_vp, geo=F,suntime=F))
 [1] "radar"     "datetime"  "ff"        "dbz"       "dens"      "u"         "v"         "gap"      
 [9] "w"         "n_dbz"     "dd"        "n"         "DBZH"      "height"    "n_dbz_all" "eta"      
[17] "sd_vvp"    "n_all"    
> names(as.data.frame(example_vpts, geo=F,suntime=F))
 [1] "radar"     "datetime"  "height"    "u"         "v"         "w"         "ff"        "dd"       
 [9] "sd_vvp"    "gap"       "dbz"       "eta"       "dens"      "DBZH"      "n"         "n_dbz"    
[17] "n_all"     "n_dbz_all"

So it seems we will have to enforce a column order in read_vpfiles(), which is a little unfortunate: it would be nice not to hard-code any names, as quantities may be added in the future.
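One possible way to reconcile a fixed order with user-added quantities is to put the known columns first, in the preferred order, and append any unknown columns in the order they arrived. The sketch below is only an illustration in base R: `preferred_order` is copied from the list at the top of this issue, and `reorder_columns()` is a hypothetical helper, not an existing bioRad function.

```r
# Preferred column order, copied from the list in this issue
preferred_order <- c(
  "radar", "datetime",
  "height", "u", "v", "w", "ff", "dd", "sd_vvp", "gap", "dbz", "eta",
  "dens", "DBZH", "n", "n_all", "n_dbz", "n_dbz_all",
  "lat", "lon", "height_antenna",
  "day", "sunrise", "sunset"
)

# Hypothetical helper: known columns first (in preferred order),
# then any extra columns (e.g. u_wind, v_wind) in their original order
reorder_columns <- function(df, preferred = preferred_order) {
  known <- intersect(preferred, names(df)) # keeps the order of `preferred`
  extra <- setdiff(names(df), preferred)   # keeps the order of `names(df)`
  df[, c(known, extra), drop = FALSE]
}

# Toy data frame with shuffled columns and one user-added quantity
df <- data.frame(ff = 1, u_wind = 2, radar = "demo", height = 0, dbz = 3)
reordered <- names(reorder_columns(df))
# "radar" "height" "ff" "dbz" "u_wind"
```

With this approach only the preferred order is hard-coded. Columns missing from a file are skipped silently, and quantities added in the future still pass through, just at the end.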