adokter / bioRad

R package for analysis and visualisation of biological signals in weather radar data
http://adokter.github.io/bioRad
Other
29 stars 16 forks source link

rename function arguments #112

Closed adokter closed 6 years ago

adokter commented 6 years ago

So far we have focussed on renaming the function names, but the function argument names also need improvements. E.g. integrate_profile has alt.min argument, which still has the dot convention we want to deprecate. This is still quite a lot of work..., but I think we cannot push anything to master until this is done.

We could contact the editors of Ecograpy to forward install instructions to the editors of the paper to a branch that has not yet renamed arguments, only function names. Then we don't have to release to the public yet and have more time.

Functions

Internal functions

adokter commented 6 years ago

See https://github.com/adokter/bioRad/issues/51#issuecomment-374747768 for how to handle deprecation/renaming of arguments such that package remains backward compatible

adokter commented 6 years ago

This issue is related to issue #94. I noticed some functions have been ticked there, even when their parameter names are not optimal (e.g. containing dots)

peterdesmet commented 6 years ago

Agreed, it is something we should tackle: I’ll add a todo list to this issue.

Ok for creating branch and contacting editor. Would be nice if #108 and #117 are accepted and included.

Updated #94: that should only concern documentation, not parameters.

adokter commented 6 years ago

108 and #117 have been merged into develop

peterdesmet commented 6 years ago

@adokter, we'll let you make a suggestion for standardized and logical parameter names, before we provide further input.

adokter commented 6 years ago

Internal functions

adokter commented 6 years ago

@peterdesmet @stijnvanhoey Any chance you can have a look at this before the training school in Oklahoma starting August 14? I'd like to push for a new release by then with corrected function arguments.

peterdesmet commented 6 years ago

Hi @adokter I created a file with all the parameters for all functions: parameters.txt. Makes it easier to compare these. Here's my review:

Be consistent with underscore:

Use underscore where it might make sense (i.e. these parameters don't have an underscore yet, but it would makes sense to do it). These are suggestions:

Adriaan: I'm not sure about the changes below, because many of R's default functions use xlab, xlim etc. Many people are there very much used to it. I added latlim/lonlim myself to match the R convention. lab stands for label. Maybe @stijnvanhoey can weigh on this as well? > Keep as is.

What is the difference between (should they get the same name?):

Unintuitive names:

Maybe write in full:

adokter commented 6 years ago

Thanks @peterdesmet, added some thoughts in italic

peterdesmet commented 6 years ago

Ok!

  1. Can you update the name suggestions that you approved (azim_bin, range_bin)
  2. Keep row.names: ok
  3. xlab and co: fine to keep as is. @stijnvanhoey is on holiday
  4. proj4string: if part of proj4, maybe call it proj4_string then? Adriaan: I would stick with proj4string, this is the argument that is also used in the widely used sp package.
  5. Let's use file then? Adriaan: OK
  6. Let's use files then? Adriaan: Or also file for consistency? E.g. in read_vpfiles can also be used to read a single file.
  7. Let's use path or maybe directory (to differentiate from file) then? Adriaan: ok, no preference
  8. coords_polar: would only update if we change coords as well, so best to keep it as is. Adriaan: ok, it's a hidden function anyhow, so not very important
adokter commented 6 years ago

Sounds good these names, added comments in your comment above

peterdesmet commented 6 years ago
  1. proj4string: ok to keep

5/6. I would keep the difference between file and files. Using file for all makes it less clear which functions do support more than a single file (and not all functions do). For those that do support more, I think files is intuitive enough that you can supply a list of one. Note regarding filename vs file: readr also uses file, which aligns nicely. 👍 Combined with directory, we would get:

download_vpfiles(start_date, end_date, country, radar, directory = ".")
is.pvolfile(file)
get_odim_object_type(file)
read_pvolfile(file, ...)
read_vpfiles(files)
read_vpts(file, ...)
read_vpfiles(files)
read_vpts(file, ...)
select_vpfiles(directory, ...)
is.vpfile(file)

# internal
is.odimfile(file)
read_pvolfile_scan(file, ...)
read_pvolfile_quantity(file, ...)
read_vp(file)
quantity_name(file, ...)
read_odim_profile_data(file, ...)
match_filenames((filelist, regexlist)) # this is an internal function, I would keep the arguments as such, maybe but with underscore
  1. I would also update col to color in the internal function add_color_transparency(col, alpha = 1)

If you agree, can you update the list at https://github.com/adokter/bioRad/issues/112#issuecomment-394833402, so we have an overview of what needs to be changed?

adokter commented 6 years ago

Update the list at https://github.com/adokter/bioRad/issues/112#issuecomment-394833402. Only check items have changes

stijnvanhoey commented 6 years ago

For the project_as_ppi, we adapted latlim and lonlim to ylim and xlim respectively, but here we keep the lat and lon description? Keep that as such of should we always use either y/x or lat/lon?

stijnvanhoey commented 6 years ago

Regarding the param versus quantity usage as argument, I do think we are introducing some inconsistencies with the above proposed set:

param.R                                       # scan parameter
read_pvolfile.R: read_pvolfile(file, param,...)  # scan parameter
composite_ppi(x, quantity = "DBZH",...)  # scan parameter <- ISSUE
get_quantity(...)    # quantity of a vp/vpts
map.R: map(x, map, quantity,...)  # scan parameter  <- ISSUE
plot.ppi.R: plot(x, quantity,...) # scan parameter  <- ISSUE
plot.scan.R: plot(x, quantity,...) # scan parameter  <- ISSUE
plot.vp.R: plot(x, quantity = "dens",...) # quantity of a vp/vpts
plot.vpi.R: plot(x, quantity = "mtr",...)   # quantity 
plot.vpts.R: plot(x, xlab = "time", ylab = "height [m]", quantity,...)  # quantity 

color_scale.R: color_scale(quantity,...   <- ISSUE
color_scale.R: color_scale_fill(quantity,...  <- ISSUE
map.R: get_zlim(quantity) <- ISSUE
sample_polar(quantity,... <- ISSUE

For each of the items with <- ISSUE I will NOT use quantity as argument, but param, corresponding to the scan parameter see https://github.com/adokter/bioRad/commit/985884d1ee29544668036105d9cc1d85fb234751

stijnvanhoey commented 6 years ago

Notice, in the overview the functionsd_vvp is defined, but the function has been renamed to sd_vvp_threshold https://github.com/adokter/bioRad/commit/34e42a2ca869687436e4c9b3d978bf872b62fbcb

adokter commented 6 years ago

Regarding param versus quantity in https://github.com/adokter/bioRad/issues/112#issuecomment-409850808, it might actually be more easy for users to always have quantity, the difference between a scan parameter or a derived quantity is pretty subtle, and most people might not understand it.

adokter commented 6 years ago

https://github.com/adokter/bioRad/issues/112#issuecomment-409827982: you're right this might be confusing. I would be ok with replacing latlim with ylim and lonlim with xlim

stijnvanhoey commented 6 years ago

While implementing and checking docs etc. I felt having a difference between the scan parameters according to the OPERA data information versus vp/vpts/... quantities is less confusing than quantities everywhere (more intuition than real argument). @peterdesmet any thoughts?

peterdesmet commented 6 years ago

I think we have had the param vs quantity argument before and opted to keep param as that is the (unchangeable) name that is used in a pvol file: pvol > scan > param?

There might be some gain in using a single name for param/quantity, but not sure how much. I notice that in the bird profile things are called quantity. @adokter I'll let you decide this one.

peterdesmet commented 6 years ago

Note: I have updated https://github.com/adokter/bioRad/issues/112#issuecomment-394833402 to only include those functions that we change (you can still see the full list at the beginning of this issue or in the version history of the comment). I'll check them off when reviewing https://github.com/adokter/bioRad/pull/127.

adokter commented 6 years ago

@peterdesmet @stijnvanhoey I think we have to go with quantity only. Currently indeed we have quantity refer to different data columns in a vp, and param refer to different data sweeps in a scan. But these get mixed up already, e.g. we have DBZH as a quantity in a vp, but also as a param in a scan. Also, I might add in the future the option to plot scan parameters like DBZH as a transformed quantity (e.g. as reflectivity eta, which is currently only a quantity in a profile).

In short: the two are getting mixed up already, so I think the least confusing is to stick to one.

peterdesmet commented 6 years ago

Makes sense, “quantity” it is. @stijnvanhoey, can you update this in #127? @adokter, does this affect the figure/text in the paper?

stijnvanhoey commented 6 years ago

So, if we only keep quantity, what happens with the file param.R?

#' Object of class \code{param}: a parameter of a scan of a polar volume
#'
...
#' @details
#' Scan parameters are simple matrices, with the following specific attributes:

will this become a quantity class? and do we speak about Scan quantities are...?

I do think the class param is counterintuitive with using scan quantities.

peterdesmet commented 6 years ago

@adokter https://github.com/adokter/bioRad/pull/127 was merged, but there is still the open issue of scan param vs quantity to be resolved. Let us know what you decide.

adokter commented 6 years ago

Hi @peterdesmet, you're right that one is still open, @stijnvanhoey got me doubting again. Letting go of param breaks the OPERA/ODIM convention, which formed the backbone of the naming conventions. So I'm hesitant to introduce scan quantities when OPERA uses scan parameters. So there is sense in keeping param strictly for the quantities in the low-level data after all (polar volumes and scans), as you wrote in https://github.com/adokter/bioRad/issues/112#issuecomment-411070067, and use quantity strictly for data columns in profiles, and for derived/transformed quantities.

I already have (soon to be included in bioRad) functionality that makes rasters (stored as ppi objects) of derived quantities (i.e. raster images of vertically integrated bird density, using information of multiple elevations scans to make one image). However, these ppi's always have only one quantity in the raster, therefore there is no need to specify a quantity/param argument when plotting them with map or plot.ppi. A ppi with multiple raster layers in fact only occurs now when making a ppi of a scan, and these only contain params.

So I think we can keep param, and nothing needs to be changed.

Here a listing of functions with param/quantity arguments for future reference

adokter commented 6 years ago

think we can close this one:)