reconhub / linelist

An R package to import, clean, and store case data
https://www.repidemicsconsortium.org/linelist

guess_dates should accept vector of column names in clean_dates #76

Open scottyaz opened 5 years ago

scottyaz commented 5 years ago

It would be helpful to also allow a character vector specifying column names (or even tidy notation such as col_a:col_m). Note that this is not a bug.

zkamvar commented 5 years ago

guess_dates() explicitly works on vectors. Would this not be taken care of by the clean_dates() function, which wraps guess_dates()?

scottyaz commented 5 years ago

Sorry, I meant that the guess_dates argument of clean_dates seems to accept only logical or numeric vectors. It's easy enough to figure out which column numbers we want, but it would be easier for many users to just specify the names of the columns.
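In the meantime, column names can be translated into the numeric or logical form by hand. The following is a minimal base R sketch, assuming the guess_dates argument accepts numeric positions or a logical vector as described in this comment. The data frame and its column names are hypothetical, and no linelist function is called here.

```r
# Hypothetical line list; the column names are illustrative only
x <- data.frame(
  id          = 1:3,
  date_onset  = c("2019-01-01", "2019-01-05", "2019-02-10"),
  date_report = c("2019-01-03", "2019-01-07", "2019-02-12"),
  stringsAsFactors = FALSE
)

# Columns we would like to treat as dates, given by name
wanted <- c("date_onset", "date_report")

# Numeric positions of those columns; fail early on a misspelled name
idx <- match(wanted, names(x))
stopifnot(!anyNA(idx))
idx
# [1] 2 3

# Equivalent logical vector, one entry per column of x
is_date_col <- names(x) %in% wanted
is_date_col
# [1] FALSE  TRUE  TRUE
```

Either idx or is_date_col could then be supplied wherever numeric or logical column selectors are expected.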

thibautjombart commented 5 years ago

Not entirely sure how straightforward the added feature would be. @zkamvar do you think this belongs in the prep for release 0.1.0, or shall we put a pin in it for later releases?

For what it's worth, an easy workaround would be the following, e.g. for date columns whose names contain the character string "date":

x %>% 
  clean_data() %>% 
  mutate_at(vars(contains("date")), guess_dates, error_tolerance = 1)

This assumes a default of guess_dates = FALSE in clean_data, cf. PR https://github.com/reconhub/linelist/pull/103
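The same select-by-name-then-convert pattern can be sketched in base R without dplyr. This is only an illustration: as.Date() stands in for guess_dates() so that the example runs without linelist installed, and the data frame and column names are hypothetical.

```r
# Hypothetical line list; the column names are illustrative only
x <- data.frame(
  id          = 1:3,
  date_onset  = c("2019-01-01", "2019-01-05", "2019-02-10"),
  date_report = c("2019-01-03", "2019-01-07", "2019-02-12"),
  stringsAsFactors = FALSE
)

# Positions of all columns whose name contains the literal string "date"
date_cols <- grep("date", names(x), fixed = TRUE)

# Convert only those columns; as.Date() is a stand-in for guess_dates()
x[date_cols] <- lapply(x[date_cols], as.Date)

# Classes of the resulting columns
vapply(x, function(col) class(col)[1], character(1))
```

Columns whose names do not match (here, id) are left untouched, which mirrors what vars(contains("date")) does in the pipeline above.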

zkamvar commented 5 years ago

Because there is a workaround, I would be much happier to put this on the backburner.

thibautjombart commented 5 years ago

> Because there is a workaround, I would be much happier to put this on the backburner.

Perf, untagging this from the project for the first release, and we can always get to this later.

@scottyaz most welcome to PR this if you feel like it ;)