Open sebdalgarno opened 6 years ago
I like it. My only suggestions are that it should be function(data, year = "Year", month = "Month", day = "Day", hour = "Hour", minute = "Minute", second = "Second", col = "DateTime", remove = TRUE, tz = getOption("poisix.tz", "Etc/GMT+8"))
because
1) folks may only want to change the name of one of the columns
2) the conversion can occur afterwards
3) the default time zone should be a global option
An interesting idea: if, for example, second = 0L
then the seconds are automatically set to 0 (useful if the column is missing)
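A minimal base R sketch of that signature, just to make the suggestion concrete. The function name construct_datetime() and the option name "example.tz" are placeholders I made up, not the package's API, and input validation is left out. A string is treated as a column name and a number as a constant applied to every row.

```r
# Hypothetical sketch: construct_datetime() and "example.tz" are illustrative names only.
construct_datetime <- function(data, year = "Year", month = "Month", day = "Day",
                               hour = "Hour", minute = "Minute", second = "Second",
                               col = "DateTime", remove = TRUE,
                               tz = getOption("example.tz", "Etc/GMT+8")) {
  parts <- list(year = year, month = month, day = day,
                hour = hour, minute = minute, second = second)
  # A string names a column; anything else is used as a constant for every row.
  values <- lapply(parts, function(p) {
    if (is.character(p)) data[[p]] else rep(p, nrow(data))
  })
  data[[col]] <- ISOdatetime(values$year, values$month, values$day,
                             values$hour, values$minute, values$second, tz = tz)
  if (remove) {
    # Drop only the columns that were actually read, never the new column.
    used <- unlist(Filter(is.character, parts), use.names = FALSE)
    drop <- setdiff(used, col)
    data <- data[, setdiff(names(data), drop), drop = FALSE]
  }
  data
}

# No Second column here, so the seconds are supplied as a constant.
df <- data.frame(Year = 2018L, Month = 3L, Day = 14L, Hour = 9L, Minute = 30L)
out <- construct_datetime(df, second = 0L)
```

With remove = TRUE the component columns are dropped, so out is left with only the DateTime column.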
I wonder if there should be two functions for datetime.
ps_ymdhms_datetime
ps_dt_datetime
Seems to me that these are the two most likely scenarios
This opens a whole can of worms as datetimes can come in so many formats, but if we can capture the most common...
An elegant solution seems to be to have just one constructor function based on year, month, day, hour, minute, second etc., plus a generic deconstruction function that takes a single date/time object and decomposes it into Year, Month, Day or whatever makes sense. The user can then break apart and put back together as they see fit. The only tricky thing is what to do about the tz. We could add this as a column as well. Thoughts?
I like the idea. I also like the construct verb better than create. Could the deconstruction function just deal with character instead of date/time?...that way tz only comes in once the construction happens.
I think the question of how to go from a character to a date time is different to the question of how to decompose a datetime (which is useful for combining) versus how to construct a date time from year, month, day etc. The character to date time functions in lubridate are well developed so may handle this problem?
yes the lubridate functions are very good and no need to reinvent the wheel. If there is a character Date and Time column it probably just makes most sense to do
lubridate::ymd_hms(paste(Date, Time))
I think constructing from year, month, day, hour, minute, second is still useful though.
perhaps there could be an option in ps_separate_datetime/date
to add tz column, which might be useful if planning to deconstruct and reconstruct.
yes I think a tz argument, NULL by default to indicate that no tz column should be created when deconstructing, is the best solution. also the deconstruction function should include year = "Year" etc. to allow the user to name each of the new columns, or in the case of year = NULL to not create that column at all.
also it might be worth having the arguments suffix = "" and prefix = "" which as character scalars allow the user to quickly change all the columns from the default of Year, Month, Day, etc. to SiteYear, SiteMonth, SiteDay simply by setting prefix = "Site"
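Here is a rough base R sketch of how the deconstruction side could behave with those arguments. The name separate_datetime() and the exact argument list are assumptions for illustration, not the existing ps_separate_datetime implementation. A NULL component name skips that column, tz = NULL skips the time zone column, and prefix/suffix are pasted around each component name.

```r
# Hypothetical sketch: separate_datetime() is an illustrative name only.
separate_datetime <- function(data, col = "DateTime", year = "Year", month = "Month",
                              day = "Day", hour = "Hour", minute = "Minute",
                              second = "Second", tz = NULL,
                              prefix = "", suffix = "") {
  # as.POSIXlt() respects the tzone attribute of a POSIXct column.
  lt <- as.POSIXlt(data[[col]])
  values <- list(lt$year + 1900L, lt$mon + 1L, lt$mday, lt$hour, lt$min, lt$sec)
  new_names <- list(year, month, day, hour, minute, second)
  for (i in seq_along(values)) {
    if (is.null(new_names[[i]])) next  # NULL means: do not create this column
    # as.integer() truncates fractional seconds.
    data[[paste0(prefix, new_names[[i]], suffix)]] <- as.integer(values[[i]])
  }
  if (!is.null(tz)) {
    zone <- attr(data[[col]], "tzone")
    if (is.null(zone)) zone <- ""  # no explicit zone stored on the column
    data[[tz]] <- rep(zone[1], nrow(data))
  }
  data
}

df <- data.frame(DateTime = ISOdatetime(2018, 3, 14, 9, 30, 0, tz = "Etc/GMT+8"))
out <- separate_datetime(df, second = NULL, tz = "TimeZone", prefix = "Site")
names(out)
```

In this sketch the prefix is not applied to the tz column name; whether it should be is an open design choice.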
yes great ideas. Do you want me to try to take this one on? The function seems to be almost done.
yes please
what is the best way to allow both a column name or an integer?:
e.g. second = "Second"
or second = 0L
This?
second <- if (is.character(second)) data[[second]] else second
# repeat for year, month, day, hour, minute and paste
checkor(check_string(second), check_count(second, coerce = TRUE))
checks that second is a string (character vector of length 1) or a count (non-negative integer of length 1) or a numeric of length 1 that can be coerced to a count
check_colnames(data, second)
checks that data has a column with the same name as the value of second, and
check_vector(second, length = 1L, values = c(0L,59L,NA))
checks that second is an integer of length 1 with values between 0 and 59 and possibly missing values
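For anyone reading without the checking package to hand, the same logic can be sketched in plain base R. The helpers is_string(), is_count() and check_second() below are hypothetical names written for this comment, not functions from any package, and unlike the check_vector call above this version does not allow missing values.

```r
# Hypothetical helpers, written in base R for illustration only.
is_string <- function(x) is.character(x) && length(x) == 1L && !is.na(x)
is_count <- function(x) {
  is.numeric(x) && length(x) == 1L && !is.na(x) && x >= 0 && x == trunc(x)
}

check_second <- function(data, second) {
  if (is_string(second)) {
    # A string must name an existing column.
    if (!second %in% names(data)) stop("data has no column named '", second, "'")
  } else if (is_count(second)) {
    # A constant must be a valid number of seconds.
    if (second > 59) stop("second must be between 0 and 59")
  } else {
    stop("second must be a column name or a non-negative whole number")
  }
  invisible(second)
}

df <- data.frame(Second = c(0L, 15L))
ok_name <- check_second(df, "Second")  # passes: column exists
ok_const <- check_second(df, 0L)       # passes: valid constant
bad <- try(check_second(df, 75), silent = TRUE)  # fails: out of range
```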
OK that's great, but then what is the best way to take a combination of integers and column names and paste them into a format that lubridate::ymd_hms can read?
if second = 0L
for example create a new column in the data frame called say ..second (after checking that ..second is not already an existing column) and then set second <- "..second" and carry on as if the user passed second as a column name
note check_missing_colnames(data, "..second")
may be useful
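A small base R sketch of that temporary-column idea, under the assumption that the other components are already integer columns. The helper name resolve_component() is made up for this example. Once every component is a column name, the pieces can be formatted into a single "YYYY-MM-DD HH:MM:SS" string, which is a layout ymd_hms-style parsers accept.

```r
# Hypothetical helper: turn a constant into a temporary column so the rest of
# the code can treat every component as a column name.
resolve_component <- function(data, value, tmp) {
  if (is.character(value)) return(list(data = data, name = value))
  if (tmp %in% names(data)) stop("column '", tmp, "' already exists")
  data[[tmp]] <- rep(value, nrow(data))
  list(data = data, name = tmp)
}

df <- data.frame(Year = c(2018L, 2019L), Month = c(3L, 7L), Day = c(14L, 1L),
                 Hour = c(9L, 17L), Minute = c(30L, 5L))

res <- resolve_component(df, 0L, "..second")
df <- res$data
second <- res$name  # now "..second", used like any other column name

# Zero-pad each piece so the text is unambiguous for the parser.
text <- sprintf("%04d-%02d-%02d %02d:%02d:%02d",
                df$Year, df$Month, df$Day, df$Hour, df$Minute, df[[second]])

df[[second]] <- NULL  # drop the temporary column once it has been used
```

The resulting text vector can then be handed to the parser of choice along with the tz.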
opposite of
ps_separate_date/datetime
Takes a vector of column names to create POSIXct or Date, assign tz, convert tz, automatically remove columns used to create datetime/date something like: