hugomflavio / actel

Standardised analysis of acoustic telemetry data from fish moving through receiver arrays
https://hugomflavio.github.io/actel-website

TZ=UTC causes actel tests to fail with data.table 1.13.0 #46

Closed mattdowle closed 4 years ago

mattdowle commented 4 years ago

Hi Hugo,

data.table v1.13.0 has just been published on CRAN. actel is impacted only when TZ=UTC, e.g. "TZ=UTC R CMD check actel_1.0.0.tar.gz". actel is still passing on CRAN, and actel didn't come up in CRAN's pre-release revdep testing. I see some skip_on_cran() calls in your tests, but the impacted test isn't one of the skipped ones. I think it's because none of the CRAN check machines sets TZ=UTC. I set TZ=UTC in my local revdep testing, which is how I detected it.

It's due to the potentially breaking change in this release, that fread now reads UTC-datetime as POSIXct directly. The datetime in your biometrics.csv, column 1, is unmarked though (i.e. no Z or UTC offset present), so it is still read as character for backwards compatibility. See note 1 in NEWS here: https://github.com/Rdatatable/data.table/blob/master/NEWS.md

However, when the local time zone is set to UTC by the user, fread reads the first column of biometrics.csv as POSIXct in UTC, and the difference is then caught later because your tests are very good.

I'm not sure it's necessary to make any change to actel, since it's passing on CRAN. But if you do want to change it to survive TZ=UTC, adding colClasses=c("Release.date"="character") would be the smallest change to loadBio so that the code afterwards in loadBio that works on Release.date as character, continues to work as before. Alternatively, you could use UTC in the csv files rather than local time, or you can pass tz="UTC" (a new argument to fread in this release) to always read unmarked datetime as UTC.
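For illustration, here is a minimal sketch of the first and third options, assuming data.table >= 1.13.0 is installed. The temporary file is a hypothetical stand-in for biometrics.csv, and the Serial.nr column and its values are made up; only the Release.date column name comes from the discussion above.

```r
library(data.table)

# Hypothetical stand-in for biometrics.csv: the datetime is unmarked
# (no trailing Z and no UTC offset).
csv <- tempfile(fileext = ".csv")
writeLines(c("Release.date,Serial.nr",
             "2018-02-01 10:05:00,12345"), csv)

# Option 1: pin the column to character, so code that treats
# Release.date as a string behaves the same whatever TZ is set to.
bio_chr <- fread(csv, colClasses = c("Release.date" = "character"))
class(bio_chr$Release.date)

# Option 3: ask fread to always read unmarked datetimes as UTC
# (the tz argument is new in data.table 1.13.0).
bio_utc <- fread(csv, tz = "UTC")
class(bio_utc$Release.date)
```

Option 1 leaves the rest of the loading code untouched, which is why it is the smallest change; option 3 changes the column type, so any later string handling would need to be revisited.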

Sorry for the extra work this change causes you.

Best, Matt

TZ=UTC R CMD check actel_1.0.0.tar.gz

./actel.Rcheck/00check.log:* using log directory ‘/home/mdowle/build/revdeplib/actel.Rcheck’
./actel.Rcheck/00check.log:* using R Under development (unstable) (2020-07-14 r78854)
./actel.Rcheck/00check.log:* using platform: x86_64-pc-linux-gnu (64-bit)
./actel.Rcheck/00check.log:* using session charset: UTF-8
./actel.Rcheck/00check.log:* checking for file ‘actel/DESCRIPTION’ ... OK
./actel.Rcheck/00check.log:* this is package ‘actel’ version ‘1.0.0’
./actel.Rcheck/00check.log:* package encoding: UTF-8
./actel.Rcheck/00check.log:* checking package namespace information ... OK
./actel.Rcheck/00check.log:* checking package dependencies ... OK
./actel.Rcheck/00check.log:* checking if this is a source package ... OK
./actel.Rcheck/00check.log:* checking if there is a namespace ... OK
./actel.Rcheck/00check.log:* checking for executable files ... OK
./actel.Rcheck/00check.log:* checking for hidden files and directories ... OK
./actel.Rcheck/00check.log:* checking for portable file names ... OK
./actel.Rcheck/00check.log:* checking for sufficient/correct file permissions ... OK
./actel.Rcheck/00check.log:* checking whether package ‘actel’ can be installed ... OK
./actel.Rcheck/00check.log:* checking installed package size ... OK
./actel.Rcheck/00check.log:* checking package directory ... OK
./actel.Rcheck/00check.log:* checking ‘build’ directory ... OK
./actel.Rcheck/00check.log:* checking DESCRIPTION meta-information ... OK
./actel.Rcheck/00check.log:* checking top-level files ... OK
./actel.Rcheck/00check.log:* checking for left-over files ... OK
./actel.Rcheck/00check.log:* checking index information ... OK
./actel.Rcheck/00check.log:* checking package subdirectories ... OK
./actel.Rcheck/00check.log:* checking R files for non-ASCII characters ... OK
./actel.Rcheck/00check.log:* checking R files for syntax errors ... OK
./actel.Rcheck/00check.log:* checking whether the package can be loaded ... OK
./actel.Rcheck/00check.log:* checking whether the package can be loaded with stated dependencies ... OK
./actel.Rcheck/00check.log:* checking whether the package can be unloaded cleanly ... OK
./actel.Rcheck/00check.log:* checking whether the namespace can be loaded with stated dependencies ... OK
./actel.Rcheck/00check.log:* checking whether the namespace can be unloaded cleanly ... OK
./actel.Rcheck/00check.log:* checking loading without being on the library search path ... OK
./actel.Rcheck/00check.log:* checking dependencies in R code ... NOTE
./actel.Rcheck/00check.log:Namespaces in Imports field not imported from:
./actel.Rcheck/00check.log:  ‘fs’ ‘svglite’
./actel.Rcheck/00check.log:  All declared Imports should be used.
./actel.Rcheck/00check.log:* checking S3 generic/method consistency ... OK
./actel.Rcheck/00check.log:* checking replacement functions ... OK
./actel.Rcheck/00check.log:* checking foreign function calls ... OK
./actel.Rcheck/00check.log:* checking R code for possible problems ... OK
./actel.Rcheck/00check.log:* checking Rd files ... OK
./actel.Rcheck/00check.log:* checking Rd metadata ... OK
./actel.Rcheck/00check.log:* checking Rd cross-references ... OK
./actel.Rcheck/00check.log:* checking for missing documentation entries ... OK
./actel.Rcheck/00check.log:* checking for code/documentation mismatches ... OK
./actel.Rcheck/00check.log:* checking Rd \usage sections ... OK
./actel.Rcheck/00check.log:* checking Rd contents ... OK
./actel.Rcheck/00check.log:* checking for unstated dependencies in examples ... OK
./actel.Rcheck/00check.log:* checking contents of ‘data’ directory ... OK
./actel.Rcheck/00check.log:* checking data for non-ASCII characters ... OK
./actel.Rcheck/00check.log:* checking data for ASCII and uncompressed saves ... OK
./actel.Rcheck/00check.log:* checking R/sysdata.rda ... OK
./actel.Rcheck/00check.log:* checking installed files from ‘inst/doc’ ... OK
./actel.Rcheck/00check.log:* checking files in ‘vignettes’ ... OK
./actel.Rcheck/00check.log:* checking examples ... OK
./actel.Rcheck/00check.log:* checking for unstated dependencies in ‘tests’ ... OK
./actel.Rcheck/00check.log:* checking tests ... ERROR
./actel.Rcheck/00check.log:  Running ‘testthat.R’
./actel.Rcheck/00check.log:Running the tests in ‘tests/testthat.R’ failed.
./actel.Rcheck/00check.log:Last 13 lines of output:
./actel.Rcheck/00check.log:  ── 1. Failure: migration results contains all the expected elements. (@test_migr
./actel.Rcheck/00check.log:  output$rsp.info$bio[, 1:ncol(example.biometrics)] not equal to `example.biometrics`.
./actel.Rcheck/00check.log:  Component "Release.date": Mean absolute difference: 7200
./actel.Rcheck/00check.log:  
./actel.Rcheck/00check.log:  ── 2. Failure: residency results contains all the expected elements. (@test_resi
./actel.Rcheck/00check.log:  output$rsp.info$bio[, 1:ncol(example.biometrics)] not equal to `example.biometrics`.
./actel.Rcheck/00check.log:  Component "Release.date": Mean absolute difference: 7200
./actel.Rcheck/00check.log:  
./actel.Rcheck/00check.log:  ══ testthat results  ═══════════════════════════════════════════════════════════
./actel.Rcheck/00check.log:  [ OK: 20 | SKIPPED: 37 | WARNINGS: 3 | FAILED: 2 ]
./actel.Rcheck/00check.log:  1. Failure: migration results contains all the expected elements. (@test_migration.R#204) 
./actel.Rcheck/00check.log:  2. Failure: residency results contains all the expected elements. (@test_residency.R#189) 
./actel.Rcheck/00check.log:  
./actel.Rcheck/00check.log:  Error: testthat unit tests failed
./actel.Rcheck/00check.log:  Execution halted
./actel.Rcheck/00check.log:* checking for unstated dependencies in vignettes ... OK
./actel.Rcheck/00check.log:* checking package vignettes in ‘inst/doc’ ... OK
./actel.Rcheck/00check.log:* checking running R code from vignettes ... NONE
./actel.Rcheck/00check.log:  ‘a-0_workspace_requirements.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘a-1_study_area.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘a-2_distances_matrix.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘b-0_explore.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘b-1_explore_processes.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘b-2_explore_results.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘c-0_migration.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘c-1_migration_processes.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘c-2_migration_results.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘c-3_migration_efficiency.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘d-0_residency.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘d-1_residency_processes.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘d-2_residency_results.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘d-3_residency_efficiency.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘e-0_manual_mode.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘f-0_post_functions.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:  ‘g-0_messages.Rmd’ using ‘UTF-8’... OK
./actel.Rcheck/00check.log:* checking re-building of vignette outputs ... OK
./actel.Rcheck/00check.log:* checking PDF version of manual ... OK
./actel.Rcheck/00check.log:* DONE
./actel.Rcheck/00check.log:Status: 1 ERROR, 1 NOTE
hugomflavio commented 4 years ago

Thanks for the heads-up Matt.

I had already picked up that something was wrong (tests started failing on my latest commits), but I had not yet managed to pinpoint the issue, so this saves me a lot of work.

I will update actel's functions to account for the new data.table functionalities and prepare a new version for CRAN.

hugomflavio commented 4 years ago

The new version is now passing the GitHub and Travis tests, so I assume the issue is solved.

mattdowle commented 4 years ago

Great, thanks Hugo! If it passes both R CMD check actel_*.tar.gz and TZ=UTC R CMD check actel_*.tar.gz then yes it's solved. Alternatively, and probably better and easier (and what we do in data.table tests) is you can also use Sys.setenv(TZ="UTC") in your tests and fread will read that latest value set in the R session. This avoids needing to run R CMD check twice, with and without TZ=UTC. Use Sys.unsetenv("TZ") in your tests to go back to default local time zone rather than Sys.setenv(TZ="") because the latter works differently on Windows and Linux.
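A rough sketch of that pattern, written with plain base R checks rather than any particular test framework. The temporary file and its contents are hypothetical; the point is only to show TZ being set, the read being repeated, and TZ being restored with Sys.unsetenv() when it was not set to begin with.

```r
library(data.table)

# Hypothetical input with an unmarked datetime column.
csv <- tempfile(fileext = ".csv")
writeLines(c("Release.date,Serial.nr",
             "2018-02-01 10:05:00,12345"), csv)

# Remember whether TZ was set before the test (NA means it was unset).
old_tz <- Sys.getenv("TZ", unset = NA)

# Read once with the session time zone forced to UTC ...
Sys.setenv(TZ = "UTC")
under_utc <- fread(csv, colClasses = c("Release.date" = "character"))

# ... then restore the previous state. Sys.unsetenv("TZ") is used for the
# "was unset" case instead of Sys.setenv(TZ = "").
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
under_local <- fread(csv, colClasses = c("Release.date" = "character"))

# With the column pinned to character, both reads should agree.
identical(under_utc$Release.date, under_local$Release.date)
```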

We'd like to change the default for fread's tz= from "" to "UTC", because the current situation of a different type (POSIXct vs character) depending on the TZ environment variable is uncomfortable. We decided not to do that yet because it would break expectations compared to base R in this case of unmarked datetime. But in terms of revdep breakage, out of 874 packages using data.table directly only two would fail if we changed the default: actel and spatsoc. It's surprising it's so few, but it does make the prospect of a default change easier. And it looks like you've already made the necessary change in dev anyway.

Would you be ok if fread read unmarked datetime as UTC-POSIXct by default, instead of character as 1.13.0 is doing currently?

hugomflavio commented 4 years ago

Hi Matt!

Thanks for the follow up, I will make sure to include that Sys.setenv line in the tests to make sure everything is working properly.

I agree with you that having a consistent method (i.e. always reading timestamps as POSIXct) would make more sense and I am quite happy to see that new tz argument in fread. actel's functions have a mandatory tz argument where the user states the timezone of the input data, so it would be very easy for me to plug that into fread's tz argument. The only reason why I didn't do it right away is because I am not sure what would happen if someone updates actel but does not update data.table (in which case fread would not be expecting a tz argument)?

To be on the safe side for now, I will be relying on the old method (i.e. read the stamps as character strings and convert them afterwards), mostly because the loadBio and loadDeployments functions are expected to be dealing with small sized tables (so computing time is not really a problem). The main reason why I switched from read.csv to fread here was because fread can handle csv files that have been "damaged" by Excel (thanks for that!).
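The "old method" described here could look roughly like the sketch below. load_stamps is a hypothetical helper written for this illustration and is not actel's actual loadBio; the file contents and the "Europe/Copenhagen" time zone are likewise assumptions.

```r
library(data.table)

# Hypothetical loader: read the stamps as character strings, then convert
# them using the time zone stated by the user.
load_stamps <- function(file, tz) {
  x <- fread(file, colClasses = c("Release.date" = "character"))
  # Replace the character column with POSIXct in the requested time zone.
  set(x, j = "Release.date", value = as.POSIXct(x$Release.date, tz = tz))
  x
}

csv <- tempfile(fileext = ".csv")
writeLines(c("Release.date,Serial.nr",
             "2018-02-01 10:05:00,12345"), csv)

bio <- load_stamps(csv, tz = "Europe/Copenhagen")
attr(bio$Release.date, "tzone")
```

Because the column is always read as character, this does not depend on the TZ environment variable or on which data.table version is installed.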

There's one other instance where fread has to deal with timestamps in my package (function compileDetections), but here the input files are expected to be in UTC, which fits perfectly with the new version of data.table. It would even save me the trouble of using fastPOSIXct in my processXXXfile functions (e.g. here and here).

All in all, I would have no problem with fread reading datetime data as UTC by default, because I can take advantage of the new behaviour to suit my needs :)

Thanks a lot for coding the data.table package!

mattdowle commented 4 years ago

Many thanks again Hugo! Great. Interesting to hear the background and the related info too.

> I am not sure what would happen if someone updates actel but does not update data.table (in which case fread would not be expecting a tz argument)?

In DESCRIPTION you can put the version in the import line: "data.table (>= 1.13.0)" and that way the user will be prompted to upgrade data.table too.
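If a runtime safeguard were wanted in addition to the DESCRIPTION constraint, one possibility is to check the installed version before passing tz. This is only a sketch: read_utc_stamps is a hypothetical helper, not part of actel or data.table, and the input file is made up.

```r
library(data.table)

# Hypothetical helper: pass tz = "UTC" only when the installed data.table
# is recent enough to know about that argument.
read_utc_stamps <- function(file) {
  if (utils::packageVersion("data.table") >= "1.13.0") {
    fread(file, tz = "UTC")
  } else {
    # Older fread has no tz argument and returns datetimes as character,
    # so the caller would still need to convert them afterwards.
    fread(file)
  }
}

csv <- tempfile(fileext = ".csv")
writeLines(c("Release.date,Serial.nr",
             "2018-02-01 10:05:00,12345"), csv)

stamps <- read_utc_stamps(csv)
class(stamps$Release.date)
```

The version requirement in DESCRIPTION remains the simpler route, since it removes the need for the fallback branch altogether.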