Parse geographic coordinates
strange error on ubuntu-20.04 (devel) #45

Open rafapereirabr opened 1 year ago

rafapereirabr commented 1 year ago

Hi all. Thanks for developing the {parzer} package. It is super handy! I use it as a dependency in my {flightsbr} package. I've recently noticed a strange error involving {parzer} in my cmd checks on ubuntu-20.04 (devel). Please see the error message below, or the full log of the cmd checks here. I'm not sure what could be causing this, especially because the error does not occur on any other server and I cannot reproduce it on my local machine. I've included below a reproducible example that is very similar to the code throwing the error.

Error message:

> Running the tests in ‘tests/testthat.R’ failed.
> Last 13 lines of output:
>   Backtrace:
>       ▆
>    1. └─flightsbr::read_airports() at test_read_airports.R:12:4
>    2.   └─flightsbr:::latlon_to_numeric(dt_private)
>    3.     ├─df[, `:=`(latitude, parzer::parse_lat(latitude))]
>    4.     └─data.table:::`[.data.table`(df, , `:=`(latitude, parzer::parse_lat(latitude)))
>    5.       └─base::eval(jsub, SDenv, parent.frame())
>    6.         └─base::eval(jsub, SDenv, parent.frame())
>    7.           └─parzer::parse_lat(latitude)
>    8.             └─parzer:::scrub(lat)
>    9.               └─base::gsub("[^A-Za-z0-9\\.\\ ,'-]|d|g", "'", x)
>   [ FAIL 1 | WARN 14 | SKIP 0 | PASS 73 ]
>   Error: Test failures
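
The last frame of the backtrace is a plain `base::gsub()` call inside `parzer:::scrub()`, so one way to narrow this down might be to run that same call in isolation and look at how the input strings are encoded on the failing runner. The log excerpt above does not include the underlying error text, so the encoding angle is only an assumption. Below is a minimal sketch in base R; `check_scrub()` is a hypothetical helper written for illustration and is not part of {parzer}.

```r
# Pattern copied verbatim from frame 9 of the backtrace above.
scrub_pattern <- "[^A-Za-z0-9\\.\\ ,'-]|d|g"

# Hypothetical helper (not a {parzer} function): for each input string, report
# its declared encoding and UTF-8 validity, then attempt the same gsub() call
# and record the error message if the call fails.
check_scrub <- function(x) {
  err <- NA_character_
  scrubbed <- tryCatch(
    gsub(scrub_pattern, "'", x),
    error = function(e) {
      err <<- conditionMessage(e)      # keep the message for the report
      rep(NA_character_, length(x))    # placeholder result on failure
    }
  )
  data.frame(
    declared_encoding = Encoding(x),
    valid_utf8        = validUTF8(x),
    scrubbed          = scrubbed,
    error             = err,
    stringsAsFactors  = FALSE
  )
}

# A coordinate string like the one in the data, with an explicit UTF-8 degree sign.
x <- "8\u00b0 20' 55'' S"
out <- check_scrub(x)
print(out)
```

Running `check_scrub()` on the actual `latitude` column, together with `Sys.getlocale()`, on both the failing runner and a passing one would show whether the inputs or the locale differ between them.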

Reproducible example


library(data.table)

# download and read data
url <- ''

dt <- fread(url, skip = 1,
            encoding = 'UTF-8',
            colClasses = 'character')

# fix column names to lower case
pbl_names <- unlist(c(dt[1,]))
pbl_names <- iconv(pbl_names, from = 'utf8', to = 'utf8')
data.table::setnames(dt, tolower(pbl_names) )
dt <- dt[-1,]

# check column values to parse
#> "8° 20' 55'' S"

# convert to numeric
dt[, latitude := parzer::parse_lat(latitude) ]
dt[, longitude := parzer::parse_lon(longitude) ]

#> -8.348611
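
The expected value above can be checked by hand, since degrees, minutes and seconds convert to decimal degrees as deg + min/60 + sec/3600, with a negative sign for the southern and western hemispheres. A minimal sketch in base R, independent of {parzer}; `dms_to_decimal()` is a hypothetical helper used only for this check.

```r
# Hypothetical helper (not a {parzer} function): convert degrees, minutes and
# seconds plus a hemisphere letter into signed decimal degrees.
dms_to_decimal <- function(degrees, minutes, seconds, hemisphere) {
  sign <- if (hemisphere %in% c("S", "W")) -1 else 1
  sign * (degrees + minutes / 60 + seconds / 3600)
}

# The example value from the data: 8 degrees, 20 minutes, 55 seconds South.
lat <- dms_to_decimal(8, 20, 55, "S")
print(round(lat, 6))
```

Rounded to six decimal places this agrees with the `-8.348611` shown above.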

Session Info

```r
> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.14.6 parzer_0.4.1      flightsbr_0.2.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.9      withr_2.5.0     brio_1.1.3      R6_2.5.1
 [5] lifecycle_1.0.3 magrittr_2.0.3  httr_1.4.4      rlang_1.0.6
 [9] cli_3.4.1       curl_4.3.3      rstudioapi_0.14 testthat_3.1.5
[13] xml2_1.3.3      tools_4.1.1     glue_1.6.2      compiler_4.1.1
[17] rvest_1.0.3
```

mpadge commented 1 year ago

Thanks @rafapereirabr. Note that @AlbanSagouis has recently taken over maintenance of this package, and he and I are rewriting and improving a lot of the C++ code in #44. The problem is that the work is on a branch in his own GitHub profile. I think you could still replace the standard install on GitHub Actions with Alban's version via remotes: AlbanSagouis/parzer@cpp11-optimisation, but ... the branch is currently failing anyway, so that's clearly not going to help.

@AlbanSagouis maybe this is the kind of kick we need to get us moving on merging your long-standing PR? @rafapereirabr Can you bear with us for a bit while we expedite the re-write? Hopefully :smile:

rafapereirabr commented 1 year ago

Hi @mpadge, thanks for the reply! No worries, this is not an urgent matter on my side, but I'm glad to hear you've been cooking up this major improvement to the package (which is already great, tbh).