r-lib / rex

Friendly regular expressions for R.
https://rex.r-lib.org

access to rex functions and shortcuts not possible on R-devel #60

Closed FelixErnst closed 4 years ago

FelixErnst commented 4 years ago

Hi,

Somehow rex does not behave as expected on R-devel:

library(rex)

# does not work
capture(alpha,any_of(alnum,"."),alnum, name = "pkg") 
> Error in capture(alpha, any_of(alnum, "."), alnum, name = "pkg"): could not find function "capture"

# does work
rex:::capture(shortcuts$alpha,rex:::any_of(shortcuts$alnum,"."),shortcuts$alnum, name = "pkg") 
> (?<pkg>[[:alpha:]][[:alnum:].]*[[:alnum:]])

# info
sessionInfo()
> R Under development (unstable) (2020-01-28 r77738)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Debian GNU/Linux 10 (buster)
> 
> Matrix products: default
> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so
> 
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
> 
> other attached packages:
> [1] rex_1.1.2
> 
> loaded via a namespace (and not attached):
>  [1] compiler_4.0.0  magrittr_1.5    tools_4.0.0     htmltools_0.4.0
>  [5] yaml_2.2.1      Rcpp_1.0.3      stringi_1.4.5   rmarkdown_2.1  
>  [9] highr_0.8       knitr_1.27      stringr_1.4.0   xfun_0.12      
> [13] digest_0.6.23   rlang_0.4.4     evaluate_0.14

Apparently the functions are not exported, and the NAMESPACE file in this repo seems to confirm that. I don't know why this doesn't happen on R 3.6.2, but there seems to be new behavior on R-devel.

My money would be on a problem in the register function.

Felix

FelixErnst commented 4 years ago

@jimhester

~In addition, the re_matches function does not work anymore. The following snippet produces valid output on R 3.6.2, but not on R-devel.~

library(rvest) # makes read_html(), html_nodes() and html_text() available
dat <- read_html('http://bioconductor.org/checkResults/3.5/bioc-LATEST/')
rowspan <- length(html_text(html_nodes(dat,xpath='/html/body/table[@class="node_specs"]/tr[@class!=""]')))
pkgnames <- html_text(html_nodes(dat,xpath=sprintf('/html/body/table[@class="mainrep"]/tr/td[@rowspan="%s"]',rowspan)))
y <- re_matches(pkgnames,
                 rex(
                   start,
                   # matches .standard_regexps()$valid_package_name
                   rex:::capture(shortcuts$alpha,rex:::any_of(shortcuts$alnum,"."),shortcuts$alnum, name = "pkg"),
                   maybe(shortcuts$any_blanks),
                   # matches .standard_regexps()$valid_package_version
                   rex:::capture(rex:::between(rex:::group(shortcuts$digits,rex:::character_class(".-")),1,""),shortcuts$digits, name = "version"),
                   maybe(shortcuts$any_blanks),
                   rex:::capture(shortcuts$anything,name='author'),
                   "Last",shortcuts$anything,"Commit:",
                   rex:::capture(shortcuts$anything,name="commit"),
                   "Last",shortcuts$anything,'Changed',shortcuts$anything,"Date:",shortcuts$any_non_alnums,
                   rex:::capture(rex:::any_of(list(shortcuts$digit,'-',shortcuts$blank,':')),name='last_changed_date')
                 ))

~On R-devel all values are NA.~

Edit: the last bit is a problem with the function regexpr, so disregard it.

Edit 2: I reported the differing regexpr behavior to bug-report-request@r-project.org using the following code snippet:

dat <- read_html('http://bioconductor.org/checkResults/3.5/bioc-LATEST/')
pkgnames <- html_text(html_nodes(dat,xpath='/html/body/table[@class="mainrep"]/tr/td[@rowspan="3"]'))
pattern <- "^(?<pkg>[[:alpha:]][[:alnum:].]*[[:alnum:]])(?:[[:blank:]]*)?(?<version>(?:(?:[[:digit:]]+[.-])){1,}[[:digit:]]+)(?:[[:blank:]]*)?(?<author>.*)Last.*Commit:(?<commit>.*)Last.*Changed.*Date:[^[:alnum:]]*(?<last_changed_date>[[:digit:]\\-[:blank:]:]*)"
regexpr(pattern = pattern, text = pkgnames, perl = TRUE)

On R-devel (r77738), regexpr does not return the correct match positions.
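For anyone who wants to check the positions without scraping the Bioconductor page, here is a minimal offline sketch. It assumes a hypothetical sample string (not taken from the real page) and a shortened version of the pattern above. It only shows how the named-capture positions that regexpr attaches as attributes can be read back out.

```r
# Hypothetical sample input standing in for one scraped table cell.
text <- "Biobase 2.36.2"

# Shortened version of the pattern above: package name, blanks, version.
pattern <- paste0(
  "^(?<pkg>[[:alpha:]][[:alnum:].]*[[:alnum:]])",
  "[[:blank:]]*",
  "(?<version>(?:[[:digit:]]+[.-])+[[:digit:]]+)"
)

m <- regexpr(pattern, text, perl = TRUE)

# With perl = TRUE, regexpr stores the capture positions as attributes.
starts  <- as.vector(attr(m, "capture.start"))
lengths <- as.vector(attr(m, "capture.length"))

# Cut the captured substrings out of the input and label them.
captures <- substring(text, starts, starts + lengths - 1)
names(captures) <- attr(m, "capture.names")
print(captures)
```

If the reported positions are wrong, the substrings in `captures` will not line up with the package name and version in `text`.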

jimhester commented 4 years ago

They are not supposed to be exported; they are meant to be used within a call to rex(). You can use rex::rex_mode() to temporarily put the shortcuts on the search path, which makes autocompletion easier when writing rex expressions.
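A minimal sketch of that usage, assuming the rex package is installed: when the expression from the original report is wrapped in rex(), the shortcuts and helper functions are looked up by rex itself, so neither `rex:::` nor `shortcuts$` is needed. The sample input "Biobase" is only an illustration.

```r
library(rex)

# capture(), any_of(), alpha and alnum are resolved inside rex(),
# even though they are not exported from the package namespace.
pkg_pattern <- rex(
  capture(alpha, any_of(alnum, "."), alnum, name = "pkg")
)
print(pkg_pattern)

# re_matches() returns a data frame with one column per named capture.
matches <- re_matches("Biobase", pkg_pattern)
print(matches$pkg)
```

Calling capture() at the top level, outside rex(), fails as described in the report because the function is not on the search path.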

FelixErnst commented 4 years ago

Thanks for the clarification. I couldn't find anything about this in the man pages. Did I miss it?

jimhester commented 4 years ago

It is documented at https://github.com/kevinushey/rex#rex-mode

FelixErnst commented 4 years ago

Thanks. I totally missed this. I was looking for a hint in the man pages.