r-rudra / tidycells

Automatic transformation of untidy spreadsheet-like data into tidy form
https://r-rudra.github.io/tidycells/
Other

CRAN Issues Fix. CRAN archived. #39

Open bedantaguru opened 4 years ago

bedantaguru commented 4 years ago

The package got archived on CRAN. The development cycle needs to be fixed urgently.

bedantaguru commented 4 years ago

Make an intermediate silent release. Target: May 14, noon. Most things can be kept as they are.

bedantaguru commented 4 years ago

I think at this point the only option is to perform a quick fix in the main repo.

A fork has been taken here for future reference.

Need to fix things in the main repo and release a patch.

bedantaguru commented 4 years ago

I think the last error was coming from https://github.com/r-rudra/tidycells/blob/3ddcdf6e9f664ffc46f432c57f0eeb4cc9bb13a3/tests/testthat/test-optional_package.R#L22

bedantaguru commented 4 years ago

~Just checked it is mostly happening for dplyr 1.0.0~ https://github.com/tidyverse/dplyr/issues/5211

As suggested by Hadley

mydev::beta_lib_loc()
library(dplyr, warn.conflicts = F)
packageVersion("dplyr")
#> [1] '0.8.99.9002'
packageVersion("vctrs")
#> [1] '0.3.0'

dat <- iris %>% head()

# class(dat) <- c(class(dat),"test")
class(dat) <- c("test", class(dat))

iris %>% head() %>% select(Species, Sepal.Length)
#>   Species Sepal.Length
#> 1  setosa          5.1
#> 2  setosa          4.9
#> 3  setosa          4.7
#> 4  setosa          4.6
#> 5  setosa          5.0
#> 6  setosa          5.4
dat %>% select(Species, Sepal.Length)
#>   Species Sepal.Length
#> 1  setosa          5.1
#> 2  setosa          4.9
#> 3  setosa          4.7
#> 4  setosa          4.6
#> 5  setosa          5.0
#> 6  setosa          5.4

iris %>% vctrs::vec_assert()
dat %>% vctrs::vec_assert()

Created on 2020-05-11 by the reprex package (v0.3.0)

bedantaguru commented 4 years ago

https://github.com/nacnudus/unpivotr/issues/35

My bad: all the mails got filtered into the junk folder, god knows how.

bedantaguru commented 4 years ago

as_cell_df is quite slow and hops a lot through S3 method dispatch. Need to fix it.

See https://github.com/r-rudra/tidycells/issues/40

bedantaguru commented 4 years ago
RHub Console Output 1

```console
* using log directory 'C:/Users/USERCBPbIvzeHV/tidycells.Rcheck'
* using R Under development (unstable) (2020-04-22 r78281)
* using platform: x86_64-w64-mingw32 (64-bit)
* using session charset: ISO8859-1
* using option '--as-cran'
* checking for file 'tidycells/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'tidycells' version '0.2.2.9000'
* package encoding: UTF-8
* checking CRAN incoming feasibility ... NOTE
Maintainer: 'Indranil Gayen '

New submission

Package was archived on CRAN

Version contains large components (0.2.2.9000)

Found the following (possibly) invalid URLs:
  URL: https://cran.r-project.org/web/checks/check_results_tidycells.html
    From: README.md
    Status: 404
    Message: Not Found
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking serialization versions ... OK
* checking whether package 'tidycells' can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking for future file timestamps ... OK
* checking 'build' directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking use of S3 registration ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking installed files from 'inst/doc' ... OK
* checking files in 'vignettes' ... OK
* checking examples ... ERROR
Running examples in 'tidycells-Ex.R' failed
The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: as_cell_df
> ### Title: Transform data into Cell-DF Structure
> ### Aliases: as_cell_df
>
> ### ** Examples
>
>
> as_cell_df(iris)
A Cell Data Frame
= To see cell stats, call summary()
= To see cell structure, call plot()
= Content:
Error in loadNamespace(name) : there is no package called 'utf8'
Calls: ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
Execution halted
* checking for unstated dependencies in 'tests' ... OK
* checking tests ... ERROR
  Running 'testthat.R' [123s]
Running the tests in 'tests/testthat.R' failed.
Last 13 lines of output:
   43. base::asNamespace(ns)
   44. base::getNamespace(ns)
   45. base::loadNamespace(name)
   46. base::withRestarts(stop(cond), retry_loadNamespace = function() NULL)
   47. base:::withOneRestart(expr, restarts[[1L]])
   48. base:::doWithOneRestart(return(expr), restart)

  == testthat results ===========================================================
  [ OK: 151 | SKIPPED: 4 | WARNINGS: 0 | FAILED: 3 ]
  1. Error: numeric_values_classifier works (@test-VA_classifier.R#19)
  2. Error: read_cells for NULL works (@test-read_cells.R#4)
  3. Error: read_cells for external packages works (@test-read_cells.R#22)

  Error: testthat unit tests failed
  Execution halted
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in 'inst/doc' ... OK
* checking re-building of vignette outputs ... WARNING
Error(s) in re-building vignettes:
  ...
--- re-building 'tidycells-intro.Rmd' using rmarkdown
Quitting from lines 298-314 (tidycells-intro.Rmd)
Error: processing vignette 'tidycells-intro.Rmd' failed with diagnostics:
there is no package called 'utf8'
--- failed re-building 'tidycells-intro.Rmd'

SUMMARY: processing the following file failed:
  'tidycells-intro.Rmd'

Error: Vignette re-building failed.
Execution halted

* checking PDF version of manual ... OK
* checking for non-standard things in the check directory ... OK
* checking for detritus in the temp directory ... OK
* DONE
Status: 2 ERRORs, 1 WARNING, 1 NOTE
```
RHub Console Output 2

```console
R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(testthat)
> library(tidycells)
>
>
> test_check("tidycells")
── 1. Failure: read_cells: csv works (@test-read_cells.R#75) ──────────────────
`dc0` not equal to `expct_d2`.
2/20 mismatches
x[1]: "12"
y[1]: "1.5"

x[2]: "1.5"
y[2]: "12"

Support present for following type of files: csv, xls, xlsx, docx, pdf, html

Note:
☰ LibreOffice may be required for doc files
☰ Support is enabled for content type (means it will work even if the extension is wrong)
☰ Support not present for following type of files: doc

Details:
┌──────────────────────────────────────────────┐
│                                              │
│      type        package    present support  │
│   1  csv{utils}  utils      ✔       ✔        │
│   2  csv         readr      ✔       ✔        │
│   3  xls{readxl} readxl     ✔       ✔        │
│   4  xls         xlsx       ✔       ✔        │
│   5  xlsx        tidyxl     ✔       ✔        │
│   6  doc         docxtractr ✔       ✖        │
│   7  docx        docxtractr ✔       ✔        │
│   8  pdf         tabulizer  ✔       ✔        │
│   9  html        XML        ✔       ✔        │
│                                              │
└──────────────────────────────────────────────┘
Support present for following type of files: csv, xls, xlsx, docx, pdf, html

Note:
☰ LibreOffice may be required for doc files
☰ Support is enabled for content type (means it will work even if the extension is wrong)
☰ Support not present for following type of files: doc

Details:
┌──────────────────────────────────────────────┐
│                                              │
│      type        package    present support  │
│   1  csv{utils}  utils      ✔       ✔        │
│   2  csv         readr      ✔       ✔        │
│   3  xls{readxl} readxl     ✔       ✔        │
│   4  xls         xlsx       ✔       ✔        │
│   5  xlsx        tidyxl     ✔       ✔        │
│   6  doc         docxtractr ✔       ✖        │
│   7  docx        docxtractr ✔       ✔        │
│   8  pdf         tabulizer  ✔       ✔        │
│   9  html        XML        ✔       ✔        │
│                                              │
└──────────────────────────────────────────────┘
══ testthat results ═══════════════════════════════════════════════════════════
[ OK: 182 | SKIPPED: 2 | WARNINGS: 0 | FAILED: 1 ]
1. Failure: read_cells: csv works (@test-read_cells.R#75)

Error: testthat unit tests failed
Execution halted
```
RHub Console Output 3

```console
R Under development (unstable) (2020-05-07 r78381) -- "Unsuffered Consequences"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(testthat)
> library(tidycells)
>
>
> test_check("tidycells")
── 1. Error: etc works ────────────────────────────────────────────────────────
Java Exception .jcall(cell, "Lorg/apache/poi/ss/usermodel/RichTextString;", "getRichStringCellValue")new("jobjRef", jobj = , jclass = "java/lang/Throwable")
Backtrace:
  1. tidycells:::read_xls_from_xlsx(dex)
  6. tidycells:::read_xls_for_tidycells(fn)
 11. purrr::map(., for_a_sheet)
 16. tidycells:::.f(.x[[i]], ...)
 13. purrr::map(., xlsx::getCellValue)
 26. xlsx:::.f(.x[[i]], ...)
 28. rJava::.jcall(...)
 29. rJava::.jcheck(silent = FALSE)

── 2. Failure: optional_package dependency test (@test-optional_package.R#25) ─
`d1` not equal to `d2`.
target is NULL, current is tbl_df

Support present for following type of files: csv, xls, xlsx, docx, html

Note:
☰ LibreOffice may be required for doc files
☰ Support is enabled for content type (means it will work even if the extension is wrong)
☰ Support not present for following type of files: doc, pdf

Note:
☰ These packages are required: tabulizer

Details:
┌──────────────────────────────────────────────┐
│                                              │
│      type        package    present support  │
│   1  csv{utils}  utils      ✔       ✔        │
│   2  csv         readr      ✔       ✔        │
│   3  xls{readxl} readxl     ✔       ✔        │
│   4  xls         xlsx       ✔       ✔        │
│   5  xlsx        tidyxl     ✔       ✔        │
│   6  doc         docxtractr ✔       ✖        │
│   7  docx        docxtractr ✔       ✔        │
│   8  pdf         tabulizer  ✖       ✖        │
│   9  html        XML        ✔       ✔        │
│                                              │
└──────────────────────────────────────────────┘
Error in .jcall("java/lang/Class", "Ljava/lang/Class;", "forName", cl, :
  RcallMethod: cannot determine object class
══ testthat results ═══════════════════════════════════════════════════════════
[ OK: 171 | SKIPPED: 3 | WARNINGS: 0 | FAILED: 2 ]
1. Error: etc works
2. Failure: optional_package dependency test (@test-optional_package.R#25)

Error: testthat unit tests failed
Execution halted
```
RHub Console Output 4

```console
R Under development (unstable) (2020-05-07 r78381) -- "Unsuffered Consequences"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(testthat)
> library(tidycells)
>
>
> test_check("tidycells")
── 1. Error: etc works ────────────────────────────────────────────────────────
Java Exception .jcall(cell, "Lorg/apache/poi/ss/usermodel/RichTextString;", "getRichStringCellValue")new("jobjRef", jobj = , jclass = "java/lang/Throwable")
Backtrace:
  1. tidycells:::read_xls_from_xlsx(dex)
  6. tidycells:::read_xls_for_tidycells(fn)
 11. purrr::map(., for_a_sheet)
 16. tidycells:::.f(.x[[i]], ...)
 13. purrr::map(., xlsx::getCellValue)
 26. xlsx:::.f(.x[[i]], ...)
 28. rJava::.jcall(...)
 29. rJava::.jcheck(silent = FALSE)

── 2. Failure: optional_package dependency test (@test-optional_package.R#25) ─
`d1` not equal to `d2`.
target is NULL, current is tbl_df

Support present for following type of files: csv, xls, xlsx, docx, html

Note:
☰ LibreOffice may be required for doc files
☰ Support is enabled for content type (means it will work even if the extension is wrong)
☰ Support not present for following type of files: doc, pdf

Note:
☰ These packages are required: tabulizer

Details:
┌──────────────────────────────────────────────┐
│                                              │
│      type        package    present support  │
│   1  csv{utils}  utils      ✔       ✔        │
│   2  csv         readr      ✔       ✔        │
│   3  xls{readxl} readxl     ✔       ✔        │
│   4  xls         xlsx       ✔       ✔        │
│   5  xlsx        tidyxl     ✔       ✔        │
│   6  doc         docxtractr ✔       ✖        │
│   7  docx        docxtractr ✔       ✔        │
│   8  pdf         tabulizer  ✖       ✖        │
│   9  html        XML        ✔       ✔        │
│                                              │
└──────────────────────────────────────────────┘
Error in .jcall("java/lang/Class", "Ljava/lang/Class;", "forName", cl, :
  RcallMethod: cannot determine object class
══ testthat results ═══════════════════════════════════════════════════════════
[ OK: 171 | SKIPPED: 3 | WARNINGS: 0 | FAILED: 2 ]
1. Error: etc works
2. Failure: optional_package dependency test (@test-optional_package.R#25)

Error: testthat unit tests failed
Execution halted
```
bedantaguru commented 4 years ago
AppVeyor Console Output

```console
packages 'utf8', 'data.table', 'pingr' are available as source packages but not as binaries
Error: (converted from warning) packages 'utf8', 'data.table', 'pingr' are not available (as a binary package for R Under development)
Execution halted
Command exited with code 1
7z a failure.zip *.Rcheck\*
```

This is a CI issue that I think can be fixed (see this).

bedantaguru commented 4 years ago
Travis Console Output

```console
* installing *source* package ‘pkgbuild’ ...
** package ‘pkgbuild’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** byte-compile and prepare package for lazy loading
Error: .onLoad failed in loadNamespace() for 'processx', details:
  call: loadNamespace(name)
  error: there is no package called ‘ps’
Execution halted
ERROR: lazy loading failed for package ‘pkgbuild’
* removing ‘/Users/travis/R/Library/pkgbuild’
Error in i.p(...) :
  (converted from warning) installation of package ‘pkgbuild’ had non-zero exit status
Calls: ... with_rprofile_user -> with_envvar -> force -> force -> i.p
Execution halted
The command "Rscript -e 'deps <- remotes::dev_package_deps(dependencies = NA);remotes::install_deps(dependencies = TRUE);if (!all(deps$package %in% installed.packages())) { message("missing: ", paste(setdiff(deps$package, installed.packages()), collapse=", ")); q(status = 1, save = "no")}'" failed and exited with 1 during .
```

This is a CI issue that I think can be fixed (see this).

bedantaguru commented 4 years ago
RHub Console Output (vignette) 1

```console
--- re-building ‘tidycells-intro.Rmd’ using rmarkdown
Quitting from lines 446-457 (tidycells-intro.Rmd)
Error: processing vignette 'tidycells-intro.Rmd' failed with diagnostics:
no applicable method for 'select_' applied to an object of class "NULL"
--- failed re-building ‘tidycells-intro.Rmd’

SUMMARY: processing the following file failed:
  ‘tidycells-intro.Rmd’

Error: Vignette re-building failed.
Execution halted
```


RHub Console Output (vignette) 2

```console
--- re-building 'tidycells-intro.Rmd' using rmarkdown
Quitting from lines 298-314 (tidycells-intro.Rmd)
Error: processing vignette 'tidycells-intro.Rmd' failed with diagnostics:
there is no package called 'utf8'
--- failed re-building 'tidycells-intro.Rmd'

SUMMARY: processing the following file failed:
  'tidycells-intro.Rmd'

Error: Vignette re-building failed.
Execution halted
```


bedantaguru commented 4 years ago

All winbuilder builds are successful

bedantaguru commented 4 years ago

Issue 'utf8'

https://github.com/r-rudra/tidycells/blob/429b9c9c500a0ad924094a84d3d7292b5251014c/R/as_cell_df.R#L25

https://github.com/r-rudra/tidycells/blob/bcecc73625c2208eaaef9fa6a7d0511ea4ed6ab2/tests/testthat/test-VA_classifier.R#L19

https://github.com/r-rudra/tidycells/blob/bcecc73625c2208eaaef9fa6a7d0511ea4ed6ab2/tests/testthat/test-read_cells.R#L4

https://github.com/r-rudra/tidycells/blob/bcecc73625c2208eaaef9fa6a7d0511ea4ed6ab2/tests/testthat/test-read_cells.R#L22

Vignette Issue

This is possibly caused by {utf8}: https://github.com/r-rudra/tidycells/blob/bcecc73625c2208eaaef9fa6a7d0511ea4ed6ab2/vignettes/tidycells-intro.Rmd#L298

This may be avoided by a safe dependency: https://github.com/r-rudra/tidycells/blob/bcecc73625c2208eaaef9fa6a7d0511ea4ed6ab2/vignettes/tidycells-intro.Rmd#L446
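
One way to keep a vignette chunk from failing when an optional package is missing is to compute a flag once and use it as the chunk's `eval` option. This is only a sketch: the name `can_run` and the package list are assumptions for illustration, not what the vignette currently does.

```r
# Optional packages the guarded chunks would need (assumed list, for illustration).
optional_pkgs <- c("utf8", "tabulizer")

# TRUE only if every optional package can be loaded; never attaches anything.
can_run <- all(vapply(optional_pkgs, requireNamespace, logical(1), quietly = TRUE))

# In the vignette, a chunk header could then read: {r, eval = can_run}
can_run
```

With this in a setup chunk, the guarded chunks are simply skipped on machines where the packages are not installed, instead of aborting the vignette build.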

Tests

https://github.com/r-rudra/tidycells/blob/bcecc73625c2208eaaef9fa6a7d0511ea4ed6ab2/tests/testthat/test-read_cells.R#L75

This may be avoided by a safe dependency: https://github.com/r-rudra/tidycells/blob/bcecc73625c2208eaaef9fa6a7d0511ea4ed6ab2/tests/testthat/test-etc.R#L79

This may be avoided by a safe dependency: https://github.com/r-rudra/tidycells/blob/bcecc73625c2208eaaef9fa6a7d0511ea4ed6ab2/tests/testthat/test-optional_package.R#L25

bedantaguru commented 4 years ago

For {dplyr}, the S3 class ordering needs to be fixed; see https://github.com/r-rudra/tidycells/issues/40
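
To illustrate why the ordering matters, here is a small base-R sketch. The class name `cell_df_demo` and the generic `describe` are made up for this example and are not part of tidycells. S3 dispatch walks the class vector from left to right, so a subclass placed after `data.frame` never gets its own method called when a `data.frame` method exists.

```r
dat <- head(iris)

# Subclass first (the usual convention) vs. subclass appended at the end.
good <- structure(dat, class = c("cell_df_demo", class(dat)))
bad  <- structure(dat, class = c(class(dat), "cell_df_demo"))

# A toy generic with one method per class (hypothetical names).
describe <- function(x, ...) UseMethod("describe")
describe.cell_df_demo <- function(x, ...) "cell_df_demo method"
describe.data.frame   <- function(x, ...) "data.frame method"

describe(good)  # dispatches on "cell_df_demo" first
describe(bad)   # "data.frame" comes first, so its method wins

# Both objects still inherit from the subclass; only the dispatch order differs.
inherits(good, "cell_df_demo")
inherits(bad, "cell_df_demo")
```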

bedantaguru commented 4 years ago
Tests 1st Case : Console Output

```console
R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(testthat)
> library(tidycells)
>
>
> test_check("tidycells")
── 1. Failure: read_cells: csv works (@test-read_cells.R#75) ──────────────────
`dc0` not equal to `expct_d2`.
2/20 mismatches
x[1]: "12"
y[1]: "1.5"

x[2]: "1.5"
y[2]: "12"

Support present for following type of files: csv, xls, xlsx, docx, pdf, html

Note:
☰ LibreOffice may be required for doc files
☰ Support is enabled for content type (means it will work even if the extension is wrong)
☰ Support not present for following type of files: doc

Details:
┌──────────────────────────────────────────────┐
│                                              │
│      type        package    present support  │
│   1  csv{utils}  utils      ✔       ✔        │
│   2  csv         readr      ✔       ✔        │
│   3  xls{readxl} readxl     ✔       ✔        │
│   4  xls         xlsx       ✔       ✔        │
│   5  xlsx        tidyxl     ✔       ✔        │
│   6  doc         docxtractr ✔       ✖        │
│   7  docx        docxtractr ✔       ✔        │
│   8  pdf         tabulizer  ✔       ✔        │
│   9  html        XML        ✔       ✔        │
│                                              │
└──────────────────────────────────────────────┘
Support present for following type of files: csv, xls, xlsx, docx, pdf, html

Note:
☰ LibreOffice may be required for doc files
☰ Support is enabled for content type (means it will work even if the extension is wrong)
☰ Support not present for following type of files: doc

Details:
┌──────────────────────────────────────────────┐
│                                              │
│      type        package    present support  │
│   1  csv{utils}  utils      ✔       ✔        │
│   2  csv         readr      ✔       ✔        │
│   3  xls{readxl} readxl     ✔       ✔        │
│   4  xls         xlsx       ✔       ✔        │
│   5  xlsx        tidyxl     ✔       ✔        │
│   6  doc         docxtractr ✔       ✖        │
│   7  docx        docxtractr ✔       ✔        │
│   8  pdf         tabulizer  ✔       ✔        │
│   9  html        XML        ✔       ✔        │
│                                              │
└──────────────────────────────────────────────┘
══ testthat results ═══════════════════════════════════════════════════════════
[ OK: 182 | SKIPPED: 2 | WARNINGS: 0 | FAILED: 1 ]
1. Failure: read_cells: csv works (@test-read_cells.R#75)

Error: testthat unit tests failed
Execution halted
```

That can happen with a comparison such as:

testthat::expect_equal(c("12","1.5"), c("1.5","12"))

I don't know how that happened. At most, base::sort can be added in https://github.com/r-rudra/tidycells/blob/bcecc73625c2208eaaef9fa6a7d0511ea4ed6ab2/tests/testthat/test-read_cells.R#L46
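
A rough sketch of the base::sort idea, using the two mismatching values from the log. The helper name `same_values` is made up here, and the real test would apply the same sorting to `dc0` and `expct_d2` inside `expect_equal()`.

```r
actual   <- c("12", "1.5")
expected <- c("1.5", "12")

# Order-sensitive comparison fails even though the values are the same.
identical(actual, expected)

# Sorting both sides first makes the comparison independent of row order.
same_values <- function(x, y) identical(sort(x), sort(y))
same_values(actual, expected)
```

This only hides the ordering difference in the test; it does not explain why the order changed on that platform.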

bedantaguru commented 4 years ago

A safe dependency framework needs to be implemented.

See https://github.com/r-rudra/tidycells/issues/41
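
As a starting point, a "safe dependency" wrapper could look roughly like the sketch below. The names `has_pkg` and `with_optional` are hypothetical and not part of tidycells. It relies on R's lazy evaluation: the expression is never evaluated when the package is missing.

```r
# TRUE if the package can be loaded; does not attach it and stays quiet.
has_pkg <- function(pkg) requireNamespace(pkg, quietly = TRUE)

# Evaluate `expr` only when `pkg` is available, otherwise return `fallback`.
# `expr` is a promise, so it is not touched unless we reach the last line.
with_optional <- function(pkg, expr, fallback = NULL) {
  if (!has_pkg(pkg)) {
    return(fallback)
  }
  expr
}

# Available package: the expression runs.
res_present <- with_optional("stats", stats::median(c(1, 2, 3)))

# Missing package (deliberately fake name): the fallback is returned
# and the expression is never evaluated.
res_missing <- with_optional(
  "notARealPackage12345",
  notARealPackage12345::f(),
  fallback = NA
)
```

In tests, `testthat::skip_if_not_installed()` covers the same need by skipping a test when an optional package is absent.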

bedantaguru commented 4 years ago

I don't know how to solve Issue 'utf8'. I may need to look again at what exactly is happening here (RHub).

bedantaguru commented 4 years ago

dplyr 1.0.0 is quite strict. That's good, I think, but it makes coding in R less fun. I think R's biggest advantage is that it is really flexible. It can understand you.

bedantaguru commented 4 years ago

Because of https://github.com/ropensci/tabulizer/issues/106, Vignette Issue 2 is occurring in https://github.com/r-rudra/tidycells/blob/bcecc73625c2208eaaef9fa6a7d0511ea4ed6ab2/vignettes/tidycells-intro.Rmd#L446. This is tracked using the cloud_picker framework in CITestR. Ref: this Circle CI build.

bedantaguru commented 4 years ago

A dependency framework for {tabulizer} is also required.