r-rudra / tidycells

Automatic transformation of untidy spreadsheet-like data into tidy form
https://r-rudra.github.io/tidycells/

Rcpp Adoption in various stages #26

Open bedantaguru opened 4 years ago

bedantaguru commented 4 years ago

Need to adopt {Rcpp} wherever possible.

bedantaguru commented 4 years ago

LCS implemented in Rcpp
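For readers unfamiliar with the routine, here is a minimal sketch of what an LCS function in C++ could look like. It assumes that LCS here means longest common subsequence and returns only its length. The name `lcs_length` is hypothetical and this is not the tidycells source.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not taken from tidycells.
// Returns the length of the longest common subsequence of a and b
// using a two-row dynamic-programming table (O(|a| * |b|) time, O(|b|) memory).
int lcs_length(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1, 0);
    std::vector<int> curr(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            if (a[i - 1] == b[j - 1]) {
                // Matching characters extend the subsequence found so far.
                curr[j] = prev[j - 1] + 1;
            } else {
                // Otherwise keep the better of skipping a character in a or in b.
                curr[j] = std::max(prev[j], curr[j - 1]);
            }
        }
        // The row just filled becomes the previous row for the next i.
        std::swap(prev, curr);
    }
    return prev[b.size()];
}
```

To call such a function from R through Rcpp, one would typically include `<Rcpp.h>` and mark it with the `// [[Rcpp::export]]` attribute. Both are left out here so the sketch compiles as plain C++.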

bedantaguru commented 4 years ago

I noticed that Rcpp may not be optimal for large data in as_cell_df

bedantaguru commented 4 years ago

Maybe I need to check the memory status with memory.limit().

bedantaguru commented 4 years ago

| expr | min | lq | mean | median | uq | max | neval |
|---|---|---|---|---|---|---|---|
| unpivotr::as_cells(df) %>% tidycells::as_cell_df() | 634.28278 | 683.8018 | 778.8232 | 781.1177 | 881.1816 | 910.1206 | 10 |
| tidycells::as_cell_df(df, take_col = F) %>% tidycells::as_cell_df() | 809.61787 | 997.8078 | 993.7125 | 1007.7760 | 1031.1863 | 1057.0500 | 10 |
| as_cell_df_c2(df) %>% tidycells::as_cell_df() | 99.25656 | 104.2768 | 133.2668 | 113.7746 | 120.1005 | 331.1366 | 10 |
| as_cell_df_r(df) %>% tidycells::as_cell_df() | 117.50674 | 123.1997 | 133.3697 | 128.3014 | 136.8042 | 179.8280 | 10 |
| as_cell_df_r2(df) %>% tidycells::as_cell_df() | 107.92256 | 109.7794 | 116.9087 | 117.5692 | 121.1555 | 130.4172 | 10 |

bedantaguru commented 4 years ago

as_cell_df_r2 may be a good option to speed up as_cell_df.

bedantaguru commented 4 years ago

enhead alternative is not in Rcpp

bedantaguru commented 4 years ago

Check out tidycells_nightly@Rcpp-dep

While LCS is great, is_attachable performs poorly in Rcpp compared to the R version. Maybe my current knowledge of C++ is not adequate to implement a performant version. Also, in my opinion, a heuristic at this level is better kept in R.

As of now, there is no way to save a function created with Rcpp::cppFunction(). The only option is to create a package.

A package cannot directly have an optional dependency on {Rcpp}. [Behaviorally, it has to be in Imports at least.]

Hence the best idea is to remove the dependency. LCS is required only in name_suggest, which is a small and experimental portion of the package.

Implement https://github.com/r-rudra/tidycells/issues/36

bedantaguru commented 4 years ago

The source code of LCS can be ported as an optional module.