alexsanjoseph / compareDF

R Tool to compare two data.frames

Unable to read column names with spaces & problem when comparing strings #24

Closed lindali97 closed 5 years ago

lindali97 commented 5 years ago

Hello Alex, thank you for your wonderful package! It has been very helpful and easy to use, but I would like to report two possible bugs I ran into while using it.

  1. When comparing data frames (named newdata and olddata in the call below) whose column names contain spaces, compare_df raises the following error (traceback attached):

    Error in parse(text = x) : <text>:1:11: unexpected symbol
    1: Reporting Type
    In addition: Warning message: Error in parse(text = x) : <text>:1:11: unexpected symbol

    20. parse(text = x)
    19. parse_exprs(x)
    18. parse_expr(x)
    17. new_quosure(parse_expr(x), as_environment(env))
    16. parse_quo(lazy[[1]], env)
    15. compat_lazy(dots[[i]], env, warn)
    14. compat_lazy_dots(.dots, caller_env(), ...)
    13. group_by_.data.frame(structure(list(goal = c("2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "15", "15", "15", "15", "15", "15", "15", "15", "15", "15", "15", "15", "15", "15", ...
    12. (function (.data, ..., .dots = list(), add = FALSE) {signal_soft_deprecated(paste_line("group_by_() is deprecated. ", "Please use group_by() instead", "", "The 'programming' vignette or the tidyeval book can help you", ...
    11. do.call(fname, c(list(x), largs))
    10. piped.do.call(., group_by_, group_col)
    9. function_list[[i]](value)
    8. freduce(value, `_function_list`)
    7. `_fseq`(`_lhs`)
    6. eval(quote(`_fseq`(`_lhs`)), env, env)
    5. eval(quote(`_fseq`(`_lhs`)), env, env)
    4. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
    3. df_combined %>% piped.do.call(group_by_, group_col) %>% data.frame(grp = group_indices(.), .) %>% ungroup
    2. group_columns(both_tables, group_col)
    1. compare_df(newdata, olddata, series_identifier, limit_html = 100, stop_on_error = FALSE, keep_unchanged_rows = FALSE, exclude = exclude_cols)

  2. The returned comparison table seemed to report some identical cells (strings for which identical(cell1, cell2) == TRUE) as different.

Could you take a look at those problems? Thank you!
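For reference, the traceback suggests that the column name is being parsed as R code somewhere below group_by_. The parse failure itself can be reproduced without compareDF; this is a minimal sketch, and the variable names (col_name, result, quoted) are illustrative only:

```r
# Column name taken from the error message in the report.
col_name <- "Reporting Type"

# Parsing the bare name as R code fails: "Reporting Type" reads as two
# symbols separated by a space, which is not a valid expression.
result <- tryCatch(
  parse(text = col_name),
  error = function(e) conditionMessage(e)
)
print(result)

# Wrapping the name in backticks turns it into a single valid symbol.
quoted <- parse(text = paste0("`", col_name, "`"))[[1]]
print(is.name(quoted))
```

This only illustrates why a space in a name can trip string-based parsing; it does not show how compareDF itself handles column names.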

alexsanjoseph commented 5 years ago

Hi @lindali97 - Thanks for using the package.

Regarding (1) - Using spaces in column names is generally not good practice in R. By default, base R's data.frame command does not even keep spaces in column names when creating a data.frame (it converts them into periods). Spaces can cause weird errors in other packages as well. If you absolutely want spaces in your column names, it might be a good idea to convert them to underscores or periods, do the computation, and then convert back just before printing or converting to your favorite format.
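A rough sketch of that rename-and-restore round trip, assuming a made-up data frame (the data and the names original_names, safe_names, and lookup are hypothetical, not part of compareDF):

```r
# Hypothetical data with a space in one column name; check.names = FALSE
# is needed to keep the space when calling data.frame().
newdata <- data.frame(id = 1:2, `Reporting Type` = c("a", "b"),
                      check.names = FALSE, stringsAsFactors = FALSE)

original_names <- names(newdata)

# make.names() turns each name into a syntactically valid one
# (spaces become periods); unique = TRUE avoids accidental duplicates.
safe_names <- make.names(original_names, unique = TRUE)
names(newdata) <- safe_names

# ... run the comparison on the renamed data frames here ...

# Restore the original names before printing or exporting.
lookup <- setNames(original_names, safe_names)
names(newdata) <- unname(lookup[names(newdata)])
print(names(newdata))
```

The same renaming would have to be applied to both data frames and to the grouping column names. A comparison result may contain extra columns that are not in lookup, so only the columns found in lookup should be renamed back.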

Can you give a minimal reproducible example for (2)?

lindali97 commented 5 years ago

Hi @alexsanjoseph, thank you for the reply, and sorry for the delay in getting back to you. After experimenting for a bit, I realized that (2) was caused by encoding issues in the original file that were, for some reason, overlooked or automatically transformed by R. Thank you for your timely response and the wonderful package!
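The thread does not say what the encoding problem was. As one hypothetical illustration, two strings can satisfy identical() in R while being stored as different bytes, because identical() treats strings in different marked encodings as equal if they agree once translated to UTF-8. The example value and variable names below are assumptions:

```r
# The same text in two encodings (example value is made up).
x_utf8   <- "caf\u00e9"
x_latin1 <- iconv(x_utf8, from = "UTF-8", to = "latin1")

# identical() compares the strings after translation, so they match ...
same_string <- identical(x_utf8, x_latin1)

# ... while the underlying bytes differ.
same_bytes <- identical(charToRaw(x_utf8), charToRaw(x_latin1))

# Normalizing to UTF-8 before comparing removes the byte-level difference.
normalized <- enc2utf8(x_latin1)
same_after <- identical(charToRaw(x_utf8), charToRaw(normalized))

print(c(same_string = same_string, same_bytes = same_bytes,
        same_after = same_after))
```

Checking Encoding() on the affected columns, or reading the file with an explicit encoding, is one way to rule this kind of mismatch in or out.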