choonghyunryu / dlookr

Tools for Data Diagnosis, Exploration, Transformation
https://choonghyunryu.github.io/dlookr/

Error: Can't recycle `as_tibble(result$table)[, 7:21]` (size 15) to size 17. #42

Closed davidfgeorge closed 3 years ago

davidfgeorge commented 3 years ago

Hi. I trust all is well in your world. I have just started experimenting with dlookr and am hitting the error shown in the title. The backtrace is:

<error/vctrs_error_incompatible_size>
Can't recycle `as_tibble(result$table)[, 7:21]` (size 15) to size 17.
Backtrace:
    █

  1. ├─base::source("~/Documents/Documents/SOBA/ten80_testing/ten80_db_access_test_multi_client/ten80_BidPricing.R")
  2. │ ├─base::withVisible(eval(ei, envir))
  3. │ └─base::eval(ei, envir)
  4. │ └─base::eval(ei, envir)
  5. ├─test_Bids %>% describe() ~/Documents/Documents/SOBA/ten80_testing/ten80_db_access_test_multi_client/ten80_BidPricing.R:468:10
  6. ├─dlookr::describe(.)
  7. ├─dlookr:::describe.data.frame(.)
  8. │ └─dlookr:::describe_impl(.data, vars)
  9. │ └─base::lapply(...)
 10. │ └─dlookr:::FUN(X[[i]], ...)
 11. │ └─dlookr:::num_summary(pull(df, x))
 12. │ ├─base::`[<-`(...)
 13. │ └─tibble:::`[<-.tbl_df`(...)
 14. │ └─tibble:::tbl_subassign(x, i, j, value, i_arg, j_arg, substitute(value))
 15. │ └─tibble:::vectbl_as_new_col_index(j, x, value, j_arg, value_arg)
 16. │ └─tibble:::vectbl_recycle_rhs_names(names2(value), length(j), value_arg)
 17. │ ├─base::unname(vec_recycle(set_names(names), n, x_arg = as_label(value_arg)))
 18. │ └─vctrs::vec_recycle(set_names(names), n, x_arg = as_label(value_arg))
 19. └─vctrs:::stop_recycle_incompatible_size(...)
 20. └─vctrs:::stop_vctrs(...)

What is the cause of this? Could it be columns that contain NA values? Thank you.
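For readers unfamiliar with this class of error: the backtrace ends in tibble's column assignment (`[<-.tbl_df`), which is stricter than base R about how many columns the right-hand side supplies. The sketch below is only an illustration of that general behaviour with made-up objects (`target`, `value`); it is not dlookr's code and does not claim to show the actual cause of this bug. The exact wording of the message may differ between tibble versions.

```r
library(tibble)

# Hypothetical toy objects, not taken from dlookr
target <- tibble(a = 1, b = 2, c = 3)
value  <- tibble(x = 10, y = 20)   # supplies only two columns

# Assign two columns into three column positions and capture the error
err <- tryCatch(
  {
    target[, 1:3] <- value
    NULL
  },
  error = function(e) e
)

print(inherits(err, "error"))   # TRUE: tibble refuses to recycle 2 columns to 3
print(conditionMessage(err))
```

In the report above the same kind of mismatch appears with 15 supplied columns and 17 target positions.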

choonghyunryu commented 3 years ago

Hi, @davidfgeorge

Thanks for your feedback.

It seems to be a problem in a subscript (indexing) operation.

If you can share the data, it will help me solve the problem. If that is not possible, can you share the result of `sapply(x, function(v) sum(complete.cases(v)))`, where `x` is your data frame?

I would like to solve the problem by reproducing it.
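As a small illustration of what the requested command reports, here is a sketch on a toy data frame. The name `toy` is hypothetical; the three columns reuse values that appear later in this thread. For each column, the call counts the number of non-missing values.

```r
# Toy data frame (hypothetical name) with three of the reported columns
toy <- data.frame(
  Number          = c(153L, 148L, 143L),
  SoftwareVersion = c(NA_character_, NA_character_, NA_character_),
  FunctionalArea  = c(NA, "SAP IS (Retail)", NA),
  stringsAsFactors = FALSE
)

# Number of non-missing values per column
counts <- sapply(toy, function(v) sum(complete.cases(v)))
print(counts)
# Number: 3, SoftwareVersion: 0, FunctionalArea: 1
```

A count of 0 marks a column that is entirely NA, and a count below the number of rows marks a partially missing column.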

Thank you so much.

davidfgeorge commented 3 years ago

Hi. Here is some additional information:

sapply(tickets, function(x) sum(complete.cases(x)))

               Id            Number             Title  ShortDescription   LongDescription   SoftwareVersion    FunctionalArea
                3                 3                 3                 3                 3                 0                 1
SubFunctionalArea     MinimumRating       RequestType     ticket_status   expected_effort   indicative_rate           in_unit
                3                 3                 3                 3                 3                 3                 2
 ticket_live_date      ticket_start        ticket_end   ClientProfileId       CompanyName       in_currency
                3                 3                 3                 3                 3                 3

tickets %>% glimpse()

Rows: 3
Columns: 20
$ Id                "FCBD158D-31F4-4C86-D430-08D90E05151D", "CE1001E5-A6E2-4FA5-872A-08D90503C424", "34985EA5-7B39-450F-A…
$ Number            153, 148, 143
$ Title             "SAP POSDM Functional Consultant ", "SAP Retail Senior Functional Consultant – 12 Month Contract", "S…
$ ShortDescription  "Permanent / 6 Month contract for a SAP POSDM Functional Consultant is available in one of Africa's l…
$ LongDescription   "SAP POSDM Functional Consultant – 6 Month Contract\n…
$ SoftwareVersion   NA, NA, NA
$ FunctionalArea    NA, "SAP IS (Retail)", NA
$ SubFunctionalArea "SAP POSDM", "Pricing, Procurement", "SAP HR and Workday"
$ MinimumRating     "50", "70", "80"
$ RequestType       "Time and Materials", "Fixed Price", "Fixed Price"
$ ticket_status     2, 2, 2
$ expected_effort   180, 2080, 30
$ indicative_rate   600, 900, 0
$ in_unit           "Days", "Hours", NA
$ ticket_live_date  2021-05-03 07:28:44, 2021-04-21 20:26:38, 2021-02-17 13:55:21
$ ticket_start      2021-06-01, 2021-06-01, 2021-03-20
$ ticket_end        2021-11-30, 2022-05-31, 2021-08-31
$ ClientProfileId   "EE7CD894-B5C1-4DA8-CB09-08D8C109AB1C", "EE7CD894-B5C1-4DA8-CB09-08D8C109AB1C", "EE7CD894-B5C1-4DA8-C…
$ CompanyName       "Company A", "Company B", "Company C"
$ in_currency       "ZAR", "ZAR", "ZAR"

tickets %>% diagnose()

A tibble: 20 x 6

   variables         types     missing_count missing_percent unique_count unique_rate
 1 Id                character             0             0              3       1
 2 Number            integer               0             0              3       1
 3 Title             character             0             0              3       1
 4 ShortDescription  character             0             0              3       1
 5 LongDescription   character             0             0              3       1
 6 SoftwareVersion   character             3           100              1       0.333
 7 FunctionalArea    character             2            66.7            2       0.667
 8 SubFunctionalArea character             0             0              3       1
 9 MinimumRating     character             0             0              3       1
10 RequestType       character             0             0              2       0.667
11 ticket_status     integer               0             0              1       0.333
12 expected_effort   numeric               0             0              3       1
13 indicative_rate   numeric               0             0              3       1
14 in_unit           character             1            33.3            3       1
15 ticket_live_date  POSIXct               0             0              3       1
16 ticket_start      POSIXct               0             0              2       0.667
17 ticket_end        POSIXct               0             0              3       1
18 ClientProfileId   character             0             0              1       0.333
19 CompanyName       character             0             0              1       0.333
20 in_currency       character             0             0              1       0.333

tickets %>% describe()

Error: Can't recycle `as_tibble(result$table)[, 7:21]` (size 15) to size 17.

Hope this helps. Cheers.

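When a summary function fails on a whole data frame like this, one way to narrow the report down is to run it on each numeric column separately and note which ones raise an error. The helper below is a minimal base R sketch; `find_failing_columns`, `toy` and `picky` are hypothetical names introduced for illustration, and `picky` is only a stand-in for the real summary function.

```r
# Hypothetical helper: apply `fn` to each numeric column on its own
# (as a one-column data frame) and return the names of columns where it errors.
find_failing_columns <- function(df, fn) {
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  failed <- vapply(numeric_cols, function(col) {
    tryCatch(
      {
        fn(df[col])
        FALSE
      },
      error = function(e) TRUE
    )
  }, logical(1))
  numeric_cols[failed]
}

# Demonstration with a stand-in function that rejects constant columns
toy <- data.frame(a = c(1, 2, 3), b = c(5, 5, 5), s = c("x", "y", "z"))
picky <- function(d) {
  if (length(unique(d[[1]])) == 1) stop("constant column")
  TRUE
}
print(find_failing_columns(toy, picky))
```

With dlookr loaded, the same idea would be `find_failing_columns(tickets, describe)`, which should list the numeric columns of `tickets` that trigger the error.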
choonghyunryu commented 3 years ago

Hi @davidfgeorge

Thanks for your bug report. I found the cause and fixed the error. The fix has been pushed to GitHub, so please try the GitHub version. I will submit a patched version to CRAN soon.
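For anyone wanting to try the GitHub version, a common approach is the `remotes` package. This is a sketch under that assumption; the wrapper name `install_dlookr_dev` is hypothetical, and the repository path is the one shown at the top of this page.

```r
# Hypothetical wrapper: install the development version of dlookr from GitHub.
# Assumes network access and that installing 'remotes' is acceptable.
install_dlookr_dev <- function() {
  if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
  }
  remotes::install_github("choonghyunryu/dlookr")
}

# install_dlookr_dev()   # uncomment to run the installation
```

After installing, restart the R session before calling `describe()` again so the new version is loaded.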

Thank you so much.

davidfgeorge commented 3 years ago

Great! I can carry on experimenting. Enjoy your day.