DillonHammill / DataEditR

An Interactive R Package for Viewing, Entering Filtering and Editing Data
https://dillonhammill.github.io/DataEditR/
384 stars 40 forks source link

data_edit() does not appropriately handle empty datasets #34

Closed DillonHammill closed 3 years ago

DillonHammill commented 3 years ago

We cannot set rownames on a zero row dataset, hence why the following example causes data_edit() to crash:

library(DataEditR)
test <- matrix(ncol = 10, 
               nrow = 0, 
               dimnames = list(NULL, paste0("A", 1:10)))
data_edit(test)

Need to do a check on row numbers when annotating rownames, perhaps we can add an empty row so that rownames can be set.

DillonHammill commented 3 years ago

So, we can add an empty row to the data within the DataEdit module to allow us to load the data. Since the data is in matrix format, there is no type conversion performed so any numerics entered will be treated as character strings. This is because we have to place empty characters strings in the empty row - if we set these values to NA then the edits are not saved (probably due to the logical conversion step within rhandsontable). This is probably the best fix that we have for now but it does cause other problems, if we attempt to add a new row after the empty one, data_edit() crashes with the following error:

Listening on http://127.0.0.1:4142
Warning: Error in genColHeaders: Change no recognized:afterChange
  3: shiny::runApp
  2: runGadget
  1: data_edit [E:/Documents/R-Projects/DataEditR/R/data_edit.R#491]

I will need to take a closer look at this before committing to any changes, for now we will leave the DataEditR code as is.

DillonHammill commented 3 years ago

After debugging this in rhandsontable the error arises due to the data being duplicated for some reason when a new row is added. Basically an afterChange event is triggered which results in data with excess columns (due to duplication) which triggers the genColHeaders() function in rhandsontable and gives rise to the error. I suspect that this is a similar issue to the one I previously faced for genRowHeaders() where the code was being created by an afterChange event - I was able to patch that but I don't think a patch will work for genColHeaders(). Instead, we need to figure out why the data is being duplicated in the first place and fix the problem there - unfortunately this is likely to be with the javascript code here.

DillonHammill commented 3 years ago

OK, I tracked down the bug to rhandsontable(). Turns out there is a loop that is run when useTypes = FALSE (as used by DataEditR) which formats the data before it is used by htmlwidgets::createWidget. This issue occurs when you try to call lapply on an empty matrix it returns individual elements instead of a vector per column - as a result the data is actually coerced into a long vector which changes the dimensions of the data. The solution is to re-write the lappy loop to specifically pull out columns - this change has been added to rhandsontable.

DillonHammill commented 3 years ago

I have followed this up with additional commits to dataInput and dataEdit to ensure that the supplied data always has at least one row and column. This update is only available on GitHub at the moment and requires the changes I made to my rhandsontable fork as well:

devtools::install("DillonHammill/rhandsontable")
devtools::install_github("DillonHammill/DataEditR")

I will upload the new version DataEditR (v0.1.4) with these changes to CRAN as soon as I get some time.

keithomayot commented 3 years ago

I'm getting this issue when I try to upload an xlsx sheet with date columns which I suspect might be causing the issue Warning in if (class(data[, x]) == "Date") { : the condition has length > 1 and only the first element will be used Warning: Error in as: no method or default for coercing “character” to “NA” I updated the two packages as you'd specified above.