hrbrmstr / docxtractr

:scissors: Extract Tables from Microsoft Word Documents with R

Error when assigning column names if the table has only one column #36

Open cmzambranat opened 1 year ago

cmzambranat commented 1 year ago

Hi there, thanks for the package, very useful!

I get the following error when assigning a row as a column name if the scraped Word table has only one column.

Error in names[old] <- names(x)[j[old]] : replacement has length zero
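
One plausible cause, assuming the stock `assign_colnames()` subsets rows without `drop = FALSE` (an assumption, since the package source is not quoted in this issue): base R collapses a one-column data frame to a plain vector when rows are selected, so later steps no longer operate on a data frame. A minimal base-R sketch of that behaviour, using a made-up table `one_col`:

```r
# Hypothetical one-column table, shaped like a scraped Word table might be.
one_col <- data.frame(V1 = c("Header", "a", "b"), stringsAsFactors = FALSE)

# Default row subsetting collapses the single column to a character vector.
collapsed <- one_col[-1, ]
is.data.frame(collapsed)   # FALSE

# drop = FALSE keeps the data frame structure intact.
kept <- one_col[-1, , drop = FALSE]
is.data.frame(kept)        # TRUE
```

Whether this is the exact source of the error above depends on the installed versions of docxtractr and tibble, so treat it as a starting point for debugging rather than a diagnosis.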

Here is a potential solution that I was able to use:

assign_colnames_v2 <- function(dat, row, remove = TRUE, remove_previous = remove) {
  # Return the input unchanged when 'row' is out of range
  if (row > nrow(dat) || row < 1) {
    return(dat)
  }

  # Save the original class of 'dat' to reassign later
  d_class <- class(dat)

  # Convert to data frame to ensure consistent handling
  dat <- as.data.frame(dat, stringsAsFactors = FALSE)

  # Check if 'dat' has only one column
  if (ncol(dat) == 1) {
    # Special handling for one-column data frame
    col_name <- as.character(dat[row, 1])

    # Remove the row that is now the column name, if required
    if (remove) {
      if (remove_previous) {
        # Drop rows 1..row; seq_len() also works when 'row' is the last row
        dat <- dat[-seq_len(row), , drop = FALSE]
      } else {
        dat <- dat[-row, , drop = FALSE]
      }
    }

    # Set the column name
    colnames(dat) <- col_name
  } else {
    # For data frames with more than one column, use the original approach
    colnames(dat) <- as.character(unlist(dat[row, ]))

    # Determine rows to remove
    start <- row
    end <- row
    if (remove_previous) {
      start <- 1
    }

    # Remove the rows
    dat <- dat[-(start:end), , drop = FALSE]
  }

  # Reset the row names
  rownames(dat) <- NULL

  # Reassign the original class, especially if 'dat' was a tibble
  class(dat) <- d_class

  # Return the modified data frame
  return(dat)
}
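
If the one-column branch is only needed to stop the data frame from collapsing to a vector, a single code path with `drop = FALSE` may be enough. The sketch below is a condensed variant under that assumption. `assign_colnames_sketch` and the sample table are hypothetical names, not part of docxtractr, and unlike the function above it honors `remove` on every path, which is a choice made here rather than documented package behaviour.

```r
# Hypothetical condensed variant; not part of docxtractr.
assign_colnames_sketch <- function(dat, row, remove = TRUE, remove_previous = remove) {
  # Return the input unchanged when 'row' is out of range.
  if (row > nrow(dat) || row < 1) {
    return(dat)
  }

  d_class <- class(dat)
  dat <- as.data.frame(dat, stringsAsFactors = FALSE)

  # dat[row, ] is a one-row data frame (several columns) or a bare
  # vector (one column); unlist() flattens both to a plain vector.
  colnames(dat) <- as.character(unlist(dat[row, ]))

  if (remove) {
    # Drop rows 1..row, or only 'row', without collapsing to a vector.
    drop_rows <- if (remove_previous) seq_len(row) else row
    dat <- dat[-drop_rows, , drop = FALSE]
  }

  rownames(dat) <- NULL
  class(dat) <- d_class
  dat
}

# Made-up one-column table with a title row above the real header.
one_col <- data.frame(V1 = c("Title", "Name", "a", "b"), stringsAsFactors = FALSE)
res <- assign_colnames_sketch(one_col, 2)
colnames(res)  # "Name"
res$Name       # "a" "b"
```

The same call works for wider tables, so the `ncol(dat) == 1` check is not needed in this version.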

Hope this is useful for other people as well.

C