Open carlosresu opened 4 months ago
vroom::vroom_lines does not ignore quoted newlines when counting the lines of an ISO-8859-1 encoded CSV file that contains them. As a result, the row count it gives is higher than the one fread reports for the same file.
library(data.table)
total_rows <- fread(full_claims_file(part), select = 1L, header = TRUE)[, .N]
print(paste("Total Rows via fread:", total_rows))

library(vroom)

# Function to count rows using vroom
count_rows_vroom <- function(file_path) {
  total_lines <- length(vroom_lines(file_path, altrep = TRUE, progress = FALSE))
  return(total_lines - 1L) # subtract 1 for the header
}

# Use the function to count total rows
total_rows <- count_rows_vroom(full_claims_file(part))
print(paste("Total Rows via vroom_lines:", total_rows))
It counts the following:
[1] "Total Rows via fread: 11777674"
[1] "Total Rows via vroom_lines: 11801846"
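A minimal sketch that may reproduce the discrepancy on a small file, assuming the difference comes from fields containing quoted newlines. The temporary file and its contents are hypothetical and are not taken from the claims data, and the sketch uses plain ASCII, so it does not reproduce the ISO-8859-1 encoding.

```r
library(data.table)
library(vroom)

# Hypothetical sample: a header plus two records, where the first
# record's "note" field contains a newline inside double quotes.
tmp <- tempfile(fileext = ".csv")
writeLines(
  c("id,note",
    "1,\"first line",
    "second line\"",
    "2,plain"),
  tmp
)

# vroom_lines() returns one element per line of the file,
# so this is a count of physical lines.
physical_lines <- length(vroom_lines(tmp, progress = FALSE))

# fread() parses the file as CSV and returns a count of records.
records <- fread(tmp, select = 1L, header = TRUE)[, .N]

print(paste("Physical lines minus header:", physical_lines - 1L))
print(paste("Records via fread:", records))
```

If the two printed numbers differ here as well, the gap in the full file would correspond to the number of newlines embedded in quoted fields.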