tidyverse / vroom

Fast reading of delimited files
https://vroom.r-lib.org

vroom fails to guess delimiter in trivial cases #474

Open cboettig opened 1 year ago

cboettig commented 1 year ago

Consider the following reprex:

f <- tempfile()
writeLines("X", f)
vroom::vroom(f)
#> Error: Could not guess the delimiter.
#> 
#> Use `vroom(delim =)` to specify one explicitly.

It's understandable why vroom fails to guess a delimiter in this case, but the failure could be avoided without adverse effects, e.g. by assuming that the data is a single column when vroom is asked to detect a delimiter and finds no match.
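Until something like that exists, the fallback can be approximated from the calling side. The following is a minimal sketch, not vroom's own behavior: `read_guess_or_single()` and its `fallback_delim` argument are hypothetical names, and the sketch assumes that passing a delimiter which never occurs in the file makes vroom return a single column. For simplicity it retries after any error rather than only after the delimiter-guessing error.

```r
library(vroom)

# Hypothetical helper (not part of vroom): let vroom guess the delimiter
# first; if that call errors for any reason, retry once with an explicit
# delimiter. For a file that contains no commas, the retry is assumed to
# yield a single column.
read_guess_or_single <- function(file, fallback_delim = ",", ...) {
  tryCatch(
    vroom(file, ...),
    error = function(e) {
      # If the retry fails too (e.g. the file does not exist),
      # its error propagates to the caller as usual.
      vroom(file, delim = fallback_delim, ...)
    }
  )
}

f <- tempfile()
writeLines(c("X", "1", "2"), f)
df <- read_guess_or_single(f, show_col_types = FALSE)
```

The obvious caveat is that the fallback delimiter is itself a guess: a single-column file whose values contain commas would be split incorrectly, so a pipeline that knows its data may prefer a different `fallback_delim`.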

I understand that it's generally considered best practice to specify the delimiter, but vroom uses very sensible defaults that work out of the box in most cases, and this seems like a simple enough case that it should fall within the scope of vroom's default behavior. (No doubt we could construct edge cases where failing would be better than guessing that the data is a single column, but that is true of any reliance on guessing. Generally, I think functions that make guesses should err on the side of not failing, even at the risk of parsing incorrectly, whereas failure is preferable when the user explicitly sets expectations about delimiters that aren't met.)

For context, I'm encountering this issue while using vroom internally to access a variety of tabular data that is usually well within vroom's scope, but some files recently added to that database use this single-column format and thus break the reading pipeline.

NJU-Bio-Info commented 1 year ago

Same question here. Why can't vroom read a single-column file automatically?