Open pkoppstein opened 3 weeks ago
Reading /dev/stdin is tricky. read_text() reads the whole content into a single VARCHAR, and the length of a DuckDB VARCHAR is limited to what fits in a uint32_t. To enforce that limit, read_text() first checks the file's size and then reads up to that many bytes. But the reported size of /dev/stdin is zero.
% ls -l /dev/stdin
lrwxrwxrwx 1 root root 15 Nov 8 07:58 /dev/stdin -> /proc/self/fd/0
read_csv doesn't have this issue, since it doesn't put everything into one column.
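To make the size-first problem concrete, here is a minimal Python sketch. It is not DuckDB's code: `read_by_reported_size` and `read_until_eof` are hypothetical helpers, and a pipe from `os.pipe()` stands in for stdin. On Linux a pipe typically reports `st_size` of 0, so the size-first reader returns nothing; some platforms report the number of buffered bytes instead, so the exact value may differ.

```python
import os

DATA = b"hello\nworld\n"


def read_by_reported_size(fd):
    """Hypothetical size-first reader: trust fstat, read at most that many bytes."""
    size = os.fstat(fd).st_size
    return os.read(fd, size) if size > 0 else b""


def read_until_eof(fd, chunk_size=4096):
    """Hypothetical streaming reader: ignore the reported size, read until EOF."""
    chunks = []
    while True:
        chunk = os.read(fd, chunk_size)
        if not chunk:  # an empty read means the writer closed the pipe
            break
        chunks.append(chunk)
    return b"".join(chunks)


def filled_pipe():
    """Return the read end of a pipe that already holds DATA and is closed for writing."""
    r, w = os.pipe()
    os.write(w, DATA)
    os.close(w)
    return r


def demo():
    """Run both readers, each on its own fresh pipe, and return what they saw."""
    r1 = filled_pipe()
    reported = os.fstat(r1).st_size
    by_size = read_by_reported_size(r1)
    os.close(r1)

    r2 = filled_pipe()
    by_eof = read_until_eof(r2)
    os.close(r2)
    return reported, by_size, by_eof


if __name__ == "__main__":
    print(demo())
```

The streaming reader always recovers the full input, which is roughly why a reader that does not need the total size up front is unaffected.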
@kzys - thanks for the response. Since I’d really like to see some good alternatives to reading text files that do not rely on read_csv(), I think it would be worthwhile expanding the discussion a bit to include e.g. the possibility of adding a line-by-line mode for read_text(). If such a mode were introduced, then it would be reasonable for read_text() to require it if presented with STDIN.
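As a rough illustration of what a line-by-line mode could look like, here is a small Python sketch. Everything in it is an assumption for discussion purposes: `read_text_lines` is a hypothetical name, not an existing or proposed DuckDB function, and the (line number, text) row shape is just one possible design. The point is only that each row is bounded by one line, so the total input size never has to be known in advance.

```python
import io


def read_text_lines(stream):
    """Yield (line_number, text) pairs from a binary stream, one line at a time.

    Hypothetical sketch: the stream is consumed incrementally, so it also
    works for inputs whose size is unknown, such as a pipe or stdin.
    """
    for line_number, raw in enumerate(stream, start=1):
        # Drop the trailing line terminator, if any, and decode as UTF-8.
        yield line_number, raw.decode("utf-8").rstrip("\r\n")


# An in-memory stream stands in for stdin here.
rows = list(read_text_lines(io.BytesIO(b"alpha\nbeta\ngamma")))
print(rows)
```

With real standard input, the same generator could be fed `sys.stdin.buffer` instead of the in-memory stream.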
What happens?
This might technically be an Enhancement Request, but since read_csv() understands '/dev/stdin' and read_text() appears not to, it certainly seems more like a bug.
BUT:
Also, please note that attempting to read both the program and its data from stdin produces weird results:
To Reproduce
OS:
macOS
DuckDB Version:
v1.1.2-dev218
DuckDB Client:
CLI
Hardware:
No response
Full Name:
Peter Koppstein
Affiliation:
Princeton University
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a source build
Did you include all relevant data sets for reproducing the issue?
Not applicable - the reproduction does not require a data set
Did you include all code required to reproduce the issue?
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?