Closed derekmahar closed 3 years ago
Those DBMSs already have a feature to import CSV. Why do you need it?
Yes, you are correct. However, I am considering the case where you might generate CSV content in a pipeline that could terminate with `sqlite3`. Unfortunately, we cannot implement such a pipeline because `sqlite3` accepts only SQL statements on standard input and `csvq` does not generate SQL output. As a consequence, we must first save the CSV content to a temporary file or FIFO and run a separate SQLite script to import the CSV file. It would be more convenient to output the CSV content as SQL `INSERT INTO` statements and pipe these directly into `sqlite3`, bypassing the creation of a temporary file or FIFO and the creation and execution of the `sqlite3` import script.
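For concreteness, the workaround being described looks roughly like this (a sketch; `printf` stands in for the `csvq` pipeline that generates the CSV, and the file and table names are illustrative):

```shell
# Save the generated CSV to a temporary file, then run a separate import step.
tmp=$(mktemp)
printf "col1,col2\na,1\nb,2\n" > "$tmp"      # stand-in for csvq output
sqlite3 test.db ".import --csv $tmp test"    # separate SQLite import
rm "$tmp"
```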
I don't know why you don't want to use temporary files or a FIFO, but anyway, if you're using a Unix or Unix-like OS, can't you specify `/dev/stdin` as the import file for sqlite?
I think that won't work because `sqlite3` accepts only SQL scripts (SQL statements and its own shell commands) on standard input. How then could `sqlite3` accept both the SQL script and CSV content on standard input?
At least in my environment (macOS 11, sqlite3 version 3.32.3), I can run it like this.
Yes, you are right. I just did the same thing. I hadn't realized that when provided a script as an argument, `sqlite3` would not process its standard input as a script and could instead read and process other input data. It's also interesting that in the command-line script, the command `.quit` need not follow `.import`, which is the case when you supply the script on standard input.
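The command in question is not reproduced in this thread, but based on the description it was presumably something along these lines (database and table names here are illustrative):

```shell
# With the script passed as an argument, sqlite3 leaves standard input
# free for .import to consume as CSV data.
printf "col1,col2\na,1\nb,2\n" | \
  sqlite3 -csv test.db ".import /dev/stdin test"
```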
Do `psql` (PostgreSQL) and `mysql` (MySQL) also have this behaviour?
How would one implement this pipeline in Windows?
> I don't know why you don't want to use temporary files or fifo
Using temporary files is more verbose and cumbersome than simply chaining together all commands in a pipeline.
I can't test mysql or psql right away, but you can probably execute mysql or psql commands directly as you can with sqlite. And you probably can't do the same thing on Windows.
In any case, using temporary files is not a complicated task, and if you need to do it over and over again, you can just write a simple script.
> And you probably can't do the same thing on Windows.
Yes, Windows does not seem to have a special file name for standard input equivalent to `/dev/stdin` or `-` in Unix.
> In any case, using temporary files is not a complicated task, and if you need to do it over and over again, you can just write a simple script.
Yes. For example, I just wrote a bash script called `csv_to_sqlite`:
```shell
#!/bin/bash
database_file_name=$1
table_name=$2
temp_input_file_name=<(cat)
sqlite3 "$database_file_name" ".import --csv $temp_input_file_name $table_name"
```
Note that `<(cat)` invokes `cat`, which reads from the standard input and redirects its standard output to a temporary file which the system creates.
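For the curious, you can see what `<(cat)` expands to with a one-liner like the following (a sketch; the exact path varies by system):

```shell
# bash expands the process substitution to a file-descriptor path,
# typically something like /dev/fd/63
bash -c 'echo <(cat /dev/null)'
```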
Test:

```shell
$ printf "col1,col2\na,1\nb,2\n" | ./csv_to_sqlite test.db test
$ sqlite3 -csv -header test.db "SELECT * FROM test"
col1,col2
a,1
b,2
```
If we were to implement features for external applications, we would need to keep up with their updates. That would require a lot of work. Since there are widely used formats such as CSV, it is better to think about how to pass them on in each individual situation.
Yes, I agree, though in the case of generating SQL output to import a CSV, only the `CREATE TABLE` statement might differ between database implementations. The `INSERT INTO` statements would likely be standard.
You could just generate the `INSERT INTO` statements.
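As a rough sketch of that idea, assuming simple values with no embedded commas or quotes (the table name `test` is illustrative):

```shell
# Turn CSV rows (header skipped) into INSERT statements for table "test".
printf "col1,col2\na,1\nb,2\n" | tail -n +2 | \
  awk -F, -v q="'" '{ printf "INSERT INTO test VALUES (%s%s%s, %s);\n", q, $1, q, $2 }'
```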
Enclosures of table and field names may be backquotes, or they may be double quotes. Depending on the amount of data, bulk inserts may be required for actual use, but they are not supported by some databases. It is not easy to cover everything that varies from database to database and from user environment to user environment.

In the first place, if `CREATE TABLE` statements are not exported, you will end up needing to work on the database separately.
Okay, I concede that generating SQL statements is complicated by differing database SQL conventions and that it makes sense to instead use an import tool specific to the database. The only case I can think of where a generic tool that generates standard SQL statements would be useful would be where a database didn't have a tool to import a CSV file. SQLite, PostgreSQL, and MySQL can all import CSV files into tables, though.
> Note that `<(cat)` invokes `cat`, which reads from the standard input and redirects its standard output to a temporary file which the system creates.
Unfortunately, while `<(cat)` works in bash on FreeBSD, it does not work in Linux. I'm still seeking a solution that doesn't require directly reading `/dev/stdin`.
The following version works in Linux and FreeBSD:

```shell
#!/bin/bash

function csv_to_sqlite() {
  local database_file_name="$1"
  local table_name="$2"
  local temp_input_fifo=$(mktemp -u)
  mkfifo "$temp_input_fifo"
  sqlite3 -csv "$database_file_name" ".import $temp_input_fifo $table_name" &
  cat > "$temp_input_fifo"
  rm "$temp_input_fifo"
}

database_file_name=$1
table_name=$2

csv_to_sqlite "$database_file_name" "$table_name"
```
I found a simpler solution that still uses sqlite3 `.import`, but that doesn't read `/dev/stdin` or a temporary named pipe. Instead, it uses `.import` with the pipe operator to invoke `cat -` to read directly from standard input.
```shell
#!/bin/bash

function csv_to_sqlite() {
  local database_file_name="$1"
  local table_name="$2"
  sqlite3 -csv "$database_file_name" ".import '|cat -' $table_name"
}

database_file_name=$1
table_name=$2

csv_to_sqlite "$database_file_name" "$table_name"
```
Please add SQL as an output format to serve as input to SQLite and possibly other relational database engines like PostgreSQL and MySQL. The output for a given CSV file would begin with a `CREATE TABLE` statement followed by one or more `INSERT INTO` statements to recreate the contents of the CSV file as a database table in SQLite.
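For illustration, the requested output for the sample CSV above might look like this, sketched here with awk (the table name, `TEXT` column types, and single-quote value quoting are simplifying assumptions):

```shell
# Header row becomes CREATE TABLE; remaining rows become INSERT statements.
printf "col1,col2\na,1\nb,2\n" | awk -F, -v q="'" '
NR == 1 { printf "CREATE TABLE test (%s TEXT, %s TEXT);\n", $1, $2; next }
        { printf "INSERT INTO test VALUES (%s%s%s, %s%s%s);\n", q, $1, q, q, $2, q }'
```

Output in this form could then be piped straight into `sqlite3 test.db` with no temporary file.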