Open mdedetrich opened 3 years ago
Okay so I did a bit more research, basically this is going to be a bit of work to implement mainly because of the reason that COPY
is not a proper prepared statement its just raw SQL. This is only because COPY
works on a single table with basic filtering (so prepared statements are pointless). This is also the same reason why COPY
is so fast, it basically bypass entire parts of the Postgres engine that is not necessary.
After going through the code, it seems Quill doesn't natively have the ability to create SQL statements with input parameters that don't use prepared statements (basically Quill is built completely around prepared statements, there also appears to be an issue on this (see https://github.com/getquill/quill/issues/205)).
Ontop of this, COPY
is quite limited in what it can do, i.e. you can only COPY
from a single table and it seems that only basic WHERE
filtering is allowed. This would mean having to adjust the macro AST so that one can only create valid COPY
statements. This means having to do changes to the PostgresDialect
to create a completely new type of statement (i.e. copy[MyTable]
instead of query[MyTable]
?)
What are your thoughts @deusaquilus ? I am thinking maybe this makes sense for the next iteration of Quill? (unless there is an example somewhere were we use don't use prepared statements at all).
So I have a usecase at work where we need to input a huge amount of data into a Postgres dataset (talking about millions of rows here, if not more), perform a
SELECT
query and them stream the data back outWhen checking the Quill docs, I firstly noticed that streaming only applies when
SELECT
which isnt a problem and hence I am thinking that I just needed to implement the inserting of data via a stream.It turns out that Postgres has its own streaming support that is done via the
COPY
command, seeTypically this is used to stream data directly from files however postgres also lets you specify
STDIN
/STDOUT
as an input/output. JDBC implements this as aCopyManager
which allows various types ofInptStream
/OutputStream
/Reader
.Ideally at least when working with databases that support direct streaming (i.e. JDBC+Postgres) we should be using this to do our insert streaming since it actually means the output (for inserts) also does streaming.