Since version 12, PostgreSQL has supported a WHERE clause within COPY ... FROM statements:
COPY table_name [ ( column_name [, ...] ) ]
FROM { 'filename' | PROGRAM 'command' | STDIN }
[ [ WITH ] ( option [, ...] ) ]
[ WHERE condition ]
It functions similarly to the WHERE clause used within SELECT statements, subject to some restrictions regarding subqueries.
Recently I've found myself in a situation where I am performing a bulk insert of CSV-formatted data from a remote machine, but would like to skip certain rows - such as those containing null values. From what I can tell, the existing Connection.copy_to_table helper doesn't support this, and I'd need to drop down to private API (the underlying Connection._protocol._copy_in machinery)
I'd appreciate some publicly exposed way of doing this - the simplest approach that comes to mind would be adding a where kwarg to Connection.copy_to_table as well as the Connection._format_copy_opts method. Usage could then look something along the lines of
I'm not sure if this is satisfactory, though. There would also need to be some consideration regarding earlier Postgres versions that don't support COPY ... FROM using a WHERE clause.
Please let me know your thoughts - I'm happy to open a PR for implementation if necessary.
Hi there!
Since version 12, PostgreSQL has supported a
WHERE
clause withinCOPY ... FROM
statements:It functions similarly to the
WHERE
clause used withinSELECT
statements, subject to some restrictions regarding subqueries.Recently I've found myself in a situation where I am performing a bulk insert of CSV-formatted data from a remote machine, but would like to skip certain rows - such as those containing null values. From what I can tell, the existing
Connection.copy_to_table
helper doesn't support this, and I'd need to drop down to private API (the underlyingConnection._protocol._copy_in
machinery)I'd appreciate some publicly exposed way of doing this - the simplest approach that comes to mind would be adding a
where
kwarg toConnection.copy_to_table
as well as theConnection._format_copy_opts
method. Usage could then look something along the lines ofI'm not sure if this is satisfactory, though. There would also need to be some consideration regarding earlier Postgres versions that don't support
COPY ... FROM
using aWHERE
clause.Please let me know your thoughts - I'm happy to open a PR for implementation if necessary.