citusdata / pg_shard

ATTENTION: pg_shard is superseded by Citus, its more powerful replacement
https://github.com/citusdata/citus

Generalize COPY-to-INSERT trigger #61

Closed jasonmp85 closed 9 years ago

jasonmp85 commented 9 years ago

I've written a COPY trigger, used by some of our customers, that turns a COPY command into many INSERT commands by way of a temporary table with the same schema as the sharded table.

Unfortunately it's hardcoded to a specific schema. The first step towards users being able to COPY to a sharded table is to generalize this script for any table.
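
A minimal sketch of the idea; the table, function, trigger names, and file path here are illustrative, not taken from the actual customer script:

    -- Staging table with the same shape as the sharded table (names illustrative).
    CREATE TEMPORARY TABLE my_table_staging (LIKE my_table INCLUDING DEFAULTS);

    -- Trigger function that re-issues each staged row as a plain INSERT,
    -- which pg_shard then routes to the appropriate shard.
    CREATE OR REPLACE FUNCTION my_table_staging_insert() RETURNS trigger AS $$
    BEGIN
        INSERT INTO my_table VALUES (NEW.*);
        RETURN NULL;  -- suppress the row so the staging table stays empty
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER my_table_staging_trigger
        BEFORE INSERT ON my_table_staging
        FOR EACH ROW EXECUTE PROCEDURE my_table_staging_insert();

    -- COPY targets the staging table; the trigger fans the rows out as INSERTs.
    COPY my_table_staging FROM '/path/to/data.csv' WITH (FORMAT csv);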

rsolari commented 9 years ago

I just wanted to note that there is a concurrency bug in the version of the script that I have been using.

The problem is that the function catalog, pg_proc, isn't MVCC-safe. When we run CREATE OR REPLACE FUNCTION, we invalidate the cached copies of all of the old versions of the function, and in-flight copies fail.

This problem only occurs when we issue CREATE OR REPLACE FUNCTION calls, so if you create the function once, you'll be fine. I just wanted to make sure you're aware of the concurrency issue before you productize the trigger.
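
A small sketch of the "create it once" point, reusing the illustrative names from the sketch above; the existence check (to_regproc needs PostgreSQL 9.4 or newer) avoids re-running CREATE OR REPLACE from every loader session:

    -- Only install the trigger function if it is missing, so concurrent
    -- sessions never have a cached copy invalidated by a needless REPLACE.
    DO $do$
    BEGIN
        IF to_regproc('my_table_staging_insert') IS NULL THEN
            EXECUTE $def$
                CREATE FUNCTION my_table_staging_insert() RETURNS trigger AS $body$
                BEGIN
                    INSERT INTO my_table VALUES (NEW.*);
                    RETURN NULL;
                END;
                $body$ LANGUAGE plpgsql
            $def$;
        END IF;
    END;
    $do$;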

jasonmp85 commented 9 years ago

@rsolari — Does your copy of this script have a call to pg_advisory_lock? It was added to guard the CREATE OR REPLACE call because concurrent modifications caused problems…

Or is this a separate issue? It sounds as though you're saying the REPLACE call trips up in-flight executions of the trigger that were otherwise happy…

rsolari commented 9 years ago

> @rsolari — Does your copy of this script have a call to pg_advisory_lock? It was added to guard the CREATE OR REPLACE call because concurrent modifications caused problems…

Yes, there is a lock around CREATE OR REPLACE.

> Or is this a separate issue? It sounds as though you're saying the REPLACE call trips up in-flight executions of the trigger that were otherwise happy…

Yep, that's what's happening. Each local process' cached version of the function gets invalidated, and in-flight copies fail.
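
For reference, the guard being discussed looks roughly like the following (the lock key and function name are illustrative); it serializes concurrent CREATE OR REPLACE calls against each other, but, as noted above, it does not help sessions that already hold a cached copy of the old function:

    -- Session-level advisory lock around the REPLACE (key 61 is arbitrary).
    SELECT pg_advisory_lock(61);

    CREATE OR REPLACE FUNCTION my_table_staging_insert() RETURNS trigger AS $$
    BEGIN
        INSERT INTO my_table VALUES (NEW.*);
        RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;

    SELECT pg_advisory_unlock(61);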

jasonmp85 commented 9 years ago

@rsolari So the current (short-term) approach is to make parallel use safe while keeping the same failure mode, where "failure" means the COPY failed because of something beyond our control, not because of a bug in parallel access.

I was imagining you could do something like:

  1. Get a huge file to ingest
  2. Split it into n parts
  3. For each part, create a separate writer (COPY) process
  4. Each writer uses COPY to ingest its file
  5. If a file fails partway through (e.g. it ingests only 500 of 1000 lines), skip past the already-ingested lines and retry the remainder
  6. After m retries of a file that has failed, take some other action (raise an error or skip a row?)

Assuming we provide a multiprocess-safe COPY-compatible function that returns the number of rows successfully copied, what are you missing? Is your desired workflow significantly different from the above?
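
The per-writer step (4) could be as simple as the following psql snippet against the staging setup sketched earlier, with an illustrative part path; splitting the input, spawning the writers, and the retry bookkeeping of steps 5 and 6 would live in a driver outside the database:

    \set part_file '/data/ingest/events.part-07'
    COPY my_table_staging FROM :'part_file' WITH (FORMAT text, DELIMITER ',', NULL '\N');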

rsolari commented 9 years ago

That workflow sounds like exactly what we want.

jasonmp85 commented 9 years ago

Hey @rsolari, I know you guys had some issues with the existing script apart from what you've said here, namely its handling of file paths and its limited support for COPY options.

The pull request (#82) I opened has a script that allows relative paths and supports most OPTIONS for COPY. I was wondering, though, whether you also need the ability to explicitly specify which columns are in the input (if, for instance, you want to omit certain columns from your input file). That feature shows up in the COPY syntax as the ( column_name [, ...] ) clause. Do you need support for it right now?
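
For reference, the column-list form being asked about looks like this (table and column names are illustrative):

    COPY my_table (customer_id, order_date, amount) FROM '/data/ingest/orders.txt'
        WITH (FORMAT text, DELIMITER ',', NULL '\N');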

rsolari commented 9 years ago

Thanks for checking in. We don't need support for specifying columns right now.

We only need support for specifying FORMAT as text, the DELIMITER, and the NULL character. Here's our COPY:

COPY my_table FROM :'filename' WITH (FORMAT text, DELIMITER ',', NULL '\N');

I looked over #82, and it looks like all of the things we'd want supported are supported, which is awesome.