citusdata / pg_shard

ATTENTION: pg_shard is superseded by Citus, its more powerful replacement
https://github.com/citusdata/citus
GNU Lesser General Public License v3.0
1.06k stars 63 forks source link

Large, unbounded queries cause crashes #77

Closed jasonmp85 closed 9 years ago

jasonmp85 commented 9 years ago

In #70, @jberkus says:

BTW, right now if you try a simple limit query like the above, you get a client termination because it's trying to drag the entire database from all shards into memory:

psocial=# select * from members limit 10;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Time: 101571.258 ms
!> 
!> \q

pg_shard does not try to drag the entire database into memory: it pulls data from each shard into a tuplestore and then loads that into a temporary table (see #32 for some discussion of this). So something else is causing this crash.

Could be related to the data set size in relation to work memory, perhaps. Whatever the case we should figure it out and decide whether to get a hotfix in for this release.

jasonmp85 commented 9 years ago

Optimistically putting this in the v1.1 milestone in the hope we'll find a quick fix.

jberkus commented 9 years ago

Could it be as simple as the temp table exceeding configured limits?

jasonmp85 commented 9 years ago

This looks like a memory leak. It's less about the number of rows or shards and more about the total amount of data being processed. I was able to get pretty high memory use on my machine (> 10GB) out of 50M rows or so, but the rows were just UUIDs and integers. I'm assuming your rows were JSON and probably had much more data.

I think we have a fix for this, going to get @onderkalaci to look into the root cause a bit more.

onderkalaci commented 9 years ago

@jberkus is postgres crashing by itself, or some other process kills it because of high memory usage? I could observe high memory usage, but not the crash.

jasonmp85 commented 9 years ago

If Linux, possibly the OOM killer.

I tried to get OS X to crash it before it used all my system memory, but ulimit -v doesn't do what you'd want on OS X apparently.

—Jason

On Wed, Mar 4, 2015 at 11:34 AM, Önder Kalacı notifications@github.com wrote:

@jberkus https://github.com/jberkus is postgres crashing by itself, or some other process kills it because of high memory usage? I could observe high memory usage, but not the crash.

— Reply to this email directly or view it on GitHub https://github.com/citusdata/pg_shard/issues/77#issuecomment-77216944.