gnazarov-kabam closed this 8 years ago
Thanks for the report. Could you post the error messages? Without them, it's very unlikely that I will be able to debug this.
Thanks for even looking into it :)! I realized that the error message would have been useful, but I was already away from my computer at the time.
The error occurs at around the 2 hour 12 minute mark on my system, while the manifest had already been created at around the 1:30 mark. Obviously the unload timing depends on table size. I'm also worried that this could cause issues with tables that take longer than 2 hours to unload from Redshift.
Here's the error it times out with:
I, [2016-07-18T14:55:54.082260 #16299] INFO -- : Unloading Redshift table shenqu_counter to s3://bigshift/bucket/table/etc/etchere
/usr/local/share/gems/gems/bigshift-0.3.1/lib/bigshift/redshift_unloader.rb:24:in `exec': SSL SYSCALL error: Connection timed out (PG::UnableToSend)
from /usr/local/share/gems/gems/bigshift-0.3.1/lib/bigshift/redshift_unloader.rb:24:in `unload_to'
from /usr/local/share/gems/gems/bigshift-0.3.1/lib/bigshift/cli.rb:50:in `unload'
from /usr/local/share/gems/gems/bigshift-0.3.1/lib/bigshift/cli.rb:28:in `run'
from /usr/local/share/gems/gems/bigshift-0.3.1/bin/bigshift:6:in `<top (required)>'
from /usr/local/bin/bigshift:23:in `load'
from /usr/local/bin/bigshift:23:in `<main>'
I, [2016-07-18T17:07:31.079978 #21429] INFO -- : Unloading Redshift table shenqu_economy to
Ooh, after some more digging I've stumbled onto this: http://stackoverflow.com/questions/26290382/long-running-redshift-transaction-from-ruby which I think should hold the answer.
Looks like it could be the same issue. I'm on vacation at the moment, so I'll have to see when I've got time to implement the workaround. Looks easy enough though.
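For reference, here is a minimal sketch of what such a workaround might look like. It assumes the timeout comes from an idle TCP connection being dropped while the UNLOAD runs, and that the connection is opened with the pg gem. The keepalive option names are standard libpq connection parameters, but the helper name `with_keepalives`, the timing values, and the host and database names are hypothetical and not taken from bigshift itself.

```ruby
# Illustrative keepalive defaults (values are assumptions, tune for your network).
# The keys are libpq connection parameters, which the pg gem passes through.
KEEPALIVE_OPTIONS = {
  keepalives: 1,            # enable TCP keepalives on the client socket
  keepalives_idle: 60,      # seconds of inactivity before the first probe
  keepalives_interval: 60,  # seconds between probes
  keepalives_count: 5       # unanswered probes before the connection is considered dead
}.freeze

# Hypothetical helper: merges the keepalive defaults into the caller's
# connection options. Options given by the caller take precedence.
def with_keepalives(connection_options)
  KEEPALIVE_OPTIONS.merge(connection_options)
end

# Placeholder host and database names, for illustration only.
options = with_keepalives(host: 'example-cluster.redshift.amazonaws.com', port: 5439, dbname: 'example_db')

# With the pg gem installed and a reachable cluster, the connection
# would then be opened like this:
#   require 'pg'
#   connection = PG.connect(options)
```

Whether this matches the fix that ended up in the gem is not something this sketch claims; it only shows the general shape of a keepalive-based workaround.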
I've released v0.3.2 with a fix, could you test it out and see if it solves your problem?
Hi.
I've recently transferred about 250GB of data from RS to BQ. Here are some findings that tripped me up a bit.
1) When dealing with tables of about 100 GB+ from RS, bigshift takes an hour or more to unload them, and it also seems to time out on S3. After about 2 hours bigshift reports a full S3 timeout, even though the table finished unloading about 30 minutes before that. (Unload time varies by table size; the bigshift timeout seems to be exactly the same each time.)
2) I have a table with 2 records in RS. Intermittently bigshift has issues transferring it on the GC side. Most of the time the transfer doesn't finish and gets stuck; sometimes it reports that not all files were transferred. I'm not sure about that one. It may just be a bug on the Google side.
Using CentOS 7, bigshift 0.3.1, ruby 2.0.0p598 (2014-11-13) [x86_64-linux]