basho-labs / riak-data-migrator

Riak logical export and data migration tool (using Java client)

timeout during dumping keys in large bucket #12

Open pallinger opened 10 years ago

pallinger commented 10 years ago

I'm trying to dump data from a bucket for migration:

    java -jar riak-data-migrator-0.2.6.jar -d -r ./ -b BUCKET -h IP -p 8087 -H 8098

The above operation exits with an exception:

Dumping bucket BUCKET
Exception in thread "main" java.lang.RuntimeException: com.basho.riak.pbc.RiakError: timeout
    at com.basho.riak.pbc.RiakStreamClient$1.hasNext(RiakStreamClient.java:100)
    at com.basho.riak.client.raw.pbc.PBClientAdapter$1.hasNext(PBClientAdapter.java:284)
    at com.basho.proserv.datamigrator.io.AbstractKeyJournal.populate(AbstractKeyJournal.java:56)
    at com.basho.proserv.datamigrator.BucketDumper.dumpBucket(BucketDumper.java:157)
    at com.basho.proserv.datamigrator.BucketDumper.dumpBuckets(BucketDumper.java:99)
    at com.basho.proserv.datamigrator.Main.runDumper(Main.java:470)
    at com.basho.proserv.datamigrator.Main.main(Main.java:116)
Caused by: com.basho.riak.pbc.RiakError: timeout
    at com.basho.riak.pbc.RiakConnection.receive(RiakConnection.java:125)
    at com.basho.riak.pbc.KeySource.get_next_response(KeySource.java:80)
    at com.basho.riak.pbc.KeySource.hasNext(KeySource.java:46)
    at com.basho.riak.pbc.RiakStreamClient$1.hasNext(RiakStreamClient.java:98)
    ... 6 more

Before failing, it writes a significant number of keys to BUCKET/bucketkeys.keys (about 1.3M of 2.2M keys).

dmitrizagidulin commented 10 years ago

@pallinger - what version of Riak are you using? Unfortunately, earlier versions of Riak (1.2 and below, at the very least) have a hardcoded 60-second timeout built into the List Keys and List Buckets operations.

pallinger commented 10 years ago

I'm using 1.4.8. I can actually dump most keys using curl (sometimes it omits some keys, and sometimes it hangs, but it mostly works). If I query keys using range queries (I am using leveldb), then I can always get all keys (or at least get a consistent answer :) ).
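
For reference, the range-query approach could look roughly like the sketch below. It is only an illustration, not what riak-data-migrator does: it assumes the HTTP interface on port 8098, the special `$key` secondary index (available with the leveldb backend), and the `max_results` / `continuation` pagination parameters that I believe were introduced in Riak 1.4. The function names, the host name, and the `"0"` to `"z"` key range are hypothetical; the range has to be chosen so that it covers the bucket's real key names.

```python
import json
import urllib.parse
import urllib.request


def fetch_json(url):
    """Plain HTTP GET that decodes a JSON response body."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


def list_keys_by_range(base_url, bucket, start, end,
                       page_size=1000, fetch=fetch_json):
    """Yield keys between start and end via paginated $key index queries.

    Assumes each response is a JSON object with a "keys" list and, while
    more pages remain, a "continuation" token to send with the next request.
    """
    path = "/buckets/{}/index/$key/{}/{}".format(
        urllib.parse.quote(bucket, safe=""),
        urllib.parse.quote(start, safe=""),
        urllib.parse.quote(end, safe=""),
    )
    continuation = None
    while True:
        params = {"max_results": page_size}
        if continuation:
            params["continuation"] = continuation
        page = fetch(base_url + path + "?" + urllib.parse.urlencode(params))
        for key in page.get("keys", []):
            yield key
        # No continuation token means this was the last page.
        continuation = page.get("continuation")
        if not continuation:
            break


if __name__ == "__main__":
    # Hypothetical host and key range; adjust both to the real cluster.
    for key in list_keys_by_range("http://riak.example:8098", "BUCKET", "0", "z"):
        print(key)
```

Each request here returns at most `page_size` keys, so no single call has to stream the whole bucket the way the list-keys operation in the stack trace does.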