attic-labs / noms

The versioned, forkable, syncable database
Apache License 2.0

Panic during compression of large db #3767

Closed willhite closed 6 years ago

willhite commented 6 years ago

We're trying to get 1 billion rows of taxi data into noms on AWS. We're at 397,500,000 rows so far, committing after every 2.5 million rows. We've hit a point where we can't proceed because of an error during compaction. Here are the steps to repro:

# noms and csv-import are built on this machine from source code that is on this
# machine at git rev 69759ba.
# I believe one local mod has been made to the source to update the aws library
ssh -i ~/.ssh/attic-keypair.pem ec2-user@52.36.231.163
# you should be able to repro the problem using this command. The script extracts
# rows from compressed csv files and splits them into 2.5 million row files and calls
# csv-import on them.
# -s 1000 -- don't limit the rows
# -k 397500000 -- skip the first n rows
# -a true -- append new rows to head of dataset
# nyctaxidata -- alias to aws://bucketdb-manifests:bucketdb-tables/p/nyctaxidata
bash -x ./noms-import.sh -s 1000 -k 397500000 -a true nyctaxidata::trips1g; date

Here's the stack trace that happens after about 45 minutes:

csv-import --header=id,vendor_id,pickup_datetime,dropoff_datetime,store_and_fwd_flag,rate_code_id,pickup_longitude,pickup_latitude,dropoff_longitude,dropoff_latitude,passenger_count,trip_distance,fare_amount,extra,mta_tax,tip_amount,tolls_amount,ehail_fee,improvement_surcharge,total_amount,payment_type,trip_type,pickup,dropoff,cab_type,rain,snow_depth,snowfall,max_temp,min_temp,wind,pickup_nyct2010_gid,pickup_ctlabel,pickup_borocode,pickup_boroname,pickup_ct2010,pickup_boroct2010,pickup_cdeligibil,pickup_ntacode,pickup_ntaname,pickup_puma,dropoff_nyct2010_gid,dropoff_ctlabel,dropoff_borocode,dropoff_boroname,dropoff_ct2010,dropoff_boroct2010,dropoff_cdeligibil,dropoff_ntacode,dropoff_ntaname,dropoff_puma --column-types=Number,String,String,String,String,String,Number,Number,Number,Number,Number,Number,Number,Number,Number,Number,Number,Number,Number,Number,String,String,String,String,String,Number,Number,Number,Number,Number,Number,Number,Number,Number,String,Number,Number,String,String,String,Number,Number,Number,Number,String,Number,Number,String,String,String,Number --invert --append=true /tmp/csv-import/csv-import-ah.csv nyctaxidata::trips1g
        Error Trace:    try.go:99ic:
                        try.go:44
                        aws_table_persister.go:268
                        aws_table_persister.go:242
        Error:          RequestError: send request failed
                        caused by: Put https://bucketdb-tables.s3-us-west-2.amazonaws.com/rddt8b5mcgd1939h1oe1j97fplv7oiol?partNumber=1019&uploadId=vp12FEa85tfcS06Ia_bSZDpoqoCqRofnGqs1ro3u9_7mjfFGvli6y2hJzM4RXOz.e9BNd70qIIAECbQggYtBfHPvc9DukYv7I4sgbG85P6voVfm9dUwMguABiUrqIha2za_INqz3950Hzyw5DjJmNmU.fLimo7LKwC2jGTy0a8w-: dial tcp: lookup bucketdb-tables.s3-us-west-2.amazonaws.com on 172.31.0.2:53: dial udp 172.31.0.2:53: socket: too many open files
goroutine 1 [running]:
github.com/attic-labs/noms/go/d.PanicIfError(0x7fb2672e4d30, 0xc46ca0be80)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/d/try.go:44 +0x65
github.com/attic-labs/noms/go/nbs.awsTablePersister.executeCompactionPlan(0x17f74c0, 0xc42000e158, 0xc4201c3659, 0xf, 0xc42006fe60, 0x0, 0x0, 0xc4200a99e0, 0x500000, 0x500000, ...)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/nbs/aws_table_persister.go:268 +0x227
github.com/attic-labs/noms/go/nbs.awsTablePersister.ConjoinAll(0x17f74c0, 0xc42000e158, 0xc4201c3659, 0xf, 0xc42006fe60, 0x0, 0x0, 0xc4200a99e0, 0x500000, 0x500000, ...)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/nbs/aws_table_persister.go:242 +0x1db
github.com/attic-labs/noms/go/nbs.(*awsTablePersister).ConjoinAll(0xc4204faf50, 0xc4ced66000, 0xfd, 0xfd, 0xc4201c6000, 0xfd, 0xc4ced66000)
        <autogenerated>:106 +0xae
github.com/attic-labs/noms/go/nbs.conjoinTables(0x17ef5c0, 0xc4204faf50, 0xc4201b5800, 0xfd, 0xfd, 0xc4201c6000, 0x0, 0x0, 0x0, 0x6cd7662fa529e966, ...)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/nbs/conjoiner.go:108 +0x2c8
github.com/attic-labs/noms/go/nbs.conjoin(0xc420378178, 0x4, 0x41a87cd4f0392ece, 0x315dd4d312d21b53, 0x482e2d8330f4b893, 0x2aa574287a919531, 0x8b0b602111872e8b, 0xc4201b5800, 0xfd, 0xfd, ...)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/nbs/conjoiner.go:50 +0x832
github.com/attic-labs/noms/go/nbs.inlineConjoiner.Conjoin(0x100, 0xc420378178, 0x4, 0x41a87cd4f0392ece, 0x315dd4d312d21b53, 0x482e2d8330f4b893, 0x2aa574287a919531, 0x8b0b602111872e8b, 0xc4201b5800, 0xfd, ...)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/nbs/conjoiner.go:41 +0xb8
github.com/attic-labs/noms/go/nbs.(*inlineConjoiner).Conjoin(0xc420123838, 0xc420378178, 0x4, 0x41a87cd4f0392ece, 0x315dd4d312d21b53, 0x482e2d8330f4b893, 0x2aa574287a919531, 0x8b0b602111872e8b, 0xc4201b5800, 0xfd, ...)
        <autogenerated>:173 +0xeb
github.com/attic-labs/noms/go/nbs.(*NomsBlockStore).updateManifest(0xc4201a4b40, 0xc9526034ef8ab9e, 0x3c6f529cd0b0251c, 0x482e2d837f39467a, 0x2aa574287a919531, 0x8b0b602111872e8b, 0x0, 0x0)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/nbs/store.go:439 +0x326
github.com/attic-labs/noms/go/nbs.(*NomsBlockStore).Commit(0xc4201a4b40, 0xc9526034ef8ab9e, 0x3c6f529cd0b0251c, 0x482e2d837f39467a, 0x2aa574287a919531, 0x8b0b602111872e8b, 0x800000)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/nbs/store.go:397 +0x1a1
github.com/attic-labs/noms/go/types.(*ValueStore).Commit.func1(0xc42014c4d0, 0xc9526034ef8ab9e, 0x3c6f529cd0b0251c, 0x482e2d837f39467a, 0x2aa574287a919531, 0x8b0b602111872e8b, 0x0)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/types/value_store.go:361 +0x437
github.com/attic-labs/noms/go/types.(*ValueStore).Commit(0xc42014c4d0, 0xc9526034ef8ab9e, 0x3c6f529cd0b0251c, 0x482e2d837f39467a, 0x2aa574287a919531, 0x8b0b602111872e8b, 0x4)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/types/value_store.go:370 +0x53
github.com/attic-labs/noms/go/datas.(*database).tryCommitChunks(0xc42036cc80, 0x17fca40, 0xc42026aa80, 0x7a919531482e2d83, 0x11872e8b2aa57428, 0xc48b0b6021, 0x4, 0x4)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/datas/database_common.go:212 +0x137
github.com/attic-labs/noms/go/datas.(*database).doCommit(0xc42036cc80, 0xc4201c3638, 0x7, 0x17ef580, 0xc42036cc80, 0xc46f417000, 0xbde, 0x1000, 0x0, 0x0, ...)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/datas/database_common.go:172 +0x472
github.com/attic-labs/noms/go/datas.(*database).Commit.func1(0x17fb0e0, 0xc42036cc80, 0xc4201c3638, 0x7, 0x17f8dc0, 0xc421926080, 0x0, 0x0)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/datas/database_common.go:121 +0x162
github.com/attic-labs/noms/go/datas.(*database).doHeadUpdate(0xc42036cc80, 0x17fb0e0, 0xc42036cc80, 0xc4201c3638, 0x7, 0x17f8dc0, 0xc421926080, 0xc497da3580, 0x0, 0x0, ...)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/datas/database_common.go:247 +0x7c
github.com/attic-labs/noms/go/datas.(*database).Commit(0xc42036cc80, 0x17fb0e0, 0xc42036cc80, 0xc4201c3638, 0x7, 0x17f8dc0, 0xc421926080, 0x17f8dc0, 0xc42404f040, 0x0, ...)
        /home/ec2-user/go/src/github.com/attic-labs/noms/go/datas/database_common.go:122 +0x127
main.main()
        /home/ec2-user/go/src/github.com/attic-labs/noms/samples/go/csv/csv-import/importer.go:222 +0x108c
+ exitcode=2
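
The last line of the error, `socket: too many open files`, is the operating system refusing to hand out another file descriptor (EMFILE); here it happened while opening a UDP socket for a DNS lookup. One way to see the limit a process runs under is to read `RLIMIT_NOFILE`. The following is a minimal, Unix-only sketch using the standard `syscall` package. The helper name `fdLimits` is made up for illustration and none of this is noms code.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// fdLimits returns the soft and hard limits on open file descriptors
// (RLIMIT_NOFILE) for the current process. Hypothetical helper, Unix-only.
func fdLimits() (soft, hard uint64, err error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, 0, err
	}
	// The field types differ between platforms, so convert explicitly.
	return uint64(lim.Cur), uint64(lim.Max), nil
}

func main() {
	soft, hard, err := fdLimits()
	if err != nil {
		fmt.Fprintln(os.Stderr, "getrlimit:", err)
		os.Exit(1)
	}
	fmt.Printf("open-file limit: soft=%d hard=%d\n", soft, hard)
}
```

If the soft limit turns out to be low, raising it with `ulimit -n` in the shell that launches csv-import is one thing to try. Whether that alone is enough for an import of this size is not established in this thread.
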
cmasone-attic commented 6 years ago

too many open files. Hm.

On Fri, Oct 20, 2017 at 9:19 AM Dan Willhite notifications@github.com wrote:

Assigned #3767 https://github.com/attic-labs/noms/issues/3767 to @cmasone-attic https://github.com/cmasone-attic.
