GraphChi / graphchi-cpp

GraphChi's C++ version. Big Data - small machine.
https://www.usenix.org/system/files/conference/osdi12/osdi12-final-126.pdf
803 stars 312 forks source link

Unclear error message when system is out of file descriptors #10

Open dbickson opened 10 years ago

dbickson commented 10 years ago

./toolkits/collaborative_filtering/itemcf --distance=4 --training=dataset --K=350 --quiet=0 --nshards=1 --clean_cache=1 --min_allowed_intersection=48 --allow_zeros=1 WARNING: common.hpp(print_copyright:195): GraphChi Collaborative filtering library is written by Danny Bickson (c). Send any comments or bug reports to danny.bickson@gmail.com [distance] => [4] [training] => [dataset] [K] => [350] [quiet] => [0] [nshards] => [1] [clean_cache] => [1] [min_allowed_intersection] => [48] [allow_zeros] => [1] WARNING: chifilenames.hpp(find_shards:271): Could not find shards with nshards = 1 WARNING: chifilenames.hpp(find_shards:272): Please define 'nshards 0' or 'nshards auto' to automatically detect. INFO: sharder.hpp(start_preprocessing:370): Starting preprocessing, shovel size: 17476266 INFO: io.hpp(compute_matrix_size:136): Starting to read matrix-market input. Matrix dimensions: 5466961 x 15325, non-zeros: 125225326 INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.0.shovel, max:5475672 INFO: sharder.hpp(flush:193): Sort done.dataset4.0.shovel INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.1.shovel, max:5477316 INFO: sharder.hpp(flush:193): Sort done.dataset4.1.shovel INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.2.shovel, max:5477394 INFO: sharder.hpp(flush:193): Sort done.dataset4.2.shovel INFO: sharder.hpp(flush_shovel:407): Too many outstanding shoveling threads... INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.3.shovel, max:5479571 INFO: sharder.hpp(flush:193): Sort done.dataset4.3.shovel INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.4.shovel, max:5481431 INFO: sharder.hpp(flush:193): Sort done.dataset4.4.shovel INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.5.shovel, max:5481431 INFO: sharder.hpp(flush:193): Sort done.dataset4.5.shovel INFO: sharder.hpp(flush_shovel:407): Too many outstanding shoveling threads... INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.6.shovel, max:5481997 INFO: sharder.hpp(flush:193): Sort done.dataset4.6.shovel INFO: io.hpp(convert_matrixmarket:564): Global mean is: 317.101 Now creating shards. INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.7.shovel, max:5482284 INFO: sharder.hpp(flush:193): Sort done.dataset4.7.shovel INFO: sharder.hpp(flush_shovel:401): Waiting shoveling threads... INFO: io.hpp(convert_matrixmarket:578): Now creating shards. INFO: sharder.hpp(determine_number_of_shards:586): Number of shards to be created: 1 INFO: sharder.hpp(write_shards:967): Edges per shard: 125225327 nshards=1 total: 125225326 INFO: sharder.hpp(write_shards:973): Buffer size in merge phase: 52428804 INFO: sharder.hpp(finish_shard:625): Starting final processing for shard: 0 DEBUG: sharder.hpp(finish_shard:636): Shovel size:1502703912 edges: 125225326 DEBUG: sharder.hpp(finish_shard:702): 0 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 10000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 20000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 30000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 40000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 50000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 60000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 70000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 80000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 90000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 100000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 110000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 120000000 / 125225326 INFO: sharder.hpp(createnextshard:887): Remaining edges: 0 remaining shards:0 edges per shard=125225327 INFO: sharder.hpp(createnextshard:890): Edges per shard: 125225327

=== REPORT FOR sharder() === [Timings] edata_flush: 0.352861s (count: 478, min: 0.000675s, max: 0.001248, avg: 0.000738203s) execute_sharding: 21.844 s finish_shard.sort: 9.24113 s preprocessing: 55.024 s shard_final: 14.7475 s [Other] app: sharder INFO: sharder.hpp(done:903): Created 1 shards, for 125225326 edgesSuccessfully finished sharding for dataset INFO: io.hpp(convert_matrixmarket:583): Created 1 shards. INFO: itemcf.cpp(main:514): M = 5466961 DEBUG: stripedio.hpp(stripedio:271): Start io-manager with 2 threads. INFO: graphchi_engine.hpp(graphchi_engine:154): Initializing graphchi_engine. This engine expects 4-byte edge data. INFO: chifilenames.hpp(load_vertex_intervals:400): shard: 0 - 5482284 INFO: graphchi_engine.hpp(run:737): GraphChi starting INFO: graphchi_engine.hpp(run:738): Licensed under the Apache License 2.0 INFO: graphchi_engine.hpp(run:739): Copyright Aapo Kyrola et al., Carnegie Mellon University (2012) DEBUG: slidingshard.hpp(sliding_shard:213): Total edge data size: 500901304, dataset.edata.e4B.0_1sizeof(ET): 4 INFO: graphchi_engine.hpp(print_config:132): Engine configuration: INFO: graphchi_engine.hpp(print_config:133): exec_threads = 24 INFO: graphchi_engine.hpp(print_config:134): load_threads = 4 INFO: graphchi_engine.hpp(print_config:135): membudget_mb = 800 INFO: graphchi_engine.hpp(print_config:136): blocksize = 1048576 INFO: graphchi_engine.hpp(print_config:137): scheduler = 1 INFO: graphchi_engine.hpp(run:773): Start iteration: 0 INFO: graphchi_engine.hpp(run:852): 0.064265s: Starting: 0 -- 5482284 INFO: graphchi_engine.hpp(run:865): Iteration 0/5, subinterval: 0 - 5482284 DEBUG: graphchi_engine.hpp(run:880): Allocation 5482285 vertices, sizeof:64 total:350866240 ERROR: memoryshard.hpp(load_edata:320): Did not find block dataset.edata.e4B.0_1_blockdir_1048576/329 ERROR: memoryshard.hpp(load_edata:321): Going to exit... DEBUG: memoryshard.hpp(load_edata:327): Compressed/full size: 0.688722 number of blocks: 478 itemcf: ../../src/shards/memoryshard.hpp:329: void graphchi::memory_shard<VT, ET, svertex_t>::load_edata() [with VT = unsigned int; ET = unsigned int; svertex_t = graphchi::graphchi_vertex<unsigned int, unsigned int>]: Assertion `blockid == nblocks' failed. Aborted (core dumped) tarjan:~/graphchi-cpp

ulimit -n 1024

tarjan:~/graphchi-cpp

ulimit -n 50000 tarjan:~/graphchi-cpp

NOW THE SAME CODE WORKS!

./toolkits/collaborative_filtering/itemcf --distance=4 --training=dataset --K=350 --quiet=0 --nshards=1 --clean_cache=1 --min_allowed_intersection=48 --allow_zeros=1 WARNING: common.hpp(print_copyright:195): GraphChi Collaborative filtering library is written by Danny Bickson (c). Send any comments or bug reports to danny.bickson@gmail.com [distance] => [4] [training] => [dataset] [K] => [350] [quiet] => [0] [nshards] => [1] [clean_cache] => [1] [min_allowed_intersection] => [48] [allow_zeros] => [1] WARNING: chifilenames.hpp(find_shards:271): Could not find shards with nshards = 1 WARNING: chifilenames.hpp(find_shards:272): Please define 'nshards 0' or 'nshards auto' to automatically detect. INFO: sharder.hpp(start_preprocessing:370): Starting preprocessing, shovel size: 17476266 INFO: io.hpp(compute_matrix_size:136): Starting to read matrix-market input. Matrix dimensions: 5466961 x 15325, non-zeros: 125225326 INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.0.shovel, max:5475672 INFO: sharder.hpp(flush:193): Sort done.dataset4.0.shovel INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.1.shovel, max:5477316 INFO: sharder.hpp(flush:193): Sort done.dataset4.1.shovel INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.2.shovel, max:5477394 INFO: sharder.hpp(flush:193): Sort done.dataset4.2.shovel INFO: sharder.hpp(flush_shovel:407): Too many outstanding shoveling threads... INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.3.shovel, max:5479571 INFO: sharder.hpp(flush:193): Sort done.dataset4.3.shovel INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.4.shovel, max:5481431 INFO: sharder.hpp(flush:193): Sort done.dataset4.4.shovel INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.5.shovel, max:5481431 INFO: sharder.hpp(flush:193): Sort done.dataset4.5.shovel INFO: sharder.hpp(flush_shovel:407): Too many outstanding shoveling threads... INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.6.shovel, max:5481997 INFO: sharder.hpp(flush:193): Sort done.dataset4.6.shovel INFO: io.hpp(convert_matrixmarket:564): Global mean is: 317.101 Now creating shards. INFO: sharder.hpp(flush:191): Sorting shovel: dataset4.7.shovel, max:5482284 INFO: sharder.hpp(flush:193): Sort done.dataset4.7.shovel INFO: sharder.hpp(flush_shovel:401): Waiting shoveling threads... INFO: io.hpp(convert_matrixmarket:578): Now creating shards. INFO: sharder.hpp(determine_number_of_shards:586): Number of shards to be created: 1 INFO: sharder.hpp(write_shards:967): Edges per shard: 125225327 nshards=1 total: 125225326 INFO: sharder.hpp(write_shards:973): Buffer size in merge phase: 52428804 INFO: sharder.hpp(finish_shard:625): Starting final processing for shard: 0 DEBUG: sharder.hpp(finish_shard:636): Shovel size:1502703912 edges: 125225326 DEBUG: sharder.hpp(finish_shard:702): 0 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 10000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 20000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 30000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 40000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 50000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 60000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 70000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 80000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 90000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 100000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 110000000 / 125225326 DEBUG: sharder.hpp(finish_shard:702): 120000000 / 125225326 INFO: sharder.hpp(createnextshard:887): Remaining edges: 0 remaining shards:0 edges per shard=125225327 INFO: sharder.hpp(createnextshard:890): Edges per shard: 125225327

=== REPORT FOR sharder() === [Timings] edata_flush: 0.343608s (count: 478, min: 0.000667s, max: 0.000944, avg: 0.000718845s) execute_sharding: 21.5186 s finish_shard.sort: 9.32435 s preprocessing: 53.9705 s shard_final: 14.3798 s [Other] app: sharder INFO: sharder.hpp(done:903): Created 1 shards, for 125225326 edgesSuccessfully finished sharding for dataset INFO: io.hpp(convert_matrixmarket:583): Created 1 shards. INFO: itemcf.cpp(main:514): M = 5466961 DEBUG: stripedio.hpp(stripedio:271): Start io-manager with 2 threads. INFO: graphchi_engine.hpp(graphchi_engine:154): Initializing graphchi_engine. This engine expects 4-byte edge data. INFO: chifilenames.hpp(load_vertex_intervals:400): shard: 0 - 5482284 INFO: graphchi_engine.hpp(run:737): GraphChi starting INFO: graphchi_engine.hpp(run:738): Licensed under the Apache License 2.0 INFO: graphchi_engine.hpp(run:739): Copyright Aapo Kyrola et al., Carnegie Mellon University (2012) DEBUG: slidingshard.hpp(sliding_shard:213): Total edge data size: 500901304, dataset.edata.e4B.0_1sizeof(ET): 4 INFO: graphchi_engine.hpp(print_config:132): Engine configuration: INFO: graphchi_engine.hpp(print_config:133): exec_threads = 24 INFO: graphchi_engine.hpp(print_config:134): load_threads = 4 INFO: graphchi_engine.hpp(print_config:135): membudget_mb = 800 INFO: graphchi_engine.hpp(print_config:136): blocksize = 1048576 INFO: graphchi_engine.hpp(print_config:137): scheduler = 1 INFO: graphchi_engine.hpp(run:773): Start iteration: 0 INFO: graphchi_engine.hpp(run:852): 0.17536s: Starting: 0 -- 5482284 INFO: graphchi_engine.hpp(run:865): Iteration 0/5, subinterval: 0 - 5482284 DEBUG: graphchi_engine.hpp(run:880): Allocation 5482285 vertices, sizeof:64 total:350866240 DEBUG: memoryshard.hpp(load_edata:327): Compressed/full size: 1 number of blocks: 478 INFO: graphchi_engine.hpp(run:889): Start updates INFO: graphchi_engine.hpp(exec_updates_inmemory_mode:470): In-memory mode: Iteration 0 starts. (6.66187 secs) entering iteration: 0 on before_exec_interval pivot_st is 5466961 window_en 5482285 DEBUG: itemcf.cpp(before_exec_interval:464): Window init, grabbed: 0 edges extending pivor_range to : 5482286 DEBUG: itemcf.cpp(before_exec_interval:466): Window en is: 5482285 vertices: 5482285 INFO: graphchi_engine.hpp(exec_updates_inmemory_mode:470): In-memory mode: Iteration 1 starts. (8.83943 secs)

tinylamb commented 9 years ago

In most cases this error happens when you are out of file descriptors. Try to increase the number of open file descriptors using the command "ulimit -n 10000” in the same terminal before running graphchi