kevinkle closed this 4 years ago
```
dgraph bulk -r outputs/samples/ -s dgraph/kmers.schema --map_shards=1 --reduce_shards=1 --http localhost:8001 --zero=localhost:5080
REDUCE 33m38s [99.73%] edge_count:390.6M edge_speed:480.3k/sec plist_count:7.237M plist_speed:8.900k/sec
REDUCE 33m39s [99.92%] edge_count:391.3M edge_speed:480.6k/sec plist_count:7.251M plist_speed:8.906k/sec
badger 2019/07/08 11:53:02 INFO: Storing value log head: {Fid:7 Len:42 Offset:53101009}
REDUCE 33m40s [100.00%] edge_count:391.6M edge_speed:480.4k/sec plist_count:7.258M plist_speed:8.903k/sec
REDUCE 33m41s [100.00%] edge_count:391.6M edge_speed:479.8k/sec plist_count:7.258M plist_speed:8.892k/sec
REDUCE 33m42s [100.00%] edge_count:391.6M edge_speed:479.2k/sec plist_count:7.258M plist_speed:8.882k/sec
badger 2019/07/08 11:53:05 INFO: Force compaction on level 0 done
REDUCE 33m43s [100.00%] edge_count:391.6M edge_speed:478.9k/sec plist_count:7.258M plist_speed:8.876k/sec
Total: 33m43s
```
This run was for 40 genomes.
```
kevin@panther ~/prairiedog> ls -lah out/0/p/
total 2.4G
drwx------ 2 kevin kevin 4.0K Jul 8 11:53 ./
drwx------ 3 kevin kevin 4.0K Jul 8 11:19 ../
-rw-r--r-- 1 kevin kevin 491M Jul 8 11:41 000000.vlog
-rw-r--r-- 1 kevin kevin 491M Jul 8 11:43 000001.vlog
-rw-r--r-- 1 kevin kevin 491M Jul 8 11:45 000002.vlog
-rw-r--r-- 1 kevin kevin 414M Jul 8 11:47 000003.vlog
-rw-r--r-- 1 kevin kevin 81M Jul 8 11:48 000004.vlog
-rw-r--r-- 1 kevin kevin 81M Jul 8 11:50 000005.vlog
-rw-r--r-- 1 kevin kevin 70M Jul 8 11:49 000006.sst
-rw-r--r-- 1 kevin kevin 81M Jul 8 11:52 000006.vlog
-rw-r--r-- 1 kevin kevin 70M Jul 8 11:49 000007.sst
-rw-r--r-- 1 kevin kevin 51M Jul 8 11:53 000007.vlog
-rw-r--r-- 1 kevin kevin 70M Jul 8 11:53 000012.sst
-rw-r--r-- 1 kevin kevin 50M Jul 8 11:53 000013.sst
-rw-r--r-- 1 kevin kevin 212 Jul 8 11:53 MANIFEST
```
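To put the 40-genome run in per-genome terms, here is a rough back-of-the-envelope sketch. The constants are copied from the log and the listing above. The helper name `per_genome` is only illustrative, and a simple average is an assumption that ignores any fixed overhead.

```python
# Figures copied from the bulk loader log and the `ls -lah out/0/p/` listing.
GENOMES = 40
TOTAL_EDGES = 391.6e6           # final edge_count reported by REDUCE
TOTAL_SECONDS = 33 * 60 + 43    # "Total: 33m43s"
OUTPUT_GIB = 2.4                # "total 2.4G" (ls -h uses powers of 1024)


def per_genome(total, genomes=GENOMES):
    """Return the simple average of `total` over the number of genomes."""
    return total / genomes


edges_per_genome = per_genome(TOTAL_EDGES)
seconds_per_genome = per_genome(TOTAL_SECONDS)
output_mib_per_genome = per_genome(OUTPUT_GIB * 1024)

if __name__ == "__main__":
    print(f"edges per genome:      {edges_per_genome / 1e6:.2f}M")
    print(f"bulk seconds / genome: {seconds_per_genome:.1f}")
    print(f"output MiB / genome:   {output_mib_per_genome:.1f}")
```

These are averages only. Whether they hold at larger scale depends on how much of the graph is shared between genomes.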
The intermediate RDF files are fairly large, roughly 667M to 686M per genome:
```
-rw-r--r-- 1 kevin kevin 678M Jul 8 15:17 SRR5573131.fasta.rdf
-rw-r--r-- 1 kevin kevin 676M Jul 8 15:21 SRR5573135.fasta.rdf
-rw-r--r-- 1 kevin kevin 686M Jul 8 14:46 SRR5573137.fasta.rdf
-rw-r--r-- 1 kevin kevin 670M Jul 8 16:34 SRR5573138.fasta.rdf
-rw-r--r-- 1 kevin kevin 667M Jul 8 16:42 SRR5573139.fasta.rdf
-rw-r--r-- 1 kevin kevin 679M Jul 8 15:05 SRR5573142.fasta.rdf
-rw-r--r-- 1 kevin kevin 670M Jul 8 16:07 SRR5573145.fasta.rdf
```
Currently testing with 950 genomes
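For a sense of what 950 genomes might need, here is a naive linear extrapolation from the figures above. It treats the seven listed RDF sizes as representative and assumes bulk-load time scales linearly with genome count. Neither assumption has been verified, so this is an order-of-magnitude guide rather than a prediction. The function names are illustrative.

```python
from statistics import mean

# Sizes (MiB) of the seven intermediate RDF files listed above.
RDF_SIZES_MIB = [678, 676, 686, 670, 667, 679, 670]

# Baseline: the 40-genome bulk load took 33m43s.
BASELINE_GENOMES = 40
BASELINE_SECONDS = 33 * 60 + 43
TARGET_GENOMES = 950


def estimate_rdf_gib(sizes_mib, genomes):
    """Estimate total RDF size in GiB, assuming the average file size holds."""
    return mean(sizes_mib) * genomes / 1024


def estimate_runtime_hours(baseline_seconds, baseline_genomes, genomes):
    """Estimate bulk-load time in hours, assuming linear scaling (unverified)."""
    return baseline_seconds / baseline_genomes * genomes / 3600


if __name__ == "__main__":
    rdf_gib = estimate_rdf_gib(RDF_SIZES_MIB, TARGET_GENOMES)
    hours = estimate_runtime_hours(BASELINE_SECONDS, BASELINE_GENOMES, TARGET_GENOMES)
    print(f"estimated RDF input for {TARGET_GENOMES} genomes: {rdf_gib:.0f} GiB")
    print(f"estimated bulk-load time (linear): {hours:.1f} h")
```

The RDF estimate covers only the input files. Scratch space used by the bulk loader comes on top of that.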
When running bulk, `tmp/` in the working directory needs to be mapped to a larger disk.
Note that `dgraph bulk` deletes `tmp/` before starting.
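One possible way to put the scratch space on a larger disk is sketched below. It assumes the installed `dgraph bulk` accepts a `--tmp` flag, which is worth confirming with `dgraph bulk --help`. The path `/mnt/bigdisk/dgraph-bulk-tmp` and both helper functions are hypothetical. The script only builds and prints the command and does not run it.

```python
import shlex
import shutil
from pathlib import Path


def build_bulk_command(rdf_dir, schema, tmp_dir,
                       http="localhost:8001", zero="localhost:5080"):
    """Build the argv for the bulk load used above, plus a scratch directory.

    `--tmp` is assumed to be supported by the installed dgraph version.
    """
    return [
        "dgraph", "bulk",
        "-r", str(rdf_dir),
        "-s", str(schema),
        "--map_shards=1",
        "--reduce_shards=1",
        "--http", http,
        f"--zero={zero}",
        f"--tmp={tmp_dir}",
    ]


def free_gib(path):
    """Free space (GiB) on the filesystem holding `path` or its nearest existing parent."""
    p = Path(path).resolve()
    while not p.exists():
        p = p.parent
    return shutil.disk_usage(p).free / 1024**3


if __name__ == "__main__":
    # Hypothetical dedicated directory: since bulk deletes its tmp directory
    # before starting, it should not contain anything else.
    tmp_dir = Path("/mnt/bigdisk/dgraph-bulk-tmp")
    cmd = build_bulk_command("outputs/samples/", "dgraph/kmers.schema", tmp_dir)
    print(shlex.join(cmd))
    print(f"free space near {tmp_dir}: {free_gib(tmp_dir):.1f} GiB")
```

Because bulk deletes the directory first, a dedicated subdirectory is safer than pointing the flag at a shared location.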
Looks good, will go with this
Sub-issue of #106