Open bluegenes opened 1 year ago
87 GB seems like an awful lot, though! Are the FASTQ files being read into memory completely or something?
..yep
using #123: 15 min, 1.4 GB
...manysketch is done! results in 'mgx.fromfile5.zip'
Command being timed: "sourmash scripts manysketch mgx.fromfile5.csv -p dna,k=21,k=31,k=51,scaled=1000,abund -c 6 -o mgx.fromfile5.zip"
User time (seconds): 3168.33
System time (seconds): 63.27
Percent of CPU this job got: 364%
Elapsed (wall clock) time (h:mm:ss or m:ss): 14:46.04
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1373784
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 15
Minor (reclaiming a frame) page faults: 3307897
Voluntary context switches: 4935
Involuntary context switches: 48727
Swaps: 0
File system inputs: 0
File system outputs: 380344
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
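As a quick sanity check, the headline figures (15 min, 1.4 GB) can be recovered from the /usr/bin/time -v output above. This is only a sketch: the numbers are copied by hand from that output, the helper name elapsed_to_seconds is made up for illustration, and it assumes the "kbytes" that time reports are units of 1024 bytes.

```python
# Values copied by hand from the /usr/bin/time -v output above.
user_s = 3168.33          # User time (seconds)
system_s = 63.27          # System time (seconds)
elapsed = "14:46.04"      # Elapsed (wall clock) time, m:ss format here
max_rss_kb = 1373784      # Maximum resident set size (kbytes)


def elapsed_to_seconds(text: str) -> float:
    """Convert 'h:mm:ss' or 'm:ss' (hypothetical helper) into seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


wall_s = elapsed_to_seconds(elapsed)

# CPU percentage is total CPU time divided by wall clock time.
cpu_percent = 100 * (user_s + system_s) / wall_s

# Assumption: 1 kbyte = 1024 bytes.
max_rss_gb = max_rss_kb * 1024 / 1e9      # decimal gigabytes
max_rss_gib = max_rss_kb / 1024**2        # binary gibibytes

print(f"wall clock: {wall_s / 60:.1f} min")
print(f"CPU usage:  {cpu_percent:.1f}%")
print(f"peak RSS:   {max_rss_gb:.2f} GB ({max_rss_gib:.2f} GiB)")
```

The computed CPU percentage comes out just under 365%, which agrees with the reported 364% if that value is truncated rather than rounded, and corresponds to an average of roughly 3.6 of the 6 requested cores being busy.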
sourmash sig summarize mgx.fromfile5.zip
== This is sourmash version 4.8.3. ==
== Please cite Brown and Irber (2016), doi:10.21105/joss.00027. ==
** loading from 'mgx.fromfile5.zip'
path filetype: ZipFileLinearIndex
location: /home/ntpierce/2023-bench-manysketch/mgx.fromfile5.zip
is database? yes
has manifest? yes
num signatures: 15
** examining manifest...
total hashes: 25830111
summary of sketches:
5 sketches with DNA, k=31, scaled=1000, abund 8337030 total hashes
5 sketches with DNA, k=51, scaled=1000, abund 10733459 total hashes
5 sketches with DNA, k=21, scaled=1000, abund 6759622 total hashes
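The summary above can be cross-checked with a few lines of arithmetic: the per-k-mer-size hash counts should add up to the reported total, and 5 sketches for each of the 3 k-mer sizes should give the 15 signatures listed. A minimal sketch, with the values copied by hand from the output (the variable names are illustrative only):

```python
# (ksize, total hashes) rows copied by hand from the summary above.
rows = [(31, 8337030), (51, 10733459), (21, 6759622)]
reported_total = 25830111     # "total hashes" line
sketches_per_ksize = 5        # each row reports 5 sketches
num_signatures = 15           # "num signatures" line

# The per-ksize totals should sum to the manifest-wide total.
total = sum(n_hashes for _, n_hashes in rows)
print("sum of rows matches total:", total == reported_total)

# 5 inputs x 3 k-mer sizes should account for every signature.
print("signature count matches:",
      sketches_per_ksize * len(rows) == num_signatures)

# Average hashes per sketch for each k-mer size.
for ksize, n_hashes in sorted(rows):
    print(f"k={ksize}: {n_hashes / sketches_per_ksize:,.0f} hashes per sketch on average")
```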
see benchmarks for all of GTDB rs217 here: https://github.com/sourmash-bio/pyo3_branchwater/pull/96#issuecomment-1709190601
tl;dr 40 minutes, 64 threads, 2.7 GB of RAM.
using some mgx Colton was using for sketchall testing in https://github.com/sourmash-bio/sourmash/issues/2748:
data from /home/baumlerc/download-seq/download-seq/fastq/
ran in /home/ntpierce/2023-bench-manysketch
/usr/bin/time -v output:
results: 42 mins
sourmash sig summarize: