Closed iqbal-lab closed 5 years ago
This process eventually finished after 25 hours
k7 (8 threads) results:
Count all reads: 67115296
Count skipped reads: 630240
Count mapped reads: 185418
Timer report (seconds):
Load data: 38.42
Quasimap: 512413
Total elapsed time: 512452
For comparison, k5 results (8 threads):
Count all reads: 67115296
Count skipped reads: 630240
Count mapped reads: 76199
Timer report (seconds):
Load data: 37.32
Quasimap: 29117.6
The different number of mapped reads between k5 and k7 surprised me a bit.
I've started a new run to see if it is reproducible.
Also seen here: https://github.com/iqbal-lab-org/gramtools/issues/117. Going from k5 to k10, the time taken goes from 1.75 hours to about 7 hours (8 threads). See the bottom tables in the main description of that issue.
With k11, using the standard Plasmodium PRG
/nfs/research1/zi/projects/gramtools/standard_datasets/pfalciparum/pf3k_release3_cortex_plus_dblmsps/gram_k11/
Quasimapping 4.7 million 150bp reads with 1 thread takes 187,000 seconds (about 52 hours).
Quasimapping with 8 threads takes 45,000 seconds (12.5 hours).
Maybe this can be closed
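As a quick check on the k11 figures above, the ratio of the two Quasimap timings gives the speedup and the per-thread efficiency. This is only a minimal sketch: it assumes both figures are wall-clock times, and the function name `speedup_and_efficiency` is made up for illustration and is not part of gramtools.

```python
def speedup_and_efficiency(t_single, t_multi, n_threads):
    """Return (speedup, efficiency) of an n_threads run relative to 1 thread."""
    speedup = t_single / t_multi          # how many times faster the multi-threaded run is
    efficiency = speedup / n_threads      # 1.0 would be perfect linear scaling
    return speedup, efficiency


# Figures quoted above for k11: 187,000 s (1 thread) and 45,000 s (8 threads).
s, e = speedup_and_efficiency(187_000, 45_000, 8)
print(f"speedup: {s:.2f}x, efficiency: {e:.0%}")
# prints "speedup: 4.16x, efficiency: 52%"
```

Under that assumption, 8 threads give roughly a 4-fold speedup, i.e. about half of ideal linear scaling.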
@iqbal-lab Was the number of mapped reads consistent between runs?
Precisely the same!
@iqbal-lab In a previous comment you showed that the number of mapped reads was erroneously inconsistent between kmer sizes (when using multiple threads?). Do we know if that issue persists?
I am now running at k13, will confirm
@iqbal-lab Where can I find the quasimap output directory for the above please?
/nfs/research1/zi/zi/analysis/2018/0920_test_gramtools_for_leffler/quasimapk11
Tested multi-threading of quasimap by mapping 250,000 reads (so 500,000 with the reverse complements) from (yoda) /nfs/leia/research/iqbal/bletcher/Pf_benchmark/all_reads/original/PG0496-C.bam to (big) pf3k prg.
Results: Time in seconds.
So here multi-threading gives no speedup.
All runs produced consistent results for reads mapped:
Count all reads: 500000 Count skipped reads: 190 Count mapped reads: 204169
Can you give exact command line and lsf command?
bsub -R "select[mem>60000] rusage[mem=60000]" -M60000 -J threads_4 -n 4 \
  -o /nfs/leia/research/iqbal/bletcher/Pf_benchmark/tests/threads/logs/t4_k8.o \
  -e /nfs/leia/research/iqbal/bletcher/Pf_benchmark/tests/threads/logs/t4_k8.e \
  singularity exec /nfs/leia/research/iqbal/bletcher/Singularity_Images/8b46a86_gramtools.img gramtools quasimap \
  --gram-dir /nfs/leia/research/iqbal/bletcher/Pf_benchmark/tests/gram_k8 \
  --run-dir /nfs/leia/research/iqbal/bletcher/Pf_benchmark/tests/threads/run_t4 \
  --reads /nfs/leia/research/iqbal/bletcher/Pf_benchmark/tests/subsetted_reads/PG0496-C.trim.fq.1.1MSubset.gz \
  --max-threads 4
UPDATE:
The previous plot shows total CPU time and not elapsed real (wall clock) time.
Looking at wall clock time we see multi-threading really is working:
Of interest, k=8 achieves a 4.1-fold speedup compared to only 2.7-fold speedup on k=11, going from 1 thread to 10 threads.
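One way to read these two figures is through Amdahl's law, S(N) = 1 / ((1 - p) + p/N), which can be inverted to estimate the fraction p of the work that runs in parallel. The sketch below does that for the quoted speedups. It assumes Amdahl's model applies here (fixed workload, no overhead that grows with the thread count), which has not been verified for quasimap, and the function names are hypothetical.

```python
def parallel_fraction(speedup, n_threads):
    """Invert Amdahl's law: estimate p from a speedup observed on n_threads."""
    return (1 - 1 / speedup) / (1 - 1 / n_threads)


def amdahl_speedup(p, n_threads):
    """Speedup predicted by Amdahl's law for parallel fraction p."""
    return 1 / ((1 - p) + p / n_threads)


# Speedups quoted above, going from 1 thread to 10 threads.
for k, observed in [(8, 4.1), (11, 2.7)]:
    p = parallel_fraction(observed, 10)
    ceiling = 1 / (1 - p)  # limit of the speedup as the thread count grows
    print(f"k={k}: parallel fraction ~ {p:.2f}, speedup ceiling ~ {ceiling:.1f}x")
```

Under this model the parallel fraction comes out at roughly 0.84 for k=8 and 0.70 for k=11, so adding more threads would be expected to help k=11 less than k=8.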
Hello!
Nice! Thanks for this! If it is not too much to ask, could you also make an additional plot please? I think one showing the true speed-up against the theoretical best speed-up (see e.g. https://stackoverflow.com/questions/26514264/plot-speed-up-curve-vs-number-of-openmp-threads-scalability) would be useful for us to know whether we should try to improve multithreading or not!
Thanks!
Cheers!
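A sketch of how such a plot could be produced, assuming wall-clock times are collected into a dict keyed by thread count. The helper names are hypothetical, matplotlib is an assumed dependency, and the two data points in the usage example are the k11 timings quoted earlier in this thread, taken here as wall-clock times.

```python
def speedup_series(wall_seconds):
    """Turn {threads: wall-clock seconds} into sorted (threads, measured, ideal) rows.

    Measured speedup is relative to the 1-thread run; the ideal (linear)
    speedup is simply the thread count.
    """
    baseline = wall_seconds[1]
    return [(n, baseline / t, float(n)) for n, t in sorted(wall_seconds.items())]


def plot_speedup(wall_seconds, out_path="speedup.png"):
    """Plot measured against ideal speedup (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    rows = speedup_series(wall_seconds)
    threads = [r[0] for r in rows]
    plt.plot(threads, [r[1] for r in rows], marker="o", label="measured")
    plt.plot(threads, [r[2] for r in rows], linestyle="--", label="ideal (linear)")
    plt.xlabel("threads")
    plt.ylabel("speedup vs 1 thread")
    plt.legend()
    plt.savefig(out_path)


# k11 timings from this thread: 187,000 s on 1 thread, 45,000 s on 8 threads.
k11 = {1: 187_000, 8: 45_000}
for n, measured, ideal in speedup_series(k11):
    print(f"{n} threads: measured {measured:.2f}x, ideal {ideal:.0f}x")
# plot_speedup(k11)  # uncomment if matplotlib is installed
```

The gap between the two curves at each thread count shows how far the run is from linear scaling.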
I'm running on a single dedicated server (not shared)
Running with this commit
With k=5 and 8 threads, quasimapping of a fixed fastq to a fixed PRG takes 1 hr 45 mins. The output showed 67 million reads processed.
With k=7 and 8 threads, it is still running 15 hours after starting; the output shows it has processed 37 million reads (the last printout was 2 hours ago).
Machine: ebi7-017
Command:
gramtools quasimap --gram-directory results/gramk7 --reads fastq/out.fq.gz --max-threads=8 2>> error_qmapk7_thread8 1>> output_qmapk7_thread8
pwd /tmp/benchmarking