amplab / snap

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data
https://www.microsoft.com/en-us/research/project/snap/
Apache License 2.0

snap showing high io utilization #114

Closed ramcn closed 6 years ago

ramcn commented 6 years ago

Hello,

I noticed that SNAP spends significantly more time in IO wait than BWA MEM or Bowtie. I am using SYSSTAT to capture CPU and IO utilization and the kSAR tool to visualize the IO wait.
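For readers who want to reproduce this kind of measurement without SYSSTAT, here is a minimal sketch that derives an IO-wait percentage from two samples of the aggregate `cpu` line in Linux's `/proc/stat`. The function names are hypothetical and the sample counters in the usage note are made up for illustration; sar's `%iowait` comes from the same kernel counters, but this is not SYSSTAT's code.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of jiffy counters.

    Field order on Linux: user nice system idle iowait irq softirq steal guest guest_nice
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples of the 'cpu' line."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    deltas = [y - x for x, y in zip(b, a)]
    # Sum user..steal only; guest time is already accounted for in user/nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def read_cpu_line(path="/proc/stat"):
    """Return the first line of /proc/stat (Linux only)."""
    with open(path) as f:
        return f.readline()
```

Sampling `read_cpu_line()` once before and once after a run (or at fixed intervals during it) and passing both lines to `iowait_percent` gives a number comparable in spirit to the `%iowait` column. For example, `iowait_percent("cpu 100 0 50 800 50 0 0 0 0 0", "cpu 200 0 100 1000 500 0 0 0 0 0")` evaluates to 56.25.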

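Below is the link to the visualization comparing SNAP with the other tools: snap-utilization (https://user-images.githubusercontent.com/2152628/38911033-de70f44c-4289-11e8-944b-1d646b74b303.png)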

I am using the SRR622461 dataset on an Intel Xeon with 28 cores and 256GB RAM, and the dataset is on a Lustre file system. Here are more details about the system: https://www.chpc.utah.edu/documentation/guides/kingspeak.php

Let me know if you have any insights.

bolosky commented 6 years ago

This is just a guess, since I don’t know the performance of your IO system and the parameters that you’re using, but I suspect that two different things are going on here.

First is the very IO bound early part of the run. That’s most likely SNAP loading its index. The index is very large relative to the FM index used by BWA and Bowtie, and so takes longer to load. Usually SNAP indices are tens of GB. The load time doesn’t depend on the size of the file you’re aligning, so if you’re aligning a big file it becomes a relatively smaller share of the total run time.

SNAP is also slightly more IO bound in the latter phase because it’s a lot faster than BWA or Bowtie, so it does less computation per byte of input and output.
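The two effects above can be put into a rough back-of-the-envelope model. This is only a sketch: the function names are hypothetical, and the index size, read bandwidth, and alignment times in the usage note are illustrative assumptions, not measurements from this thread.

```python
def index_load_share(index_gb, read_gb_per_s, align_seconds):
    """Fraction of total run time spent loading the index.

    The load time is fixed by the index size and storage bandwidth,
    so its share shrinks as the alignment phase gets longer.
    """
    load_seconds = index_gb / read_gb_per_s
    return load_seconds / (load_seconds + align_seconds)


def alignment_is_io_bound(aligner_mb_per_s, storage_mb_per_s):
    """True if the aligner can consume input faster than storage can deliver it.

    A faster aligner does less compute per byte, so it hits the
    storage bandwidth limit sooner than a slower one.
    """
    return aligner_mb_per_s > storage_mb_per_s
```

With an assumed 30 GB index read at 0.5 GB/s (60 seconds of loading), `index_load_share(30, 0.5, 60)` is 0.5 for a one-minute alignment, while `index_load_share(30, 0.5, 540)` is 0.1 for a nine-minute one. Likewise, an aligner that could process an assumed 400 MB/s against storage delivering 250 MB/s would wait on IO, while a slower aligner at 100 MB/s on the same storage would not.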

We recognized that loading the index can take a lot of time, and added features to make it somewhat less painful. One is memory mapping the index, so that if you run SNAP more than once in a row with the same index, it doesn’t have to be fetched from disk the second time. Look at the -map and -pre flags. Another is the comma syntax for the command line, which lets you run multiple alignments one after the other without unloading the index. Yet another is daemon mode, where you start the SNAP process once and send it alignment jobs from the snap-command app. It will only load the index if it’s not already in memory. Of course, none of these help with loading the index the first time, or if you need to switch indices between consecutive runs.
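To illustrate the general idea behind memory mapping a large read-only file, here is a minimal sketch using Python's standard `mmap` module. It is not SNAP's implementation and says nothing about what the -map and -pre flags do internally; the helper name `map_readonly` is hypothetical. The point is only that a mapped file is paged in on access, and pages that are still in the operating system's page cache from an earlier run do not need to be read from disk again.

```python
import mmap


def map_readonly(path):
    """Map a non-empty file read-only and return the mmap object.

    Pages are faulted in lazily on first access. If another recent
    process already touched them, the OS can serve them from its
    page cache instead of going back to storage.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the mapping stays valid after f is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

A caller would slice the returned object like a `bytes` value (for example `m[:4]`) and call `m.close()` when done. Whether a second run actually avoids disk reads depends on how much memory the system has free for caching between runs.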

--Bill

From: ramcn Sent: Tuesday, April 17, 2018 9:00 PM To: amplab/snap Subject: [amplab/snap] snap showing high io utilization (#114)

ramcn commented 6 years ago

Thanks for the insights. I am guessing even other hash-based aligners (BLAST and mrsfast) might exhibit similar behavior.