amplab / snap

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data
https://www.microsoft.com/en-us/research/project/snap/
Apache License 2.0

Can any version index a FASTA file larger than 4.2G? #122

Closed Winjor closed 3 years ago

Winjor commented 4 years ago

Hi, I saw the same question in the Google group, but it has not been answered yet. Is there any way, or any version of SNAP, that can do that?
By the way, I see from the update log that the version in master has been updated to 1.0beta.24, and some branches are active. Will a new version be released soon?

bolosky commented 4 years ago

Any version of SNAP should be able to index a genome of any practical size. If it’s bigger than 2^32 bases, or really even close to that, you’ll need to say -locationSize 5 when you build the index. That’s even true for the human genome if you use a small seed size (for reasons too complicated to explain here). If you try to index something too big, you should get an error message telling you to increase -locationSize. If you have something truly huge (like 2^40 bases) and the memory to hold such an index, maybe you’ll need -locationSize 6.
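
For readers who want to check the arithmetic, here is a rough sketch of it, assuming -locationSize is the width of a genome location in bytes, so that N bytes can address at most 2^(8N) positions. The function name and the usable_fraction headroom knob are made up for illustration and are not part of SNAP. The result is only a lower bound: as noted above, a small seed size can require a larger value even for a human-sized genome, and SNAP's own error message is the real authority.

```python
def min_location_size(num_bases, usable_fraction=1.0):
    """Smallest location width in bytes (4 to 8) that can address num_bases.

    Assumption: a location of N bytes can address at most 2**(8*N) positions.
    usable_fraction is a hypothetical knob (not a SNAP option) that reserves
    part of the address space as headroom for genomes close to the limit.
    """
    if num_bases < 0:
        raise ValueError("num_bases must be non-negative")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    for size in range(4, 9):
        if num_bases < usable_fraction * 2 ** (8 * size):
            return size
    raise ValueError("genome too large for an 8-byte location")


print(min_location_size(3_100_000_000))        # roughly human-sized: 4
print(min_location_size(2 ** 32))              # at the 4-byte limit: 5
print(min_location_size(2 ** 40))              # at the 5-byte limit: 6
print(min_location_size(4_200_000_000, 0.9))   # close to 2^32, with headroom: 5
```

A 4.2G FASTA file is just under 2^32 bases, which is the "really even close to that" case, so building the index with -locationSize 5 is the safer choice.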

We’re working on a set of major updates to SNAP. In particular, we’ve added affine gap scoring with which SNAP calls indels with about the same precision and recall as bwa-mem, but much, much faster. I’m not sure of the schedule for actually releasing it, but it shouldn’t be too long.

--Bill

From: Winjor Chen notifications@github.com Sent: Tuesday, September 24, 2019 12:50 AM To: amplab/snap snap@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: [amplab/snap] Has any version can index fasta file more large than 4.2G? (#122)
