I would like to try out Whisper for mapping paired-end short reads against a very large and fragmented genome assembly (100s of thousands of contigs). Splitting such a reference into one file per contig is inconvenient and a file system hog, but my understanding of the genome indexing instructions is that this is required, because the program does not index FASTA files containing multiple accessions.
Do I understand the instructions correctly? If so, I would like to make a feature request: that Whisper be able to index FASTA files containing multiple sequences :-)
Whisper supports FASTA files with multiple accessions, though we haven't tested it on datasets with that many contigs. If you have problems running it, please let us know.