Open bheimbu opened 1 year ago
Hello,
There are a few options available to you: bowtie2-build has a --packed mode that should reduce the memory footprint but is slower than the standard build.
Thanks for your reply. I'll try to use --packed and see how it goes.
Cheers Bastian
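For anyone following along, here is a minimal sketch of how the --packed suggestion might be invoked from a script. The file name refseq_bacteria.fna and the index base name refseq_bacteria are placeholders, not taken from this thread, and the helper functions are hypothetical wrappers around the bowtie2-build command line.

```python
import subprocess

def packed_build_command(fasta, index_base):
    """Return the argument list for a memory-lean bowtie2-build run."""
    # --packed trades build speed for a smaller memory footprint.
    return ["bowtie2-build", "--packed", fasta, index_base]

def run_packed_build(fasta, index_base):
    """Run the build; raises CalledProcessError if bowtie2-build fails."""
    subprocess.run(packed_build_command(fasta, index_base), check=True)

# Placeholder paths, shown only to illustrate the resulting command line.
print(" ".join(packed_build_command("refseq_bacteria.fna", "refseq_bacteria")))
```

Calling run_packed_build with real paths would start the actual build, which may still take a long time on an input of this size.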
Hello, I guess there might be many genomes in the NCBI collection that are very similar or possibly even identical. How does bowtie perform read assignment in this case? Does it randomly assign each read to one sequence from the pool of identical sequences, or does it distribute the reads equally across all identical sequences? Thank you. I know that in an ideal scenario it is good to dereplicate the genomes first.
Hello,
bowtie2 will choose the alignment with the highest alignment score. If several alignments tie for the highest score, it will choose one of them at random. I hope this helps.
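The rule described above can be illustrated with a small sketch. This is not bowtie2's actual code; the function pick_alignment and the (reference name, score) tuples are made up for illustration, and the scores are arbitrary example values.

```python
import random

def pick_alignment(candidates, rng=random):
    """Pick the best-scoring alignment, breaking ties at random.

    candidates: list of (reference_name, alignment_score) tuples.
    """
    best_score = max(score for _, score in candidates)
    # Keep every candidate that ties for the highest score.
    tied = [c for c in candidates if c[1] == best_score]
    # One of the tied candidates is reported, chosen at random.
    return rng.choice(tied)

# Two identical genomes give the same top score, so either may be reported.
hits = [("genome_A", 0), ("genome_B", 0), ("genome_C", -12)]
print(pick_alignment(hits))
```

Under this rule, reads that match several identical genomes equally well end up spread across those genomes roughly evenly over many reads, but each individual read is reported against only one of them.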
Hi there,
I'm trying to build a huge index of NCBI's RefSeq bacterial genomes, which is about 97 GB (in fna.gz format). I'm working on an HPC with 512 GB of RAM, but the build always dies with an "out-of-memory" error. Is it possible to split the compressed FASTA file into smaller chunks, index them separately, and then concatenate the resulting index files at the end? Or is there another solution (use more RAM)?
Cheers Bastian