Big FASTA files are common. I just ran into one that would require 14 GB of RAM to read into R using ape::read.dna. I need a way to read subsets of large FASTA files.
The following should be possible:
Read a defined set of sequences by index (e.g. c(3, 2) would read the third and second sequences).
Read a random subset. This can use the code for the defined subset, but requires knowing the number of sequences in the file.
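A rough base-R sketch of both cases, to make the request concrete. The function names (read_fasta_subset, count_fasta, read_fasta_random) and the chunk_size argument are hypothetical, not part of ape or any existing package. It assumes a plain-text FASTA file where every record starts with a ">" header line, and it returns a named character vector rather than the DNAbin object that ape::read.dna produces. The file is streamed in chunks, so only the requested sequences are kept in memory.

```r
# Read only the sequences at the given 1-based positions, in the order given.
# Hypothetical helper: streams the file in chunks instead of loading it whole.
read_fasta_subset <- function(path, indexes, chunk_size = 100000L) {
  stopifnot(length(indexes) > 0, all(indexes >= 1))
  wanted <- unique(indexes)
  con <- file(path, open = "r")
  on.exit(close(con))
  kept_lines <- character(0)
  kept_seq <- numeric(0)
  n_seen <- 0  # number of ">" headers seen so far
  repeat {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0L) break
    # Sequence number that each line of this chunk belongs to
    seq_id <- n_seen + cumsum(startsWith(chunk, ">"))
    keep <- seq_id %in% wanted
    kept_lines <- c(kept_lines, chunk[keep])
    kept_seq <- c(kept_seq, seq_id[keep])
    n_seen <- seq_id[length(seq_id)]
    # Stop early once a header past the last wanted sequence has been seen
    if (n_seen > max(wanted)) break
  }
  groups <- split(kept_lines, kept_seq)
  ids <- as.numeric(names(groups))
  missing <- setdiff(wanted, ids)
  if (length(missing) > 0) {
    stop("Index out of range: ", paste(missing, collapse = ", "))
  }
  # First line of each group is the header; the rest is the sequence
  seqs <- vapply(groups, function(x) paste(x[-1], collapse = ""), character(1))
  names(seqs) <- vapply(groups, function(x) sub("^>", "", x[1]), character(1))
  seqs[match(indexes, ids)]
}

# Count the sequences by counting header lines, again in chunks.
count_fasta <- function(path, chunk_size = 100000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  n <- 0
  repeat {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0L) break
    n <- n + sum(startsWith(chunk, ">"))
  }
  n
}

# Random subset: count first, then reuse the defined-subset reader.
read_fasta_random <- function(path, size, chunk_size = 100000L) {
  n <- count_fasta(path, chunk_size)
  read_fasta_subset(path, sample.int(n, size), chunk_size)
}
```

With this sketch, read_fasta_subset("big.fasta", c(3, 2)) would return the third and then the second sequence, and read_fasta_random("big.fasta", 100) would make one extra pass over the file to count records before sampling. Here "big.fasta" is just a placeholder path.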