Add function to read fasta subset

grunwaldlab / metacoder

Parsing, Manipulation, and Visualization of Metabarcoding/Taxonomic data

http://grunwaldlab.github.io/metacoder_documentation

Add function to read fasta subset #69

Open zachary-foster opened 8 years ago

zachary-foster commented 8 years ago

Big fasta files are common. I just ran into one that would require 14 GB of RAM to read into R using ape::read.dna. I need a way to read in subsets of large fasta files.

The following should be possible:

Read in a defined set of sequence indexes (e.g. c(3, 2) would read the third and second sequences, in that order).
Read a random subset. This can reuse the code for the defined subset, but it requires knowing the number of sequences in the file first.
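
As a rough illustration of the two requirements above, here is a minimal base R sketch, not the metacoder implementation. The function names `read_fasta_subset` and `count_fasta_seqs` and the `chunk_size` argument are hypothetical. It assumes every record starts with a `>` header line and streams the file in chunks, so only the requested sequences are kept in memory.

```r
# Hypothetical helper: count the sequences (header lines) in a FASTA file
# without loading the whole file.
count_fasta_seqs <- function(path, chunk_size = 10000) {
  con <- file(path, open = "r")
  on.exit(close(con))
  n <- 0
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break
    n <- n + sum(startsWith(lines, ">"))
  }
  n
}

# Hypothetical helper: return the sequences at the given 1-based positions,
# in the order requested, as a named character vector.
read_fasta_subset <- function(path, indexes, chunk_size = 10000) {
  stopifnot(length(indexes) > 0, all(indexes >= 1))
  wanted <- unique(indexes)
  headers <- character(length(wanted))
  parts <- vector("list", length(wanted))  # sequence lines per wanted record
  con <- file(path, open = "r")
  on.exit(close(con))
  seq_count <- 0  # number of header lines seen so far
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break
    is_header <- startsWith(lines, ">")
    # Index of the record each line belongs to (0 = before the first header)
    record <- seq_count + cumsum(is_header)
    seq_count <- record[length(record)]
    keep <- record %in% wanted
    for (id in unique(record[keep])) {
      slot <- match(id, wanted)
      in_record <- keep & record == id
      hdr <- lines[in_record & is_header]
      if (length(hdr) > 0) headers[slot] <- sub("^>", "", hdr)
      parts[[slot]] <- c(parts[[slot]], lines[in_record & !is_header])
    }
    # Stop early once a header past the last wanted record has been seen
    if (seq_count > max(wanted)) break
  }
  if (max(wanted) > seq_count) {
    stop("File contains only ", seq_count, " sequences")
  }
  seqs <- vapply(parts, paste, character(1), collapse = "")
  names(seqs) <- headers
  seqs[match(indexes, wanted)]
}

# Small demonstration file
fasta <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "ACGT", "ACGT", ">seq2", "GGCC", ">seq3", "TTAA", "TT"),
           fasta)

# Defined subset: third and second sequences, in that order
picked <- read_fasta_subset(fasta, c(3, 2), chunk_size = 3)

# Random subset: count the sequences first, then sample indexes
n <- count_fasta_seqs(fasta)
random_subset <- read_fasta_subset(fasta, sample.int(n, 2))
```

The random subset costs one extra pass over the file to count headers. That pass is still cheap in memory because each chunk is discarded after counting.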

zachary-foster commented 8 years ago

@knausb suggested using readr to handle gzipped input files.
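
For context, readr decompresses `.gz` files automatically when given a path, so something like `readr::read_lines(path, n_max = ...)` should work on gzipped fasta files. The sketch below swaps in base R's `gzfile()` connection to show the same idea without an extra dependency. It is only an illustration, not a decision about which approach the package should use.

```r
# Write a small gzipped FASTA file to demonstrate with
gz_path <- tempfile(fileext = ".fasta.gz")
con <- gzfile(gz_path, open = "w")
writeLines(c(">seq1", "ACGT", ">seq2", "GGCC"), con)
close(con)

# gzfile() decompresses while reading, so lines can be pulled in
# a few at a time instead of inflating the whole file in memory
con <- gzfile(gz_path, open = "r")
first_lines <- readLines(con, n = 2)
close(con)
```

Because `readLines()` works the same way on any connection, chunked reading code written against `file()` could take a `gzfile()` connection instead with little change.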