You can very easily make the sequence table from the dada-class objects:
st <- makeSequenceTable(dadaFs)
Is that what you want to do? Or do you actually want to "stack" the $clustering
data.frames from each sample into one giant data.frame?
Ben, Yes, I'm trying to stack all the $clustering data.frames of all my samples into one data.frame.
I'm able to do that when I have forward and reverse reads, because I can generate the data.frame with all the samples stacked together by doing:
mergers <- mergePairs(dadaFs, derepFs, dadaRs, derepRs, verbose=TRUE)
Once I have the data.frame, I write csv files with the abundance, forward, reverse and other info in the data.frame:
dir.create('merged')
for(name in names(mergers)){
  write.csv(mergers[[name]], paste0('merged/', name, '.csv'), quote = FALSE, row.names = FALSE)
}
What I'm ultimately hoping to do is write fasta files (all the fasta files generated, including the uniques), run them through a different pipeline that uses a phylogenetic placement approach, and compare this to other OTU clustering methods.
Let me know if you think this would be possible with the single-end forward reads I have now.
Thank you!!
So I think you can get an equivalent output to the above by just looping through the dadaFs object (which is a list, just like mergers):
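dir.create('forward', showWarnings = FALSE)  # the output directory must exist before writing
for(name in names(dadaFs)){
  write.csv(dadaFs[[name]]$clustering, paste0('forward/', name, '.csv'), quote = FALSE, row.names = FALSE)
}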
It won't have the same columns, but some will be the same (including $sequence and $abundance). Does that work?
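If the goal is one giant data.frame rather than one csv per sample, a minimal sketch of the stacking step could look like the following. The toy dadaFs list here is a hypothetical stand-in (real dada-class objects carry more columns in $clustering), and the added sample column is an assumption made so that rows can be traced back to their sample.

```r
# Hypothetical stand-in for dadaFs: a named list whose elements each hold a
# $clustering data.frame. Real dada-class objects have additional columns.
dadaFs <- list(
  sampleA = list(clustering = data.frame(
    sequence = c("ACGT", "ACGA"), abundance = c(10L, 3L),
    stringsAsFactors = FALSE)),
  sampleB = list(clustering = data.frame(
    sequence = "ACGT", abundance = 7L,
    stringsAsFactors = FALSE))
)

# Tag each sample's rows with the sample name, then row-bind all samples.
stacked <- do.call(rbind, lapply(names(dadaFs), function(name) {
  df <- dadaFs[[name]]$clustering
  data.frame(sample = rep(name, nrow(df)), df, stringsAsFactors = FALSE)
}))
rownames(stacked) <- NULL
```

This relies on every $clustering data.frame having the same columns, since rbind will fail otherwise.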
You can also use the uniquesToFasta function to write out fastas for each sample. Just do the same loop as above, but call uniquesToFasta(dadaFs[[name]], paste0('forward/', name, '.fa')) within the loop.
Yes, looping the dadaFs works and gives me the columns I need!
Just a quick question: would the uniquesToFasta output be the same as the dadaFs data.frame output if I generated a fasta file using $sequence and $abundance?
I thought that with the mergers output I could generate all the fasta sequences, and that uniquesToFasta would only give the most representative unique sequences.
Thank you so much for your help!
uniquesToFasta will write a fasta that contains each sequence in the $sequence column, with the $abundance written in the id line of the fasta in the size=XXX format that is used by usearch/uchime.
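To make that equivalence concrete, here is a rough base-R sketch that builds the same kind of fasta by hand from $sequence and $abundance. It is only an illustration, not the dada2 implementation: the toy clustering data.frame, the "sq1", "sq2", ... id prefix, and the temporary output path are all assumptions made for this example.

```r
# Hypothetical stand-in for dadaFs[[name]]$clustering.
clustering <- data.frame(
  sequence = c("ACGTACGT", "ACGTACGA"),
  abundance = c(10L, 3L),
  stringsAsFactors = FALSE
)

# Build id lines that carry the abundance as size=XXX.
# The "sq" prefix is an assumption made for this sketch.
ids <- paste0("sq", seq_len(nrow(clustering)), ";size=", clustering$abundance, ";")

# Interleave ">id" header lines with their sequences:
# rbind gives a 2-row matrix, and as.vector reads it column by column.
fasta_lines <- as.vector(rbind(paste0(">", ids), clustering$sequence))

fasta_path <- tempfile(fileext = ".fa")  # hypothetical output location
writeLines(fasta_lines, fasta_path)
```

Every row of $clustering ends up as one record, so nothing is dropped relative to the data.frame.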
Ok, great! Thank you.
Hi Benjamin, On a previous issue, I was asking you how to generate the data frame that you get when you process paired-end sequences using:
mergers <- mergePairs(dadaFs, derepFs, dadaRs, derepRs, verbose=TRUE)
You mentioned that the dada-class objects themselves have such a data.frame: dadaFs[[1]]$clustering (for sample 1) and so on. I have studies with many samples, and I was wondering if there is a way to join the dadaFs data.frames for all the samples into one data frame.
I'm sorry, I was trying to go a different way by generating the reverse reads myself and using mergers, but I don't think my sequences look good at all when I look at the quality profiles.
Thank you so much for your help!!