Closed Ateuchus closed 5 years ago
I was just wondering if there was a function to remove these samples from the set easily, or whether I should just go and find these files and remove them myself and start again.
Yes, this is admittedly a little annoying. The way we suggest handling it is to check for the existence of the files after filtering, and then drop those that don't exist.
out <- filterAndTrim(fnFs, filtFs, ...)
not.lost <- file.exists(filtFs) # alternatively, not.lost <- out[,"reads.out"] > 0
filtFs <- filtFs[not.lost]
# possibly also update filtRs and sample.names at this point in the same way
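Putting those steps together, a minimal self-contained sketch of the pruning step (the sample names and paths here are hypothetical, and the temp files stand in for what `filterAndTrim` did or did not write):

```r
# Simulate the situation: sample "A" passed filtering, sample "B" lost all
# reads, so no filtered file was written for it. Paths are hypothetical.
dir <- tempdir()
filtFs <- file.path(dir, c("A_F_filt.fastq.gz", "B_F_filt.fastq.gz"))
filtRs <- file.path(dir, c("A_R_filt.fastq.gz", "B_R_filt.fastq.gz"))
sample.names <- c("A", "B")
file.create(filtFs[1], filtRs[1])  # only sample A's files exist on disk

# Keep only the samples whose filtered files were actually written,
# updating the forward files, reverse files, and sample names in lockstep.
not.lost <- file.exists(filtFs)   # c(TRUE, FALSE)
filtFs <- filtFs[not.lost]
filtRs <- filtRs[not.lost]
sample.names <- sample.names[not.lost]
print(sample.names)  # "A"
```

After this, `learnErrors(filtFs, ...)` only sees files that exist, so the `cannot open the connection` error goes away.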
Also, on a slightly different note: I'm just testing out the pipeline on some CO1 metabarcoding data. I can't really see why this wouldn't work, but I read that someone's results contained a lot of stop codons. Do you know if this is an appropriate pipeline to use, or whether it has been compared to any other method?
In principle it should work on CO1 data, but admittedly I have not done any testing personally on CO1 data. I'm also not aware of any published validations. I think it should work, but some double-checking is justified (e.g. for stop codons).
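For the stop-codon double-check, here is one possible base-R sketch (the function name `has_stop`, the `frame` argument, and the stop set are all illustrative assumptions; under the invertebrate mitochondrial code, TAA and TAG are stops while TGA encodes tryptophan):

```r
# Hedged sketch: scan a COI amplicon for in-frame stop codons, base R only.
# frame: which reading frame to use (1, 2, or 3); this depends on your primers.
has_stop <- function(seq, frame = 1, stops = c("TAA", "TAG")) {
  s <- substring(seq, frame, nchar(seq))
  # Split the in-frame portion into consecutive codons of length 3
  codons <- substring(s, seq(1, nchar(s) - 2, by = 3), seq(3, nchar(s), by = 3))
  any(codons %in% stops)
}

has_stop("ATGACTTAAGGT")  # TRUE  (TAA in frame 1)
has_stop("ATGACTGGTGCT")  # FALSE
```

An ASV table with many `TRUE`s in the expected frame would be a red flag worth investigating before trusting the output.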
Lovely, thank you!
So I've just had a look at what's been generated, and am getting a lot of stop codons as well. A little more searching has revealed that this is probably a common issue with CO1 metabarcoding in general, as a lot of the published sequences also have the same issue. Now trying to figure out what to do next!
Hi,
I have a number of samples that don't pass the filtering, which is absolutely fine (I don't want to try and salvage them), and so no new file is written in the filtered output (makes sense). However, when I then go to plot errors, it is obviously looking for those non-existent files. I was just wondering if there was a function to remove these samples from the set easily, or whether I should just go and find these files and remove them myself and start again.
I know, I know... I'm just being lazy, but if nothing else, I think it would be a nice little addition to the pipeline!
This is the error:
errF <- learnErrors(filtFs, multithread=TRUE)
Error in open.connection(con, "rb") : cannot open the connection
In addition: Warning message:
In open.connection(con, "rb") :
  cannot open file 'Blahblah/filtered/125_P_B_F_filt.fastq.gz': No such file or directory
(and, just so you can see, the earlier warning:)
The filter removed all reads: blahblah/filtered/125_P_B_F_filt.fastq.gz and blahblah/filtered/125_P_B_R_filt.fastq.gz not written.
Also, on a slightly different note: I'm just testing out the pipeline on some CO1 metabarcoding data. I can't really see why this wouldn't work, but I read that someone's results contained a lot of stop codons. Do you know if this is an appropriate pipeline to use, or whether it has been compared to any other method? I had a quick look at the literature and can't find much. Personally I like this program for my 16S/18S sets, so I'm hoping it all works.
Thank you!