benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution
http://benjjneb.github.io/dada2/
GNU Lesser General Public License v3.0

error in dimnames(x) <- dn #1563

Closed by marclliros 3 months ago

marclliros commented 2 years ago

Dear all, I am running dada2 on ITS samples. After sample inference, merging paired reads, and constructing sequence tables (with and without chimeras), I came across an error that I am not able to solve. I run:

    getN<-function(x) sum(getUniques(x))
    trackITS<-cbind(out,sapply(dadaFs,getN),sapply(dadaRs,getN),sapply(mergers,getN),rowSums(seqtabITS.nochim))
    colnames(trackITS)<-c("input","filtered","denoisedF","denoisedR","merged","nonchim")
    rownames(trackITS)<-sample.names
    Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent
    head(trackITS)
                                 input filtered denoisedF denoisedR merged nonchim
    EB001p_S2_L001_R1_001.fastq 299932   213198    212641    212851 205251  200463
    EB002p_S4_L001_R1_001.fastq 244700   134400    134078    134070  93801   81507
    EB003p_S5_L001_R1_001.fastq 324954   221483    221143    221090 204681  199863
    EB004p_S6_L001_R1_001.fastq 251084   131688    131375    131389  83223   72636
    EB005p_S7_L001_R1_001.fastq 234883   136173    135926    135777  98059   90871
    EB006p_S8_L001_R1_001.fastq 325421   160678    160374    160363  87148   79784
    exists<-file.exists(filtFs)
    names(dadaFs)<-sample.names[exists]
    Error in names(dadaFs) <- sample.names[exists] : 'names' attribute [23] must be the same length as the vector [22]

I know from the output that I lose one sample because it has zero reads (it is in fact the negative control). After looking through some old and new issues, it seems the exists option must be used, but apparently I have not found the right "option". Running

    out<-out[file.exists(filtFs),]
    trackITS<-cbind(out,sapply(dadaFs,getN),sapply(dadaRs,getN),sapply(mergers,getN),rowSums(seqtabITS.nochim))
    colnames(trackITS)<-c("input","filtered","denoisedF","denoisedR","merged","nonchim")
    rownames(trackITS)<-sample.names
    Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent

does not solve the problem, so I am wondering whether I must apply "[file.exists(),]" to each of dadaFs, dadaRs, mergers and seqtabITS.nochim, which does not seem practical.

Any advice?

Thanks in advance

Marc

PS: sorry if I missed the solution in a previous post.

benjjneb commented 2 years ago

The error is only coming when you assign rownames to trackITS? If so, you just need to match up the sample.names with what is in trackITS.

My guess is that you can simply subset down to the existing files here, e.g. sample.names[file.exists(filtFs)]. But inspecting the length of sample.names versus the nrow(trackITS) will help confirm, the expectation being that there is one less row in trackITS than there is in sample.names, corresponding to the one sample that was lost at the filtering step.
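The check and the subsetting described above can be sketched with toy data. This is only an illustration under stated assumptions: the sample names, file paths, and read counts below are made up, no dada2 functions are called, and the small matrix stands in for the real trackITS.

```r
# Toy stand-ins for the objects in this thread (hypothetical values only).
sample.names <- c("EB001p", "EB002p", "NegCtrl")

# Hypothetical filtered-file paths: only the first two files are created,
# mimicking a negative control that lost all its reads at the filtering step.
filt.dir <- tempfile("filtered_")
dir.create(filt.dir)
filtFs <- file.path(filt.dir, paste0(sample.names, "_F_filt.fastq.gz"))
invisible(file.create(filtFs[1:2]))

# A tracking matrix with one row per surviving sample (made-up counts).
trackITS <- cbind(input = c(1000, 800), filtered = c(700, 500))

# Compare the number of names with the number of rows before renaming.
length(sample.names)
nrow(trackITS)

# Keep only the names whose filtered file exists, then assign them.
keep <- file.exists(filtFs)
stopifnot(sum(keep) == nrow(trackITS))
rownames(trackITS) <- sample.names[keep]
trackITS
```

The stopifnot() line is a guard: if the number of surviving files does not match the number of rows, it stops before the rownames assignment can fail with the dimnames error.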

marclliros commented 2 years ago

Dear @benjjneb, thanks for your reply. Yes, the issue is that the dataset has one sample fewer because of the filtering step (the sequencing negative control). I will run it again with the sample.names[file.exists(filtFs)] command. Thanks a lot, Marc