Closed abalter closed 6 years ago
Can someone tell me exactly which row or column names are supposed to be matching but are not?
The samdf data.frame should have rownames matching those of the otu_table, e.g. "F3D0", "F3D1", ...
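One quick way to see exactly which names disagree is to compare the two sets of row names in both directions. This is only a minimal sketch in base R: seqtab_toy and samdf_toy are made-up stand-ins for the workflow's sequence table and sample data, not the real objects.

```r
# Toy stand-ins (hypothetical data, not the workflow's real objects)
seqtab_toy <- matrix(1:6, nrow = 3,
                     dimnames = list(c("F3D0", "F3D1", "F3D2"),
                                     c("ASV1", "ASV2")))
samdf_toy <- data.frame(age = c(21, 22, 23),
                        row.names = c("F3D0", "F3D1", "F3D9"))

# Samples in the sequence table that have no row in the sample data
missing_in_samdf <- setdiff(rownames(seqtab_toy), rownames(samdf_toy))

# Samples in the sample data that have no row in the sequence table
missing_in_seqtab <- setdiff(rownames(samdf_toy), rownames(seqtab_toy))

print(missing_in_samdf)
print(missing_in_seqtab)
```

If both results are empty, the names match and the mismatch lies elsewhere.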
Hey @benjjneb -- Still having a problem. Somehow, the samdf object is getting, to put it technically, messed up.
> samdf <- read.csv("http://raw.githubusercontent.com/spholmes/F1000_workflow/master/data/MIMARKS_Data_combined.csv",header=TRUE)
> head(samdf)
> #summary(samdf)
> samdf$SampleID <- paste0(gsub("00", "", samdf$host_subject_id), "D", samdf$age-21)
> #summary(samdf)
> samdf <- samdf[!duplicated(samdf$SampleID),] # Remove duplicate entries for reverse reads
> head(samdf)
> #summary(samdf)
> rownames(seqtabAll) <- gsub("124", "125", rownames(seqtabAll)) # Fix discrepancy
> all(rownames(seqtabAll) %in% samdf$SampleID) # TRUE
[1] TRUE
> rownames(samdf) <- samdf$SampleIDlib
> keep.cols <- c("collection_date", "biome", "target_gene", "target_subfragment",
+ "host_common_name", "host_subject_id", "age", "sex", "body_product", "tot_mass",
+ "diet", "family_relationship", "genotype", "SampleID")
> print(keep.cols)
[1] "collection_date" "biome" "target_gene" "target_subfragment"
[5] "host_common_name" "host_subject_id" "age" "sex"
[9] "body_product" "tot_mass" "diet" "family_relationship"
[13] "genotype" "SampleID"
> samdf <- samdf[rownames(seqtabAll), keep.cols]
> head(samdf)
> #summary(samdf)
> rownames(samdf)
[1] "NA" "NA.1" "NA.2" "NA.3" "NA.4" "NA.5" "NA.6" "NA.7" "NA.8" "NA.9" "NA.10"
[12] "NA.11" "NA.12" "NA.13" "NA.14" "NA.15" "NA.16" "NA.17" "NA.18"
> all(rownames(seqtabAll) %in% samdf$SampleID) # TRUE
[1] FALSE
rownames(samdf) <- samdf$SampleIDlib
Might just be a typo? I.e., remove the extraneous "lib" from the end of that line.
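For anyone wondering why the typo produced no error, here is a small sketch of the behavior as I understand it, using a toy data frame (df and its values are invented for illustration). In base R, `$` on a column that does not exist returns NULL, assigning NULL row names just resets them to the defaults, and indexing rows by names that are not present gives all-NA rows.

```r
# Toy data frame (hypothetical, for illustration only)
df <- data.frame(SampleID = c("F3D0", "F3D1"), age = c(21, 22),
                 stringsAsFactors = FALSE)

# `$` with a misspelled column name returns NULL instead of failing
typo_value <- df$SampleIDlib
print(is.null(typo_value))

# Assigning NULL row names silently resets them to "1", "2", ...
rownames(df) <- typo_value
print(rownames(df))

# Row names that match nothing yield all-NA rows named "NA", "NA.1", ...
broken <- df[c("F3D0", "F3D1"), ]
print(rownames(broken))

# With the correct column name, the same subsetting keeps the sample IDs
rownames(df) <- df$SampleID
fixed <- df[c("F3D0", "F3D1"), ]
print(rownames(fixed))
```

A cheap guard against this kind of slip is to check `"SampleID" %in% colnames(df)` before assigning the row names.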
Just figured that out! Doh!
This section of the workflow is throwing an error:
The error is:
I'm pretty sure I've gotten through the sample workflow without this error. Can someone tell me exactly which row or column names are supposed to be matching but are not?
These are the rownames of my variables: