gene-drive closed 3 years ago
If your data really does have 75% contamination, you probably shouldn't use it, as that level of contamination likely indicates that something went very wrong in the experiment.
To check, I would look at the plot generated by autoEstCont. Does it have two peaks of roughly equal height, with one at around 0.75? If so, try setting your contamination fraction to the location of the peak at the lower contamination value and proceeding with your analysis.
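A minimal sketch of that check, assuming the scToy toy SoupChannel that ships with SoupX as a stand-in for your own object. The value 0.1 is a placeholder for wherever the lower peak sits in your plot, not a recommendation.

```r
library(SoupX)

# scToy is a small SoupChannel bundled with SoupX; substitute your own object.
data(scToy)

# Re-run the automatic estimate with plotting on. forceAccept = TRUE lets the
# call finish even when the estimate is very high, so the plot can be inspected.
sc <- autoEstCont(scToy, doPlot = TRUE, forceAccept = TRUE)

# The accepted global estimate is stored per cell in the metadata.
print(unique(sc$metaData$rho))

# If the plot shows two peaks, override the estimate by hand with the location
# of the peak at the lower contamination value (0.1 is a placeholder).
lowerPeak <- 0.1
sc <- setContaminationFraction(sc, lowerPeak)
print(unique(sc$metaData$rho))
```

After this, adjustCounts(sc) would use the manually set fraction rather than the automatic estimate.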
The other thing I'd do is make extensive use of plotMarkerMap to see what the expression ratio to the soup looks like for a few genes that are common contaminants. Without knowing your experiment it's hard to say what these are likely to be, but HB (haemoglobin) and IG (immunoglobulin) genes usually work.
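Something along these lines is what I mean, again sketched on the bundled scToy data as an assumption. The haemoglobin gene names are just an example set, and the intersect() guard is there because they may not all be present in a given dataset.

```r
library(SoupX)

# Toy SoupChannel bundled with SoupX; substitute your own object, which needs
# a dimension reduction stored in it (e.g. via load10X or setDR).
data(scToy)

# Example contaminant genes; keep only those present in this dataset.
hbGenes <- intersect(c("HBB", "HBA1", "HBA2"), rownames(scToy$toc))

# Plot, per cell, the ratio of observed expression to what the soup alone
# would be expected to contribute.
gg <- if (length(hbGenes) > 0) plotMarkerMap(scToy, geneSet = hbGenes) else NULL
if (!is.null(gg)) print(gg)
```

Cells where the ratio is well above what the soup would explain are probably expressing the genes themselves; the rest are candidates for estimating contamination.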
Hi,
I am getting the same error with one of my samples: "Error in setContaminationFraction(sc, exp(coef(sc$fit)), forceAccept = forceAccept) : Extremely high contamination estimated (0.61). This likely represents a failure in estimating the contamination fraction. Set forceAccept=TRUE to proceed with this value."
The autoEstCont plot looks very different. I am using only one gene set: nonExpressedGeneList = list(Hep=c("CYP1A2","CYP2E1","CYP3A4","GLUL","DCXR","FTL","GPX2","GSTA1","CYP2A7","FABP1","HAL","AGT","ALDOB","SDS"))
Interestingly, the same sample works when I use a larger collection of gene sets that includes this one (the Hep gene set is part of the list below). However, with this list some other samples fail.
nonExpressedGeneList = list(
  AntiB=c("IGKC","JCHAIN","IGHA1","IGLC1","IGLC2","IGLC3"),
  MatB=c("CD22","CD37","CD79B","FCRL1","LTB","DERL3","IGHG4"),
  CD3T=c("CD8A","CD8B","CD3D","CD3G","TRAC","IL32","TRBC1","TRBC2"),
  Hep=c("CYP1A2","CYP2E1","CYP3A4","GLUL","DCXR","FTL","GPX2","GSTA1","CYP2A7","FABP1","HAL","AGT","ALDOB","SDS"),
  LSEC=c("FCN2","CLEC1B","CLEC4G","PVALB","S100A13","GJA5","SPARCL1","CLEC14A","PLVAP","EGR3"),
  Eryth=c("HBB","HBA1","HBA2"),
  NKT=c("CSTW","IL7R","GZMB","GZMH","TBX21","HOPX","PRF1","S100B","TRDC","TRGC1","TRGC2","IL2RB","KLRB1","NCR1","NKG7","NCAM1","XCL2","XCL1","CD160","KLRC1"),
  Mac=c("VCAN","S100A8","MNDA","LYZ","FCN1","CXCL8","VCAN","VCAM1","TTYH3","TIMD4","SLC40A1","RAB31","MARCO","HMOX1","C1QC"),
  Chol=c("PROM1","SOX9","KRT7","KRT19","CFTR","EPCAM","CLDN4","CLDN7","ANXA4","TACSTD2"),
  Stel=c("ACTA2","COL1A1","RBP1","TAGLN","ADAMTSL2","GEM","LOXL1","LUM"),
  Endo=c("PECAM1","TAGLN","VWF","FLT1","MMRN1","RSPO3","LYPD2","LTC4S","TSHZ2","IL1R1")
)
So in short, I need to solve this issue and any help will be appreciated.
Thank you
I've tried running the automated workflow on my dataset and am getting the below message.
I'm very new to bioinformatics and scRNA-seq analysis and am wondering how to proceed. What should I do to check whether this is "real" before moving on and correcting the expression profile?
I've been trying to do some of the visual sanity checks, such as those mentioned in the vignette, but it seems I first need to use the "manual method" to estimate the contamination fraction. However, after reading through the vignette several times I'm still confused about the exact code I need to run. I keep running into the error "'x' must be an array of at least two dimensions".
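For reference, the manual method comes down to roughly the following. This is a sketch on the scToy toy data bundled with SoupX, not a diagnosis of the error above: the HB gene set is an example choice, and forceAccept = TRUE is set only so the sketch runs through regardless of the estimate. Note that useToEst is a logical matrix (cells by gene sets) as returned by estimateNonExpressingCells, and that matrix is what gets passed on.

```r
library(SoupX)

# Toy SoupChannel bundled with SoupX; substitute your own object.
data(scToy)

# Gene sets assumed NOT to be genuinely expressed in most cells (example set).
geneList <- list(HB = c("HBB", "HBA2"))

# Step 1: decide which cells can be used for estimation with each gene set.
# The result is a logical matrix with one row per cell, one column per set.
useToEst <- estimateNonExpressingCells(scToy, nonExpressedGeneList = geneList)
stopifnot(is.matrix(useToEst))

# Step 2: estimate the contamination fraction from those cells.
sc <- calculateContaminationFraction(scToy, geneList, useToEst = useToEst,
                                     forceAccept = TRUE)
print(unique(sc$metaData$rho))

# Step 3: produce the corrected count matrix.
out <- adjustCounts(sc)
```

With your own data, replace scToy with your SoupChannel and geneList with gene sets appropriate to your tissue.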