csbl-usp / CEMiTool

Co-Expression Module Identification Tool (CEMiTool) official repository
22 stars 9 forks source link

No diagnostics #36

Closed cagias closed 4 years ago

cagias commented 5 years ago

Hello, I already know that standard beta value won't work with my data, so I intend to require diagnostic_report(cem) to manually set it. However, I'm facing a problem with the graphs making:

counts <- read.table("DatExpr.txt", header = TRUE) >row.names(counts) <- counts$genes >counts <- counts[-c(1)] >sample_annot <- read.table("cemitool-phenotypes.tsv", header = TRUE, sep = "\t") >gene_sets <- read_gmt("blast2go_gene_sets.gmt") > cem <- cemitool(counts, sample_annot, gene_sets, apply_vst = TRUE, filter = TRUE, force_beta = TRUE, sample_name_column = 'SampleName', class_column = 'Class', plot = TRUE, plot_diagnostics = TRUE, verbose = TRUE) Including sample annotation ... Plotting diagnostic plots ... ...Plotting mean and variance scatterplot ... ...Plotting expression histogram ... ...Plotting qq plot ... ...Plotting sample tree ... Error in dev.off() : cannot shut down device 1 (the null device) >diagnostic_report(cem)

Above, a few lines from my input data: >head(counts) #20223x15

Ac0h Ac12h Ac24h At12h At24h Bc0h Bc12h Bc24h Bt12h Bt24h Cc0h Cc12h Cc24h Ct12h Ct24h
gene20201 3 2 0 1 1 2 4 1 2 0 1 3 1 0 0
gene20203 41 14 3 8 15 14 8 3 9 4 11 10 8 9 3
gene20205 25 3 5 14 21 7 5 4 6 11 2 5 7 6 6
gene20206 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
gene20209 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
gene20210 11 9 2 1 12 0 4 2 4 2 2 2 3 3 2

>head(sample_annot)#2x15

SampleName Class
Ac0h Control 0h
Bc0h Control 0h
Cc0h Control 0h
Ac12h Control 12h
Bc12h Control 12h
Cc12h Control 12h

>head(gene_sets) #136660x2

term gene
GO:0005344 gene2972
GO:0015749 gene19777
GO:0015749 gene9398
GO:0051188 gene3204
GO:0051188 gene12046
GO:0051188 gene3706

I suppose there's something wrong with some of my data (although I've checked them several times), cem object is not right therefore when I call diagnostic_report(cem) graphs can't be made thus there's no "device" to shut down, resulting in a html file with messages "Please create [...] plot" instead of graphs; but how can I be sure about it?

Best,

Carol

pedrostrusso commented 5 years ago

Hi @cagias, thanks for using CEMiTool. I created a mock example using the data you provided, only extending the sample_annotfile so it corresponded to all the sample names in your counts file, and I was unable to reproduce your error. Specifically, the plot_sample_tree function is the part in the code giving the error you have, so use the template below and see if you still get an error with your data, and if so, please provide a dataset with which I can reproduce your error.

cem <- new('CEMiTool', expression=counts, annot=sample_annot, sample_name_column="SampleName", class_column="Class", parameters=list(filter=FALSE, apply_vst=FALSE, filter_pval=0.5))

cem <- plot_sample_tree(cem)

show_plot(cem, "sample_tree")

Moreover, please note that your counts file should ideally be pre-processed and normalized before submitting to CEMiTool. Finally, since you intend to determine the beta parameter via the diagnostics report, you should set the force_beta argument to FALSE

cagias commented 4 years ago

Hello again @pedrostrusso , sorry for the late answer.

I couldn't run the code you suggested, getting the following message: Error in checkSlotAssignment(object, name, value) : ‘annot’ is not a slot in class “CEMiTool”

Nevertheless, I was able to overcome that error message I was getting before (Error in dev.off() : cannot shut down device 1 (the null device)) by simply closing all the devices before running the code again (dev.set(dev.next())).

I chose to keep force_beta as TRUE to see if it would work, and it did! So I finally succeeded running CEMiTool although I still have some doubts. I run the code like the following:

sample_annot <- read.table("cemitool-phenotypes.tsv", header = TRUE, sep = "\t")
gene_sets <- read_gmt("blast2go_gene_sets.gmt")

#load expression info and apply vst 
library(DESeq2)
countsTable <- read.delim('condition_table_all', row.names=NULL, header=TRUE)
sampleTable <- data.frame(sampleName = countsTable$SampleName, fileName = countsTable$FileName, condition = countsTable$condition)
ddsGR <-DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = 'D:/CEMiTool/', design = ~ condition )
ddsGR <- estimateSizeFactors(ddsGR)
ddsGR <- estimateDispersions(ddsGR)
vstData <- getVarianceStabilizedData(ddsGR)
#turn into data frame 
counts = as.data.frame(vstData)

#run cemitool
cem <- cemitool(counts, sample_annot, gene_sets, apply_vst = FALSE, force_beta = TRUE, filter = TRUE, sample_name_column = 'SampleName', 
                class_column = 'Class', plot = TRUE, plot_diagnostics = TRUE, verbose = TRUE)
cem
# inspect modules
nmodules(cem)
head(module_genes(cem))

library(ggplot2)
# create report as html document
generate_report(cem, directory="./Report", force=TRUE)
# write analysis results into files
write_files(cem, directory="./Tables", force=TRUE)
# save all plots
save_plots(cem, "all", directory="./Plots", force=TRUE)

diagnostic_report(cem, force=TRUE)
  1. Is varianceStabilizingTransformation from DeSeq2 package an acceptable normalization?
  2. Is there any way to get interaction info from CEMiTool to create the network like in this vignette? Is it possible to use the Cytoscape input file from WGCNA?
pedrostrusso commented 4 years ago

Hi @cagias, Sorry, I mistyped the argument name, it should have been sample_annot, not annot. Anyway, I'm glad you were able to solve your original problem. As for the force_beta argument, it works because as the name implies, it forces a beta argument, even if it isn't necessarily the best. I would strongly recommend you first try running the analysis with force_beta = FALSE, and only if the program is unable to determine a suitable beta parameter, then try forcing it.

Any transformation is adequate for CEMiTool as long as they don't interfere with the correlation values between genes. The vst from DESeq2 is fine. The interaction data from that vignette is meant as an example only, and we do not provide data for executing these analyses. This data in particular was obtained from the GeneMANIA database.

cagias commented 4 years ago

Hi @pedrostrusso,

Still not working, gave me the same error message: cem <- new('CEMiTool', expression=counts, sample_annot=sample_annot, interaction=int, sample_name_column="SampleName", class_column="Class", parameters=list(filter=FALSE, apply_vst=FALSE, filter_pval=0.5)) Error in checkSlotAssignment(object, name, value) : ‘sample_annot’ is not a slot in class “CEMiTool” I'm not gonna use it but just for you to know.

I ran again with force_beta = FALSE and it was Unable to find parameter beta. I got the diagnostic report, if I understood right Beta x R2 Plot shows the best beta value might be 10, which was the one chosen when force_beta = TRUE.

I also succeeded on getting the interaction network using as input the columns "fromNode" and "toNode" from the WGCNA Cytoscape file. Thanks.