csbl-usp / CEMiTool

Co-Expression Module Identification Tool (CEMiTool) official repository
CEMiTool Error: Must request at least one colour from a hue palette. #48

Closed ruchiups closed 4 years ago

ruchiups commented 4 years ago

Hi, I am trying to run CEMiTool with a sample annotation file but get the following error:

    cem <- cemitool(edata2.df, targets.ct, verbose = TRUE)
    Including sample annotation ...
    Plotting diagnostic plots ...
    ...Plotting mean and variance scatterplot ...
    ...Plotting expression histogram ...
    ...Plotting qq plot ...
    ...Plotting sample tree ...
    Error: Must request at least one colour from a hue palette.

Since the program runs, I guess the input files are correct. Please help; I can't really identify the issue.

targets.ct relevant columns look like this:

    SampleName  Class
    ABP73-2     dis.Untr
    ABP73-3     dis.cll
    ABP73-4     dis.ge
    ABP73-5     dis.wt
    ABP73-6     dis.gsk
    ABP73-17    Normal.Untr
    ABP73-18    Normal.cll
    ABP73-19    Normal.ge
    ABP73-20    Normal.wt
    ABP73-21    Normal.gsk

and the expression file (edata2.df) like this:

             ABP73-2   ABP73-3   ABP73-4   ABP73-5
    RFC2     5.162862  4.871644  5.620174  5.257355
    HSPA6    6.130446  6.914137  6.893282  5.951551

pedrostrusso commented 4 years ago

Hi @ruchiups, thanks for using CEMiTool. A quick workaround for this issue just to get you up and running is to set the plot_diagnostics parameter to FALSE, and things should work as expected.

Now, to try looking for a fix. It looks like the Class column in your sample annotation file isn't a grouping variable; instead, its values are all unique. Ideally, the column would have the same string for a group of samples, for example, "dis" for samples ABP73-2, ABP73-3, ABP73-4, ABP73-5, ABP73-6 and "Normal" for samples ABP73-17, ABP73-18, ABP73-19, ABP73-20, ABP73-21.
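As a rough illustration of the regrouping suggested here, the sketch below uses base R only. It rebuilds the ten annotation rows shown in the question, checks whether Class repeats at all, and then keeps the part before the dot as the group. The `Treatment` column and the `all_unique` and `group_sizes` names are hypothetical, introduced only for this example.

```r
# Rebuild the ten annotation rows shown in the question (snapshot only,
# not the full sample annotation).
targets.ct <- data.frame(
  SampleName = c("ABP73-2", "ABP73-3", "ABP73-4", "ABP73-5", "ABP73-6",
                 "ABP73-17", "ABP73-18", "ABP73-19", "ABP73-20", "ABP73-21"),
  Class = c("dis.Untr", "dis.cll", "dis.ge", "dis.wt", "dis.gsk",
            "Normal.Untr", "Normal.cll", "Normal.ge", "Normal.wt", "Normal.gsk"),
  stringsAsFactors = FALSE
)

# A grouping variable should repeat across samples; here no value does.
all_unique <- !any(duplicated(targets.ct$Class))

# Keep the treatment in its own (hypothetical) column, then reduce Class
# to the part before the first dot: "dis" or "Normal".
targets.ct$Treatment <- sub("^[^.]*\\.", "", targets.ct$Class)
targets.ct$Class <- sub("\\..*$", "", targets.ct$Class)

# Number of samples per group after regrouping.
group_sizes <- table(targets.ct$Class)
print(group_sizes)
```

Whether to group by disease status, by treatment, or by both depends on the question being asked of the data; the split on the dot is just one option.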

Also, please note that CEMiTool is designed to work with, at the very least, around 20 or so samples. Do you have more samples in your data?

ruchiups commented 4 years ago

@pedrostrusso, thank you very much for your prompt response. Setting plot_diagnostics to FALSE worked. But it seems there are other issues now; I get the following message:

    Could not specify the parameter Beta. No modules found.
    Plotting beta x R squared curve ...
    Plotting mean connectivity curve ...
    Unable to find parameter beta. Please re-run the cemitool function setting plot_diagnostics=TRUE and check diagnostic plots with function diagnostic_report().

I only shared a small snapshot of my samples. I have 183 samples in total, split as follows:

              ge   wt   gsk  cll  Untr
    dis       21   21   21   21   21
    Normal    16   16   16   14   16

At least all the subsets in the dis group are over 20, if only just. Gene expression data is available for 20150 genes.

Any insight into why beta could not be specified? Any suggestions on how to fix it?

pedrostrusso commented 4 years ago

Hi @ruchiups, is this RNA-seq data or microarray? Also, a possible quick fix might be to try setting the apply_vst parameter to TRUE. This removes a possible dependence between the mean and the variance in the data by applying a variance-stabilizing transformation. You can see if this dependence occurs in your data by running plot_mean_var on the cemitool object returned by the failed run you sent above.
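To make the idea of a mean-variance dependence concrete, here is a minimal base R sketch on simulated data. It is not the poster's data and not necessarily the computation that plot_mean_var performs; it only regresses per-gene variance on per-gene mean and reports the R2 of that fit. All object names (`expr`, `gene_mean`, `gene_var`, `r2`) are made up for the example.

```r
set.seed(1)
n_genes <- 200
n_samples <- 20

# Simulated count-like matrix (genes in rows, samples in columns).
# Poisson counts are used because their variance grows with the mean.
lambda <- seq(1, 200, length.out = n_genes)
expr <- matrix(rpois(n_genes * n_samples, lambda = rep(lambda, times = n_samples)),
               nrow = n_genes, ncol = n_samples)

# Per-gene mean and variance across samples.
gene_mean <- rowMeans(expr)
gene_var <- apply(expr, 1, var)

# Linear fit of variance on mean; a high R2 suggests the variance depends
# on the mean, a low R2 suggests little dependence.
fit <- lm(gene_var ~ gene_mean)
r2 <- summary(fit)$r.squared
cat("R2 of variance ~ mean:", round(r2, 2), "\n")
```

A low R2 on real data would suggest that a variance-stabilizing step has little to correct.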

ruchiups commented 4 years ago

@pedrostrusso it is microarray data

ruchiups commented 4 years ago

Hi @pedrostrusso, I ran the suggested option as below:

    cem <- cemitool(edata2.df, targets.ct, apply_vst = TRUE, verbose = TRUE)
    Error in if (x == 0 && a == 0) return(1) :
      missing value where TRUE/FALSE needed
    In addition: Warning messages:
    1: In sqrt(r) : NaNs produced
    2: In sqrt(expr/r) : NaNs produced

plot_mean_var: R2 is very low, so there doesn't seem to be a very strong relationship (see mean_var.pdf).

Instead, I used the option below to find modules, and it seems to have run without error. Is this okay? If yes, this will at least allow me to understand CEMiTool better. Please let me know.

cem <- find_modules(cem, force_beta = TRUE, min_ngen = 5)

pedrostrusso commented 4 years ago

Hi @ruchiups, yes, this is fine, analysis-wise. Note, however, that the force_beta argument makes the beta selection step follow step 6 of the WGCNA FAQ instead of the usual automatic selection method.

It might be best to first take a look at the result of cem <- plot_beta_r2(cem) and see if the curve has a reasonable tendency to stabilize at a relatively high value (something around at least 0.7). An example of an ideal outcome can be seen in Fig. 9 of our paper. If so, then your data are at least moderately adherent to the scale-free model, and setting the force_beta argument to TRUE shouldn't be an issue, as long as you're comfortable with the caveats presented in the FAQ I linked to above.
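The 0.7 rule of thumb can be sketched in a few lines of base R. This is only an illustration with made-up numbers, not CEMiTool's or WGCNA's selection algorithm: the `fit` table, its `beta` and `r2` columns, and the `pick_beta` helper are all hypothetical, and the check only looks for the first beta that reaches the threshold rather than testing whether the curve has truly plateaued.

```r
# Hypothetical fit table: one row per candidate beta with the R2 of the
# scale-free model fit. The values are invented for illustration.
fit <- data.frame(
  beta = 1:10,
  r2   = c(0.10, 0.25, 0.42, 0.55, 0.66, 0.72, 0.78, 0.80, 0.81, 0.81)
)

# Return the smallest beta whose R2 reaches the threshold, or NA if the
# curve never gets that high.
pick_beta <- function(fit, threshold = 0.7) {
  ok <- fit$beta[fit$r2 >= threshold]
  if (length(ok) == 0) NA_integer_ else min(ok)
}

# A curve that rises past 0.7 yields a candidate beta.
chosen <- pick_beta(fit)

# A curve that stays low (here, the same values halved) yields NA.
flat <- pick_beta(transform(fit, r2 = r2 / 2))

print(chosen)
print(flat)
```

An NA here corresponds loosely to the situation described above, where the curve never reaches a reasonably high value and forcing a beta deserves more caution.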