csbl-usp / CEMiTool

Co-Expression Module Identification Tool (CEMiTool) official repository

Generate_report not working #55

Closed gourivadali closed 3 years ago

gourivadali commented 3 years ago

Hi,

I have been trying to run CEMiTool with my own GMT and interaction files on normalized read-count data, with apply_vst = TRUE. CEMiTool gives me some warning messages:

Warning messages:
1: executing %dopar% sequentially: no parallel backend registered
2: fun.y is deprecated. Use fun instead.
3: fun.y is deprecated. Use fun instead.
4: fun.y is deprecated. Use fun instead.
5: fun.y is deprecated. Use fun instead.
6: fun.y is deprecated. Use fun instead.
7: fun.y is deprecated. Use fun instead.
8: fun.y is deprecated. Use fun instead.
9: In max(abs(nesmelted$NES)) : no non-missing arguments to max; returning -Inf

Since these are warning messages, I tend to ignore them. I am trying to use the "generate_report" function, which gives me this error:

Quitting from lines 60-70 (report.Rmd)
Error in ans[ypos] <- rep(yes, length.out = len)[ypos] :
  replacement has length zero

I am not sure what the issue is here. I am unable to open the report.Rmd file to look into the problem myself. I have run multiple datasets using the same pathway and interaction files and was able to generate the report without any problems. It is only with this one dataset that I am constantly seeing issues. Can someone please help me with this?

-Gouri

pedrostrusso commented 3 years ago

Hi Gouri, thanks for using CEMiTool. That's quite odd. How are you running CEMiTool? I can't find anywhere in the code where we would use ans[ypos] <- rep(yes, length.out = len)[ypos], so it might be a problem in a dependency package. The report.Rmd file is available here if you want to check it out, but nothing like that appears there, particularly at lines 60-70.

It's difficult to help much without a minimal reproducible example, and if this is the only dataset that gives you this error, please make sure that it is formatted correctly. You might also try following the step-by-step instructions as detailed here to narrow down what the problem might be. Please let me know if you make further progress on this issue.

gourivadali commented 3 years ago

Hi, Thank you for your response. I am using TCGA pancreas dataset. The command I am using is -

cemitool(pancreas_count_cemi, sample.annot, gmt_in, interactions = int_df, apply_vst = TRUE, filter_pval = 0.05, gsea_minsize = 7, filter = TRUE, plot = TRUE, verbose = TRUE)

This command runs and creates my cemitool object. It is when I am trying to generate report that it gives me that error. I believe I have installed all dependency packages. I haven't seen any error because of that.

Thanks for providing the step-by-step instructions. I will try it out.

-Gouri

gourivadali commented 3 years ago

I just checked to see where the error might be, and it looks like the plot_gsea function in the "Module enrichment" section is throwing this error:

show_plot(cem_tcga_pancreas, "gsea")
$enrichment_plot
Error in ans[ypos] <- rep(yes, length.out = len)[ypos] :
  replacement has length zero
In addition: Warning message:
In rep(yes, length.out = len) : 'x' is NULL so the result will be NULL
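For context, the failing expression `ans[ypos] <- rep(yes, length.out = len)[ypos]` matches a line in the body of base R's `ifelse()`. The sketch below reproduces the same error and warning with base R alone. It assumes that the `yes` argument ends up being NULL, for example because it refers to a data-frame column that does not exist; whether that is what actually happens inside plot_gsea is not confirmed anywhere in this thread.

```r
# Base R only; CEMiTool is not needed for this illustration.
# ifelse() fills the positions where `test` is TRUE from `yes`.
# When `yes` is NULL, rep() warns and returns NULL, so the
# assignment ans[ypos] <- NULL[ypos] has nothing to insert and fails.
result <- tryCatch(
  withCallingHandlers(
    ifelse(c(TRUE, FALSE), NULL, 0),
    warning = function(w) {
      # Show the warning text, then continue so the error can surface.
      message("warning: ", conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  ),
  error = function(e) conditionMessage(e)
)

# `result` now holds the error message raised by ifelse().
print(result)
```

If this is indeed the mechanism, the place to look would be whichever value is passed as `yes` in the plotting code, rather than report.Rmd itself.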

Do you happen to have any other suggestions to solve this issue?

-Gouri

pedrostrusso commented 3 years ago

Looks like the enrichment plot isn't being created correctly. What is the output of gsea_data(cem_tcga_pancreas) ?

gourivadali commented 3 years ago

It outputs a list of 3 -

$ es  :'data.frame': 6 obs. of 2 variables:
 ..$ pathway      : chr [1:6] "M1" "M2" "M3" "M4" ...
 ..$ Primary_Tumor: num [1:6] 0.209 -0.182 0.18 -0.206 -0.143 ...
$ nes :'data.frame': 6 obs. of 2 variables:
 ..$ pathway      : chr [1:6] "M1" "M2" "M3" "M4" ...
 ..$ Primary_Tumor: num [1:6] 1.151 -0.951 0.961 -1.048 -0.591 ...
$ padj:'data.frame': 6 obs. of 2 variables:
 ..$ pathway      : chr [1:6] "M1" "M2" "M3" "M4" ...
 ..$ Primary_Tumor: num [1:6] 0.469 0.756 0.756 0.756 0.991 ...

-Gouri

pedrostrusso commented 3 years ago

Hi Gouri, would you mind outputting the entire list? I'm trying to see if there are any NA values or things like that that could explain the problem. Thanks.

gourivadali commented 3 years ago

Hi,

I noticed that my annotation file had only 1 phenotypic label in my "Class" column and it was returning this error. When I added two labels in my "Class" column, the program ran smoothly. My question is, do we need to have >1 condition in the "Class" column in the sample annotation file?

gourivadali commented 3 years ago

> Hi Gouri, would you mind outputting the entire list? I'm trying to see if there are any NA values or things like that that could explain the problem. Thanks.

Please see -

$es
         pathway Primary_Tumor Solid_Tissue_Normal Metastatic
1             M1     0.2093344                  NA         NA
2             M2    -0.1822274                  NA         NA
3             M3     0.1795559                  NA         NA
4             M4    -0.2056644                  NA         NA
5             M5    -0.1427583                  NA         NA
6 Not.Correlated    -0.2246011                  NA         NA

$nes
         pathway Primary_Tumor Solid_Tissue_Normal Metastatic
1             M1     1.1504399                  NA         NA
2             M2    -0.9513640                  NA         NA
3             M3     0.9607127                  NA         NA
4             M4    -1.0483097                  NA         NA
5             M5    -0.5912721                  NA         NA
6 Not.Correlated    -0.9258506                  NA         NA

$padj
         pathway Primary_Tumor Solid_Tissue_Normal Metastatic
1             M1     0.4696970                  NA         NA
2             M2     0.7560000                  NA         NA
3             M3     0.7560000                  NA         NA
4             M4     0.7560000                  NA         NA
5             M5     0.9909747                  NA         NA
6 Not.Correlated     0.7560000                  NA         NA

pedrostrusso commented 3 years ago

Hey, I'm glad you got it to run. As for your question, "My question is, do we need to have >1 condition in the "Class" column in the sample annotation file?", no, you can run CEMiTool normally with just one class in the sample annotation file for whatever reason. If I were to take a wild guess, I'd say maybe your "Class" column in your sample annotation file is a factor column, however there were no other values except for "Primary_Tumor", so that's why the other levels ("Solid_Tissue_Normal" and "Metastatic") still appear in the list you output above, with NA values. Anyway, thanks for posting the issue, this seems rather one-offish so I'm going to close the issue for now, might revisit it in the future if it becomes more prevalent.
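The factor-level guess above can be illustrated in base R. This is a minimal sketch on a made-up annotation table: the `sample_annot` data frame, its `SampleName` values, and the use of `droplevels()` are illustrative assumptions, and the thread does not establish whether dropping unused levels fixes the report error.

```r
# Hypothetical sample annotation: every sample is "Primary_Tumor",
# but the factor still carries two levels that never occur.
sample_annot <- data.frame(
  SampleName = c("S1", "S2", "S3"),
  Class = factor(
    rep("Primary_Tumor", 3),
    levels = c("Primary_Tumor", "Solid_Tissue_Normal", "Metastatic")
  )
)

# Unused levels remain attached to the factor and show up with zero counts.
levels_before <- levels(sample_annot$Class)
print(table(sample_annot$Class))

# droplevels() removes levels that have no observations.
sample_annot$Class <- droplevels(sample_annot$Class)
levels_after <- levels(sample_annot$Class)
print(levels_after)
```

Code that loops over `levels(Class)` instead of the values actually present would produce all-NA columns for the unused levels, which is consistent with the NA columns in the list printed earlier.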

gourivadali commented 3 years ago

> Hey, I'm glad you got it to run. As for your question, "My question is, do we need to have >1 condition in the "Class" column in the sample annotation file?", no, you can run CEMiTool normally with just one class in the sample annotation file for whatever reason. If I were to take a wild guess, I'd say maybe your "Class" column in your sample annotation file is a factor column, however there were no other values except for "Primary_Tumor", so that's why the other levels ("Solid_Tissue_Normal" and "Metastatic") still appear in the list you output above, with NA values. Anyway, thanks for posting the issue, this seems rather one-offish so I'm going to close the issue for now, might revisit it in the future if it becomes more prevalent.

Hi,

Unfortunately, the output I printed was not from the right cemitool object. The issue unfortunately persists. The output from the right cemitool object for the gsea_data(cem) is -

$es
  pathway Primary_Tumor
1      M1    -0.1074294
2      M2    -0.2091895
3      M3     0.2289775
4      M4    -0.2381389
5      M5    -0.2103246
6      M6    -0.2831983
7      M7     0.2194988

$nes
  pathway Primary_Tumor
1      M1    -0.5741099
2      M2    -1.1144725
3      M3     1.2427256
4      M4    -1.2121054
5      M5    -0.9502543
6      M6    -1.0740591
7      M7     0.7589697

$padj
  pathway Primary_Tumor
1      M1    1.00000000
2      M2    0.30187722
3      M3    0.08238788
4      M4    0.20160000
5      M5    0.79718805
6      M6    0.56721915
7      M7    0.99895833

I am still stuck at generating report unfortunately.

ruchiups commented 3 years ago

@pedrostrusso I am having the same issue, and for the same reason:

Quitting from lines 63-73 (report.Rmd)
Error in ans[ypos] <- rep(yes, length.out = len)[ypos] :
  replacement has length zero

I have just one class in the sample annotation, so the GSEA plot is not produced, although the table is. Is there some way to suppress or ignore the GSEA plot so that the report can be generated?

Thanks in advance for your help!

pedrostrusso commented 3 years ago

Hi @ruchiups, you can set plot = FALSE in your cemitool call and then manually add your desired plots, for example: cem <- plot_ora(cem, gmt_in), cem <- plot_profile(cem), etc.

ruchiups commented 3 years ago

Yes, of course! Such a simple solution, and it worked perfectly. Thank you very much @pedrostrusso!