csbl-usp / CEMiTool

Co-Expression Module Identification Tool (CEMiTool) official repository

Running CEMiTool with multiple factors/conditions #62

Closed by akcorut 3 years ago

akcorut commented 3 years ago

Hi all,

First of all, thank you for making a very useful package. I have a question about running CEMiTool on a dataset that contains multiple conditions/groups. I'm trying to use your tool on expression data with a sample annotation file like the one below:

Sample Annotation

SampleName | Tissue | Location | Genotype | Time_point | Class
-- | -- | -- | -- | -- | --
IMAX | meristem | n | GTC20 | T1 | GTC20_T1_n
IMAC | meristem | n | GTC20 | T1 | GTC20_T1_n
ILZH | meristem | n | GTC20 | T1 | GTC20_T1_n
IMBR | meristem | n | Tifrunner | T1 | Tifrunner_T1_n
ILXT | meristem | n | Tifrunner | T1 | Tifrunner_T1_n
ILWZ | meristem | n | Tifrunner | T1 | Tifrunner_T1_n
IMAD | meristem | n | GTC20 | T2 | GTC20_T2_n
ILZI | meristem | n | GTC20 | T2 | GTC20_T2_n
ILYN | meristem | n | GTC20 | T2 | GTC20_T2_n
IMBS | meristem | n | Tifrunner | T2 | Tifrunner_T2_n
IMAY | meristem | n | Tifrunner | T2 | Tifrunner_T2_n
ILXA | meristem | n | Tifrunner | T2 | Tifrunner_T2_n
IMBX | meristem | n1 | GTC20 | T2 | GTC20_T2_n1
IMBC | meristem | n1 | GTC20 | T2 | GTC20_T2_n1
IMAH | meristem | n1 | GTC20 | T2 | GTC20_T2_n1
ILYT | meristem | n1 | Tifrunner | T2 | Tifrunner_T2_n1
ILXZ | meristem | n1 | Tifrunner | T2 | Tifrunner_T2_n1
ILXE | meristem | n1 | Tifrunner | T2 | Tifrunner_T2_n1
ILZJ | meristem | n | GTC20 | T3 | GTC20_T3_n
ILYP | meristem | n | GTC20 | T3 | GTC20_T3_n
ILXU | meristem | n | GTC20 | T3 | GTC20_T3_n
IMBT | meristem | n | Tifrunner | T3 | Tifrunner_T3_n
IMAZ | meristem | n | Tifrunner | T3 | Tifrunner_T3_n
IMAE | meristem | n | Tifrunner | T3 | Tifrunner_T3_n
IMBD | meristem | n1 | GTC20 | T3 | GTC20_T3_n1
IMAI | meristem | n1 | GTC20 | T3 | GTC20_T3_n1
ILZN | meristem | n1 | GTC20 | T3 | GTC20_T3_n1
IMBY | meristem | n1 | Tifrunner | T3 | Tifrunner_T3_n1
ILYA | meristem | n1 | Tifrunner | T3 | Tifrunner_T3_n1
ILXF | meristem | n1 | Tifrunner | T3 | Tifrunner_T3_n1

... (66 samples in total)

I created a Class column in the sample annotation file (as you can see above) by merging all factors into one column, and I am currently using that column as my class_column. Do you think this is the best way to run CEMiTool on this dataset, or are there other ways to produce more meaningful results with this type of dataset?
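For readers following along, the merged column described above could be built roughly as follows. This is a minimal base-R sketch, not the poster's actual code: the four-row data frame is a hypothetical subset of the annotation table, and the order of the pasted factors (Genotype, Time_point, Location) is inferred from labels such as GTC20_T1_n.

```r
# Hypothetical four-row subset of the annotation table shown above
samples_meristem <- data.frame(
  SampleName = c("IMAX", "IMBR", "IMBX", "ILXF"),
  Genotype   = c("GTC20", "Tifrunner", "GTC20", "Tifrunner"),
  Time_point = c("T1", "T1", "T2", "T3"),
  Location   = c("n", "n", "n1", "n1"),
  stringsAsFactors = FALSE
)

# Merge the three factors into a single label per sample, e.g. "GTC20_T1_n"
samples_meristem$Class <- with(
  samples_meristem,
  paste(Genotype, Time_point, Location, sep = "_")
)

print(samples_meristem[, c("SampleName", "Class")])
```

Each distinct combination of factor levels becomes its own class, so the number of classes grows with every factor that is merged in.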

I'm running CEMiTool as follows:

# Perform CEMiTool
cem_all_meristem <- cemitool(count_data_meristem, samples_meristem, filter = TRUE, filter_pval = 0.05, apply_vst = TRUE, verbose = TRUE)

Thank you for your time! Kivanc

pedrostrusso commented 3 years ago

Hi @akcorut, yeah, unfortunately CEMiTool isn't able to handle multiple factors like that. Your approach will probably work, but I can't vouch for the results you'll get with such a low number of samples per class.
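The concern about class size can be checked directly by tabulating the labels. The sketch below is illustrative only: it assumes the labels of the first 12 rows of the annotation table above, and the per-genotype count is shown purely for comparison, not as a recommendation made in this thread.

```r
# Class labels for the first 12 rows of the annotation table above
# (three replicates per merged class)
class_labels <- rep(
  c("GTC20_T1_n", "Tifrunner_T1_n", "GTC20_T2_n", "Tifrunner_T2_n"),
  each = 3
)

# Recover the genotype by dropping everything after the first underscore
genotype <- sub("_.*$", "", class_labels)

# Count samples per merged class and per single factor
per_class    <- table(class_labels)
per_genotype <- table(genotype)

print(per_class)
print(per_genotype)
```

With the merged column, every class holds only its three replicates, whereas grouping by a single factor pools more samples per class at the cost of ignoring the other factors.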

akcorut commented 3 years ago

Thanks for the reply, @pedrostrusso. That's what I suspected. Based on the results I've gotten so far, that approach doesn't seem to give meaningful results.

Thank you for your time.