xlucpu / MOVICS

Multi-Omics integration and VIsualization in Cancer Subtyping

a conda version of this package? #4

Closed worker000000 closed 3 years ago

worker000000 commented 3 years ago

Thanks for such a wonderful package. Is there a conda version, and can it be installed under R 3.6.3? I see your manual says it needs R 4.0.1.

xlucpu commented 3 years ago

Hi, there is no conda version, and you need R 4.0.1 or later because some of our dependencies require that R version.

worker000000 commented 3 years ago

Thanks a lot, really cool. I tried many times to install it, in the morning and at night, but it always fails with errors like this. Can you help me?

[image]

Does this mean I also need to use install_local for these two GitHub repos?

worker000000 commented 3 years ago

Thanks a lot. By the way: (1) If I only have RNA-seq or another single sequencing data type, can I use this tool? (2) I wonder whether you have precomputed everything, so that I only need to prepare the raw FPKM and clinical data? Is there a more detailed about the

xlucpu commented 3 years ago

> Thanks a lot, really cool. I tried many times to install it, in the morning and at night, but it always fails with errors like this. Can you help me? [image]
>
> Does this mean I also need to use install_local for these two GitHub repos?

I suggest installing this package with install_github instead of install_local. Because there are many dependencies to install first, you can check the DESCRIPTION file for the list and install them beforehand. If any dependency fails to install, the MOVICS installation will fail too.
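A minimal base-R sketch of the "check DESCRIPTION first" step described above. The helper `missing_deps` is hypothetical (it is not part of MOVICS or devtools), and the toy DESCRIPTION written here is an illustration, not the real file. It only reads the Depends and Imports fields; packages hosted on GitHub would still need to be installed from their own repositories.

```r
# List packages named in a DESCRIPTION file that are not installed yet.
# `missing_deps` is a hypothetical helper, not part of MOVICS or devtools.
missing_deps <- function(description_path) {
  # read.dcf parses the "Field: value" format used by DESCRIPTION files
  desc <- read.dcf(description_path, fields = c("Depends", "Imports"))
  fields <- desc[1, ]
  fields <- fields[!is.na(fields)]
  deps <- unlist(strsplit(fields, ","), use.names = FALSE)
  deps <- gsub("\\([^)]*\\)", "", deps)     # drop version constraints such as (>= 1.0)
  deps <- trimws(deps)
  deps <- deps[nzchar(deps) & deps != "R"]  # R itself is not a package
  setdiff(deps, rownames(installed.packages()))
}

# Toy DESCRIPTION for illustration only; the real file lists many more packages.
toy_description <- tempfile()
writeLines(c(
  "Package: demo",
  "Depends: R (>= 4.0.1)",
  "Imports: stats, notARealPackageXyz (>= 1.0)"
), toy_description)

print(missing_deps(toy_description))
```

Running the same helper on the DESCRIPTION file from a local copy of the repository would give a list of packages to install before retrying install_github.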

xlucpu commented 3 years ago

> Thanks a lot. By the way: (1) If I only have RNA-seq or another single sequencing data type, can I use this tool? (2) I wonder whether you have precomputed everything, so that I only need to prepare the raw FPKM and clinical data? Is there a more detailed about the

If you would like to use MOVICS for multi-omics clustering, you need to provide at least two omics data types. But if you already have subtypes derived from this FPKM data using another single-omics approach (e.g., k-means, hclust), you can refer to the Little Trick section of the HTML vignette to see how to perform the downstream analysis.
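A rough sketch of what wrapping externally derived subtypes could look like, using k-means on toy data. The object layout below (a list holding a `clust.res` data frame with `samID` and `clust` columns, plus a `mo.method` string) is an assumption based on a recollection of the vignette. Check it against the Little Trick section before relying on it. The data and the names `fpkm` and `pseudo.moic.res` are illustrative.

```r
# Toy expression matrix: 50 features (rows) x 12 samples (columns).
set.seed(1)
fpkm <- matrix(rnorm(50 * 12), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("sample", 1:12)))
fpkm[, 7:12] <- fpkm[, 7:12] + 5  # shift half the samples to create two groups

# Single-omics subtypes from k-means; kmeans() expects samples in rows.
km <- kmeans(t(fpkm), centers = 2, nstart = 10)

# ASSUMED layout of the object that downstream functions accept:
# a `clust.res` data frame (samID, clust) and a `mo.method` label.
pseudo.moic.res <- list(
  clust.res = data.frame(samID = colnames(fpkm),
                         clust = as.integer(km$cluster),
                         row.names = colnames(fpkm),
                         stringsAsFactors = FALSE),
  mo.method = "kmeans"
)

str(pseudo.moic.res)
```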

worker000000 commented 3 years ago

Thanks a lot. So you mean this tool can be used for single omics?

install_github and install_local do not change how dependencies are handled; they differ only in how the repo files are downloaded. I have used commands like conda install r-usethis to install the required packages, but the dependencies hosted on GitHub are the hard part.

[image]

This is hard for me.

xlucpu commented 3 years ago

> Thanks a lot. So you mean this tool can be used for single omics?
>
> install_github and install_local do not change how dependencies are handled; they differ only in how the repo files are downloaded.

It seems that you failed to install the dependencies that are hosted on GitHub (CMScaller, ComplexHeatmap); maybe you could install these packages separately first. Yes, the downstream analyses are independent of multi-omics clustering, as long as you can build a suitable input object that the functions in the COMP and RUN modules will accept. However, if you want to get subtypes from MOVICS, at least two omics data types must be provided.

worker000000 commented 3 years ago

Will the two omics affect each other? Thanks a lot.

xlucpu commented 3 years ago

> Will the two omics affect each other? Thanks a lot.

That is what we call multi-omics integrative clustering. We want them to affect each other.

xlucpu commented 3 years ago

> Thanks a lot. So you mean this tool can be used for single omics?
>
> install_github and install_local do not change how dependencies are handled; they differ only in how the repo files are downloaded. I have used commands like conda install r-usethis to install the required packages, but the dependencies hosted on GitHub are the hard part.
>
> [image]
>
> This is hard for me.

What operating system are you using to install MOVICS? I have only tested MOVICS on Windows and macOS.

worker000000 commented 3 years ago

Linux version 4.14.181-140.257.amzn2.x86_64 (xxx@ip-xx-x-xxx-xx) (gcc version 7.3.1 20180712 (Red Hat 7.3.1-6) (GCC))
thanks a lot

xlucpu commented 3 years ago

> Linux version 4.14.181-140.257.amzn2.x86_64 (xxx@ip-xx-x-xxx-xx) (gcc version 7.3.1 20180712 (Red Hat 7.3.1-6) (GCC))
>
> Thanks a lot.

Sorry, I am not sure whether MOVICS can be installed on Linux, because I do not know if all the dependencies install successfully on that system. If you have another system available, I suggest using this package on Windows or macOS.

worker000000 commented 3 years ago

Thanks a lot. I installed it on Linux because I guess it will use a lot of resources. Have you measured the resources required for a real breast cancer multi-omics dataset, rather than the toy data in the manual, and how long it takes to produce the final result?
By the way, for single-omics subtyping, which clustering method do you suggest?

xlucpu commented 3 years ago

> Thanks a lot. I installed it on Linux because I guess it will use a lot of resources. Have you measured the resources required for a real breast cancer multi-omics dataset, rather than the toy data in the manual, and how long it takes to produce the final result?
>
> By the way, for single-omics subtyping, which clustering method do you suggest?

The demo data is real breast cancer data. It might take 2-3 hours to finish on a Windows system (16 GB). I suggest consensus hclust or NMF for single omics.
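To make the "consensus hclust" suggestion concrete, here is a simplified resampling-based consensus clustering written in base R. It is a sketch of the general idea, not the procedure MOVICS or any dedicated package (such as ConsensusClusterPlus) uses. The function name `consensus_hclust` and all parameter defaults are assumptions chosen for illustration.

```r
# Simplified consensus hierarchical clustering (illustrative sketch only).
# x: numeric matrix with features in rows and samples in columns.
consensus_hclust <- function(x, k, n_resamples = 50, prop = 0.8, seed = 1) {
  set.seed(seed)
  n <- ncol(x)
  together <- matrix(0, n, n)  # times two samples fell in the same cluster
  sampled  <- matrix(0, n, n)  # times two samples were drawn together
  for (i in seq_len(n_resamples)) {
    idx <- sample(n, size = floor(prop * n))          # subsample the samples
    tree <- hclust(dist(t(x[, idx, drop = FALSE])), method = "average")
    labels <- cutree(tree, k)
    together[idx, idx] <- together[idx, idx] + outer(labels, labels, "==")
    sampled[idx, idx]  <- sampled[idx, idx] + 1
  }
  consensus <- together / pmax(sampled, 1)            # co-clustering frequency
  diag(consensus) <- 1
  # Final partition: cluster the samples on 1 - consensus as a distance.
  final <- cutree(hclust(as.dist(1 - consensus), method = "average"), k)
  names(final) <- colnames(x)
  list(consensus = consensus, clust = final)
}

# Toy data: two well-separated groups of 10 samples each.
set.seed(42)
toy_expr <- cbind(matrix(rnorm(200, mean = 0), 20, 10),
                  matrix(rnorm(200, mean = 10), 20, 10))
res <- consensus_hclust(toy_expr, k = 2)
table(res$clust)
```

The consensus matrix records how often each pair of samples clusters together across resamples, so pairs with values near 0 or 1 indicate a stable partition.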

worker000000 commented 3 years ago

Thanks a lot. I mean that the TCGA breast cancer data is big if it is not filtered, because you

> first pre-processed to extract top 500 mRNAs, 500 lncRNA, 1,000 promoter CGI probes/genes with high variation using statistics of median absolute deviation (MAD), and 30 genes that mutated in at least 3% of the entire cohort.

But if I do the same as you, I guess the editor may ask me why I selected features this way. So should I input all the data, or do you have some experience with feature selection?

2. Have you tried using all the data? What are the time and resource costs?

xlucpu commented 3 years ago

> Thanks a lot. I mean that the TCGA breast cancer data is big if it is not filtered, because you
>
> > first pre-processed to extract top 500 mRNAs, 500 lncRNA, 1,000 promoter CGI probes/genes with high variation using statistics of median absolute deviation (MAD), and 30 genes that mutated in at least 3% of the entire cohort.
>
> But if I do the same as you, I guess the editor may ask me why I selected features this way. So should I input all the data, or do you have some experience with feature selection? 2. Have you tried using all the data? What are the time and resource costs?

Using all the data is not necessary for clustering analysis; many constant or flat values will be overwhelmed by other "useful" signals. I did not test the time cost, but if you use all the data, I can assure you it will take a rather long time.
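The MAD-based filtering discussed here can be illustrated in a few lines of base R. This is a generic sketch of ranking features by median absolute deviation and keeping the top ones, not the package's own code. The helper name `top_mad_features` and the toy matrix are made up for the example.

```r
# Keep the n_top most variable features, ranked by median absolute deviation.
# x: numeric matrix with features in rows and samples in columns.
top_mad_features <- function(x, n_top) {
  mads <- apply(x, 1, mad, na.rm = TRUE)
  keep <- order(mads, decreasing = TRUE)[seq_len(min(n_top, nrow(x)))]
  x[keep, , drop = FALSE]
}

# Toy matrix: a constant row, a mildly variable row, a highly variable row.
toy_mat <- rbind(flat = c(5, 5, 5, 5, 5),
                 mild = c(1, 2, 3, 4, 5),
                 wild = c(0, 10, 20, 30, 40))
rownames(top_mad_features(toy_mat, 2))  # "wild" "mild"
```

The constant row has a MAD of zero and is dropped first, which is the sense in which flat features contribute nothing to separating samples.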

worker000000 commented 3 years ago

Thanks a lot. So do you have some experience with selecting data? You know, deep learning, for example, does not care about the input; the more the better, and it does not care about constant or flat values.

xlucpu commented 3 years ago

> Thanks a lot. So do you have some experience with selecting data? You know, deep learning, for example, does not care about the input; the more the better, and it does not care about constant or flat values.

Data selection is included in MOVICS.

worker000000 commented 3 years ago

Thanks a lot. So do you mean I can input all the unprocessed data? For example, I do not need to select the significant genes from the FPKM data, and I can keep all the other data as is?

xlucpu commented 3 years ago

> Thanks a lot. So do you mean I can input all the unprocessed data? For example, I do not need to select the significant genes from the FPKM data, and I can keep all the other data as is?

Yes, you can, but I would not recommend this.

worker000000 commented 3 years ago

So how should I prepare the input data? I am a little confused about what "this" refers to in your reply.

Thanks a lot.

xlucpu commented 3 years ago

> So how should I prepare the input data? I am a little confused about what "this" refers to in your reply.
>
> Thanks a lot.

Please refer to the vignette and use getElites to prepare input data that suits your clinical or biological background.
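Preparing input usually means a different filter for each data type. As one illustration, the mutation filter quoted earlier in the thread (genes mutated in at least 3% of the cohort) can be sketched in base R as below. This is not the getElites implementation; the helper `frequent_mutations`, its `min_freq` parameter, and the toy matrix are assumptions for the example.

```r
# Keep genes mutated in at least `min_freq` of the samples.
# mut: binary matrix (genes in rows, samples in columns), 1 = mutated.
frequent_mutations <- function(mut, min_freq = 0.03) {
  freq <- rowMeans(mut != 0, na.rm = TRUE)  # fraction of mutated samples per gene
  mut[freq >= min_freq, , drop = FALSE]
}

# Toy cohort of 100 samples: geneA mutated in 10, geneB in 2, geneC in 5.
toy_mut <- rbind(geneA = rep(c(1, 0), times = c(10, 90)),
                 geneB = rep(c(1, 0), times = c(2, 98)),
                 geneC = rep(c(1, 0), times = c(5, 95)))
rownames(frequent_mutations(toy_mut))  # "geneA" "geneC"
```

The threshold is a judgment call tied to cohort size and the biological question, which is presumably why the advice is to choose filters that fit your own background.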

worker000000 commented 3 years ago

thanks a lot