gabrielodom / pathwayPCA

integrative pathway analysis with modern PCA methodology and gene selection
https://gabrielodom.github.io/pathwayPCA/

Add options to read_gmt, write_gmt, and CreatePathwayCollection #70

Closed gabrielodom closed 5 years ago

gabrielodom commented 5 years ago

To support coMethDMR and rnaEditr, the read/write functions for .gmt files need to be more flexible. Instead of hard-coding "pathwayXX" as the name of the first list element, we need an option to name it after "pathway", "gene", or "region".

This is per @lxw391's request.

gabrielodom commented 5 years ago

@lxw391, gene symbols or IDs are the units of a pathway. What are the units of a region, CpGs? What are the units of a gene?

gabrielodom commented 5 years ago

We will add the argument setType = c("pathway", "gene", "region") to the read_gmt() and CreatePathwayCollection() functions.

We can replace the internal pathways object with an object called sets_ls.
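A minimal sketch of how a `setType` argument along these lines might behave, using base R's `match.arg()`. The function name `create_collection_sketch()` is hypothetical and is not the package's `CreatePathwayCollection()`. Naming the first element by appending "s" to `setType` is an assumption based on the collections printed elsewhere in this thread.

```r
# Hypothetical helper, not part of pathwayPCA: illustrates a setType argument.
create_collection_sketch <- function(sets_ls, TERMS,
                                     setType = c("pathway", "gene", "region")) {
  # match.arg() picks the first choice by default and rejects unknown values
  setType <- match.arg(setType)

  if (length(sets_ls) != length(TERMS)) {
    stop("Number of sets should match number of TERMS.")
  }

  # Assumption: the first element is named with the plural of setType
  out_ls <- list(sets_ls, TERMS)
  names(out_ls) <- c(paste0(setType, "s"), "TERMS")
  out_ls
}

# Toy input with made-up labels
toy_regions <- create_collection_sketch(
  sets_ls = list(c("cpg_a", "cpg_b"), c("cpg_c")),
  TERMS   = c("region_1", "region_2"),
  setType = "region"
)
names(toy_regions)
```

Restricting `setType` to a fixed set of choices means a misspelled value fails immediately rather than silently producing an oddly named list element.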

lissettegomez commented 5 years ago

I get this error for write_gmt:

```
out_CloseByRegions
Object with Class(es) 'pathwayCollection', 'list' [package 'pathwayPCA'] with 2 elements:
 $ regions:List of 1966
 $ TERMS  : chr [1:1966] "chr10:100993553-100993597" ...

write_gmt(out_CloseByRegions, file = fileName, setType = "regions")
Error in write_gmt(out_CloseByRegions, file = fileName, setType = "regions") :
  Number of sets should match number of TERMS.
```

I think the problem is in this part of the code:

```r
sets_ls <- pathwayCollection[setType]
nPaths <- length(sets_ls)
if(nPaths != length(TERMS_char)){
  stop("Number of sets should match number of TERMS.")
}
```

It seems nPaths will always be 1
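The behaviour described above follows from how base R indexes lists: single brackets return a sub-list (here of length one), while double brackets extract the element itself. Here is a small illustration with a toy list standing in for a `pathwayCollection`. The object is made up for this example, and the fix actually applied in the package is not shown in this thread.

```r
# Toy stand-in for a pathwayCollection with three sets
toy_ls <- list(
  regions = list(c("a", "b"), c("c", "d"), c("e")),
  TERMS   = c("set1", "set2", "set3")
)
setType <- "regions"

length(toy_ls[setType])    # single bracket: a list holding one element, so 1
length(toy_ls[[setType]])  # double bracket: the list of sets itself, so 3
```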

gabrielodom commented 5 years ago

Can you send me the data and script? Also, nPaths should never be 1. It should be 1966.

lissettegomez commented 5 years ago

This is using the example for the write_gmt function:

```
toy_pathwayCollection
Object with Class(es) 'pathwayCollection', 'list' [package 'pathwayPCA'] with 3 elements:
 $ pathways   :List of 3
 $ TERMS      : chr [1:3] "C-or-f_paths" ...
 $ description: chr [1:3] "these are" ...

write_gmt(toy_pathwayCollection, file = "example_pathway.gmt")
Error in write_gmt(toy_pathwayCollection, file = "example_pathway.gmt") :
  Number of sets should match number of TERMS.
```

gabrielodom commented 5 years ago

Thank you. Please try it now. @lizhongliu1996, please add this check to the testing script for write_gmt(). Please close this issue when you are finished.
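A check of this kind could be sketched in base R as follows. `check_set_count()` is a hypothetical stand-in for the length check inside the writer, not the package's code, and the toy collection uses made-up labels. The package's real test script is not shown in this thread.

```r
# Hypothetical stand-in for the length check inside a writer function;
# it uses [[ ]] so that the sets themselves are counted.
check_set_count <- function(collection, setType = "pathways") {
  sets_ls <- collection[[setType]]
  if (length(sets_ls) != length(collection$TERMS)) {
    stop("Number of sets should match number of TERMS.")
  }
  invisible(TRUE)
}

toy_collection <- list(
  pathways = list(c("g1", "g2"), c("g2", "g3"), c("g4")),
  TERMS    = c("path_a", "path_b", "path_c")
)

# A well-formed collection should pass without error
stopifnot(check_set_count(toy_collection))

# A collection with a missing term should be rejected
bad_collection <- toy_collection
bad_collection$TERMS <- bad_collection$TERMS[-1]
res <- tryCatch(check_set_count(bad_collection), error = function(e) "error")
stopifnot(identical(res, "error"))
```

Testing both the passing case and the failing case guards against the check becoming either always-true or always-false.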

lizhongliu1996 commented 5 years ago

Fixed the bug and added a unit test based on @lissettegomez's example; the test passes. Closing the issue.