caravagnalab / rcongas

GNU General Public License v3.0

DE_table inconsistent #4

Closed caravagn closed 3 years ago

caravagn commented 3 years ago

Problem:

  # Load DE results - forward params
  DE_table = get_DE_table(x, chromosomes = chromosomes, ...)

  # Get gene locations
  x$data$gene_locations %>%
    filter(gene %in% DE_table$gene)

DE_table contains many genes, but when I subset the gene locations by them I lose multiple genes. There is some problem with the way genes are stored across the various data structures.

caravagn commented 3 years ago

I mean, all genes in DE_table should have a mapping in x$data$gene_locations no?

Militeee commented 3 years ago

Here the situation is a bit tricky. Basically, you are testing DE on a lot more genes than the ones you use in the CONGAS analysis. The main reason is that not all the genes in the expression matrix map to a valid segment (because your segmentation is either incomplete or too fragmented at a specific position). Other genes may be filtered out in the pre-processing step; however, you would ideally like to include everything in the DE analysis.
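
To see which DE genes have no location, something like the following might help. It is only a minimal sketch on toy data: the two data frames stand in for DE_table and x$data$gene_locations, and the gene names and the p_value and chr columns are made up for illustration, not taken from the package.

```r
# Toy stand-ins for DE_table and x$data$gene_locations.
# Gene names and the p_value / chr columns are hypothetical.
DE_table <- data.frame(
  gene = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  p_value = c(0.01, 0.20, 0.03, 0.50)
)
gene_locations <- data.frame(
  gene = c("GENE_A", "GENE_C"),
  chr = c("chr1", "chr2")
)

# Genes tested for DE that have no entry in the location table
unmapped <- setdiff(DE_table$gene, gene_locations$gene)

# Left join: keeps every DE gene, with NA location for unmapped ones
merged <- merge(DE_table, gene_locations, by = "gene", all.x = TRUE)
merged$mapped <- !is.na(merged$chr)
```

A left join (all.x = TRUE) rather than a filter keeps the genes without a segment in the table, so they can be flagged instead of silently dropped.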