caravagnalab / CNAqc

CNAqc - Copy Number Alteration (CNA) Quality Check package
GNU General Public License v3.0

Select desired karyotypes for phasing mutations #35

Closed nicola-calonaci closed 5 months ago

nicola-calonaci commented 5 months ago

At the moment the only option for selecting karyotypes in the phasing procedure is a cut-off on the minimum number of mutations. It would be useful for the user to additionally be able to select desired karyotypes.

For instance, I want to focus on diploid homozygous segments (karyotype "2:0") and find the multiplicity of the mutations on them.

caravagn commented 5 months ago

Isn't subsetting by karyotype first and then computing a good option? It's a single function call.

nicola-calonaci commented 5 months ago

True, but the function "subset_by_segment_karyotype" removes the "peaks_analysis" from the CNAqc object. Is that intentional?

nicola-calonaci commented 5 months ago

This is my proposal for a very simple solution:

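advanced_phasing = function(x, cutoff_n = 50, karyotypes = NULL)
{
  # Karyotypes with at least cutoff_n mutations
  karyotypes_cutoff = x$n_karyotype[x$n_karyotype >= cutoff_n] %>% names()

  if(is.null(karyotypes)) {
    # No selection given: use every karyotype that passes the cutoff
    karyotypes = karyotypes_cutoff
  } else {
    # Keep only the requested karyotypes that also pass the cutoff
    karyotypes = intersect(karyotypes, karyotypes_cutoff)
  }

  # ... rest of advanced_phasing() unchanged (not shown in this excerpt)
}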

Basically, it intersects the selected karyotypes with those that pass the cutoff on the number of mutations.
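
To make the selection step easier to check in isolation, here is a minimal sketch in base R. The helper name `select_karyotypes` and the toy counts are hypothetical and only for illustration; it assumes `n_karyotype` is a named vector (or table) of mutation counts per karyotype, as the snippet above suggests.

```r
# Hypothetical helper (not part of CNAqc): isolates the karyotype
# selection logic from the proposal so it can be tried without a CNAqc object.
# `n_karyotype` is assumed to be a named vector or table of mutation counts.
select_karyotypes <- function(n_karyotype, cutoff_n = 50, karyotypes = NULL) {
  # Karyotypes with at least cutoff_n mutations
  passing <- names(n_karyotype)[n_karyotype >= cutoff_n]

  # No explicit selection: fall back to everything that passes the cutoff
  if (is.null(karyotypes)) {
    return(passing)
  }

  # Otherwise keep only the requested karyotypes that also pass the cutoff
  intersect(karyotypes, passing)
}

# Toy counts, invented for illustration only
toy_counts <- c("1:1" = 1200, "2:0" = 85, "2:1" = 30, "2:2" = 400)

select_karyotypes(toy_counts)
select_karyotypes(toy_counts, karyotypes = c("2:0", "2:1"))
```

With these toy counts, requesting "2:0" and "2:1" keeps only "2:0", because "2:1" has fewer than 50 mutations. A requested karyotype that fails the cutoff is dropped silently, so a warning might be worth adding if this were adopted.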

caravagn commented 5 months ago

I am sorry, but I am not in favour of this: there is already a very neat way of getting the same result, which makes the change unnecessary.

x %>% 
   subset_by_segment_karyotype("2:0") %>% 
   advanced_phasing()

As for dropping the analysis results, that follows the same logic: it avoids having to manage complicated scenarios.

For example, how would you manage plots, getters, etc.?