caravagnalab / CNAqc

CNAqc - Copy Number Alteration (CNA) Quality Check package
GNU General Public License v3.0

Unable to annotate variants #28

Closed by pbousquets 11 months ago

pbousquets commented 12 months ago

Hello,

I've been using CNAqc for a while with no problems. However, I just tested the function annotate_variants for the first time and found that it crashes before finishing.

There seems to be an issue with reference patches, though I made sure there are no patches in my input. They might come from an internal object generated during the annotation.
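
One way to double-check that an input table contains no alternate or patch contigs is to compare its chromosome names against the canonical set. The following is a minimal Python sketch, not part of CNAqc: it assumes UCSC-style names (chr1 to chr22, chrX, chrY), and the record layout and the helper name non_canonical_contigs are hypothetical.

```python
# Canonical UCSC-style chromosome names (assumption: chrM is deliberately excluded).
CANONICAL = {f"chr{i}" for i in range(1, 23)} | {"chrX", "chrY"}


def non_canonical_contigs(mutations):
    """Return the sorted contig names that are not canonical chromosomes."""
    return sorted({m["chr"] for m in mutations} - CANONICAL)


# Hypothetical toy input, for illustration only.
mutations = [
    {"chr": "chr1", "from": 100, "to": 101},
    {"chr": "chr6_GL000250v2_alt", "from": 50, "to": 51},
    {"chr": "chrX", "from": 7, "to": 8},
]

offending = non_canonical_contigs(mutations)
print(offending)
```

An empty result would suggest that any alt contigs named in the warnings come from the annotation databases rather than from the input itself.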

This issue was encountered while analyzing two WGS (60x) tumor-normal pairs, and it occurs with both of them.

✔ Preparing mutations ... done
'select()' returned many:1 mapping between keys and columns
'select()' returned many:1 mapping between keys and columns
'select()' returned 1:1 mapping between keys and columns
'select()' returned many:1 mapping between keys and columns
'select()' returned many:1 mapping between keys and columns
'select()' returned many:1 mapping between keys and columns
✔ Locating variants with VariantAnnotation ... done
✔ Traslating Entrez ids ... done
✔ Transforming data ... done

── Coding substitutions found 
✔ Predicting coding ... done
✔ Drivers annotation ... done
`summarise()` has grouped output by 'chr', 'from'. You can override using the `.groups` argument.
Warning messages:
1: In valid.GenomicRanges.seqinfo(x, suggest.trim = TRUE) :
  GRanges object contains 953 out-of-bound ranges located on sequences 21230, 21233, 11270, 23205, 35581, 35582, 35583, 35586, 35587, 27016, 27020,
  37496, 37501, 37502, 37511, 37930, 39656, 43370, 43382, 43383, 43384, 43385, 47481, 47482, 48633, 61078, 74576, 81614, 81615, 82016, 90988, 90989,
  97803, 92468, 98922, 98923, 98924, 98926, 93172, 93173, 99190, 99191, 99192, 99193, 96033, 96034, 96039, 96040, 96051, 96053, 96054, 96055, 96059,
  102446, 102447, 102450, 102451, 102455, 102457, 102458, 102459, 102460, 102463, 102464, 102465, 102997, 102998, 103174, 97597, 97601, 103907, 103908,
  109272, 116018, 116019, 116020, 116023, 116026, 116028, 134306, 134307, 150288, 152494, 159655, 163110, 163111, 163112, 163113, 163114, 163116, 163117,
  164992, 170986, 170987, 170988, 175497, 183016, 184164, 184178, 184787, 184788, 184789, 184790, 184792, 184796, 184800, 184801, 184802, 184803, 184808,
  184816, 184824, 184825, 184826, 191063, 192095, 192096, 205883, 210788, 217649, 221435, 230411 [... truncated]
2: In valid.GenomicRanges.seqinfo(x, suggest.trim = TRUE) :
  GRanges object contains 222 out-of-bound ranges located on sequences chr1_GL383518v1_alt, chr1_KI270762v1_alt, chr2_GL383522v1_alt,
  chr2_KI270774v1_alt, chr3_KI270777v1_alt, chr3_KI270781v1_alt, chr4_GL000257v2_alt, chr4_KI270788v1_alt, chr5_GL339449v2_alt, chr5_KI270795v1_alt,
  chr5_KI270898v1_alt, chr6_GL000250v2_alt, chr6_GL000254v2_alt, chr6_KI270797v1_alt, chr6_KI270798v1_alt, chr6_KI270801v1_alt, chr7_GL383534v2_alt,
  chr7_KI270803v1_alt, chr7_KI270806v1_alt, chr7_KI270809v1_alt, chr8_KI270815v1_alt, chr9_GL383540v1_alt, chr9_GL383541v1_alt, chr9_GL383542v1_alt,
  chr9_KI270823v1_alt, chr10_GL383546v1_alt, chr11_KI270831v1_alt, chr11_KI270902v1_alt, chr12_GL383551v1_alt, chr12_GL383553v2_alt,
  chr12_KI270834v1_alt, chr13_KI270838v1_alt, chr14_KI270847v1_alt, chr15_KI270848v1_alt, chr15_KI270850v1_alt, chr15_KI270851v1_alt,
  chr15_KI270906v1_alt, chr16_GL383556v1_alt, chr16_GL383557v1_alt, chr16_KI270854v1_alt, chr17_JH159146v1_alt, chr17_JH159147v1_alt,
  chr17_KI270857v1_a [... truncated]
3: In UseMethod("depth") :
  no applicable method for 'depth' applied to an object of class "NULL"
4: In valid.GenomicRanges.seqinfo(x, suggest.trim = TRUE) :
  GRanges object contains 945 out-of-bound ranges located on sequences 21230, 21233, 11270, 23205, 35581, 35582, 35583, 35586, 35587, 27016, 27020,
  37496, 37501, 37502, 37511, 37930, 39656, 43370, 43382, 43383, 43384, 43385, 47481, 47482, 48633, 61078, 74576, 81614, 81615, 82016, 90988, 90989,
  97803, 92468, 98922, 98923, 98924, 98926, 93172, 93173, 99190, 99191, 99192, 99193, 96033, 96034, 96039, 96040, 96051, 96053, 96054, 96055, 96059,
  102446, 102447, 102450, 102451, 102455, 102457, 102458, 102459, 102460, 102463, 102464, 102465, 102997, 102998, 103174, 97597, 97601, 103907, 103908,
  109272, 116018, 116019, 116020, 116023, 116026, 116028, 134306, 134307, 150288, 152494, 159655, 163110, 163111, 163112, 163113, 163114, 163116, 163117,
  164992, 170986, 170987, 170988, 175497, 183016, 184164, 184178, 184787, 184788, 184789, 184790, 184792, 184796, 184800, 184801, 184802, 184803, 184808,
  184816, 184824, 184825, 184826, 191063, 192095, 192096, 205883, 210788, 217649, 221435, and 23 [... truncated]
5: In dplyr::left_join(loc_df, output_coding, by = c("chr", "from",  :
  Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 1982 of `x` matches multiple rows in `y`.
ℹ Row 3 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship = "many-to-many"` to silence this warning.

caravagn commented 11 months ago

Thanks, tagging @Militeee, who developed this bit of code.

Militeee commented 11 months ago

Hi @pbousquets ,

I had a look at the function and there was indeed a bug that resulted in an empty join. We changed the representation of SNVs: previously to - from = 1 (a length-1 interval), now to - from = 0 (a length-0 interval). I pushed a fix. I don't know if it also solves your problem, but it is worth reinstalling the package and rerunning it.
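
To illustrate how a change in interval convention like this can produce an empty join, here is a minimal Python sketch. It is not the CNAqc code: it assumes, hypothetically, that the join key includes both from and to, and the helpers with_length and inner_join are invented for illustration.

```python
def with_length(snvs, length):
    """Encode each SNV as (chr, from, to) with to - from == length."""
    return [(chrom, pos, pos + length) for chrom, pos in snvs]


def inner_join(left, right):
    """Keep the rows of `left` whose full (chr, from, to) key also occurs in `right`."""
    right_keys = set(right)
    return [row for row in left if row in right_keys]


snvs = [("chr1", 100), ("chr2", 250)]  # hypothetical positions

mutations_len0 = with_length(snvs, 0)    # to - from = 0
annotations_len1 = with_length(snvs, 1)  # to - from = 1

# Same variants, different conventions: no key matches, so the join is empty.
mismatched = inner_join(mutations_len0, annotations_len1)

# Same variants, same convention: every row matches.
matched = inner_join(mutations_len0, with_length(snvs, 0))

print(len(mismatched), len(matched))
```

Under these assumptions, two tables that describe the same SNVs share no keys unless both use the same to - from convention.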

In case you still have trouble with the function, it would be extremely useful if you could provide an example dataset to reproduce the error; I'll try to fix it ASAP.

One last thing: driver annotation is generally hard. This function highlights coding mutations (mainly non-synonymous, stop-gain and frameshift) in cancer genes, but you will need more sophisticated approaches to actually call putative drivers among them.

Cheers, S.

pbousquets commented 11 months ago

Hi @Militeee,

Thank you very much for having a look at it. I'll give it a try and let you know ASAP whether it works. If the new version doesn't work, I'll send you a dataset to reproduce the problem. Thank you very much!

Pablo

pbousquets commented 11 months ago

Hi again, @Militeee ,

I just tested the bugfix and it worked perfectly. Thank you very much for your quick help!

Cheers, P.