Error in dplyr::semi_join() when only numeric chromosomes in input #250

Apologies for opening yet another issue, but I've encountered a small error with pcgr. I believe this is related to our test dataset, which is limited to chromosome 19, and the way that R reads in numeric columns. When using the --vcf2maf flag in the command, the following error occurs:

Error in `dplyr::semi_join()`:
! Can't join `x$Chromosome` with `y$Chromosome` due to incompatible types.
ℹ `x$Chromosome` is a <double>.
ℹ `y$Chromosome` is a <character>.
Execution halted

I think this might be happening because the Chromosome column in this case only contains "19", which is leading R to read this column in as numeric, instead of as a character. I realize this is a bit of a niche issue, but others may come across it if their datasets don't contain data on chr X and Y.
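To illustrate what I suspect is going on, here is a minimal sketch in base R. It is not pcgr's actual code: the two-row table, the `ref` table, and all variable names are made up for illustration, and I'm assuming the MAF is read with default type inference. Base R reports `integer` here, whereas the error above reports `<double>`, so pcgr presumably uses a different reader, but the mechanism would be the same.

```r
# Hypothetical two-row table limited to chromosome 19 (made up for illustration).
maf_text <- "Chromosome\tStart_Position\n19\t1000\n19\t2000\n"

# Default type inference: a column holding only "19" is parsed as a number.
inferred <- read.delim(text = maf_text)
inferred_class <- class(inferred$Chromosome)

# Declaring the column type up front keeps the values as character.
declared <- read.delim(text = maf_text,
                       colClasses = c(Chromosome = "character"))
declared_class <- class(declared$Chromosome)

# Alternatively, coerce after reading and before any join.
coerced <- inferred
coerced$Chromosome <- as.character(coerced$Chromosome)

# Hypothetical reference table whose Chromosome column is character.
ref <- data.frame(Chromosome = c("19", "X", "Y"), stringsAsFactors = FALSE)

# Base-R equivalent of a semi-join: keep rows of `coerced` with a match in `ref`.
kept <- coerced[coerced$Chromosome %in% ref$Chromosome, , drop = FALSE]

print(inferred_class)
print(declared_class)
print(nrow(kept))
```

Either declaring the column as character when reading or coercing it with `as.character()` before the join should give both sides the same type, so a join on `Chromosome` no longer depends on which chromosomes happen to be in the input.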

I'm using pcgr v2.0.3 with the following command:

pcgr --force_overwrite --vep_buffer_size 500 --vep_no_intergenic --vcf2maf --tumor_site 0 --assay WES --tumor_only --tumor_dp_tag TDP --tumor_af_tag TVAF --tumor_dp_min 10 --tumor_af_min 0.03 --exclude_likely_hom_germline --exclude_likely_het_germline --exclude_dbsnp_nonsomatic --exclude_nonexonic --input_vcf alignment/HCC1395/HCC1395.hc.vt.annot.flt.vcf.gz --refdata_dir $PCGR_DATA --vep_dir $PCGR_VEP_CACHE --output_dir alignment/HCC1395/pcgr --genome_assembly grch38 --sample_id HCC1395 --debug

The same command runs fine with the example data, and it also finishes successfully with our test dataset if I omit the --vcf2maf flag. I'm attaching the input file used here: HCC1395.hc.vt.annot.flt.vcf.gz

Best, Mareike

Thanks a lot, Mareike! Never be sorry for filing issues, these are very useful. I think you identified what needs to be fixed; I'll just try to reproduce it first and then make a patch for it. Bugs like this clearly hint at the need for more robust testing procedures before a release. ;-) Anyway, I'll make a fix, probably next week, as I'm also working on some other upgrades (CNA plot, germline integration, etc.).

Thanks again!

best, Sigve

Hi Mareike,

I've re-run your sample with the upcoming version (2.1.0, also with updated reference data):

Results available here:

Sorry for the delay :)

best, Sigve

Hi Sigve,

Thanks for the update and for all of the updates in the upcoming version! We're looking forward to testing it out.

Best, Mareike