sigven / pcgr

Personal Cancer Genome Reporter (PCGR)
https://sigven.github.io/pcgr
MIT License

pcgr.R error: `'by' can't contain join column 'GENOMIC_CHANGE'` #33

Closed vladsavelyev closed 6 years ago

vladsavelyev commented 6 years ago

Hi Sigve,

Thanks a lot for pushing the new release with all the fixes!

I'm running into another one unfortunately. pcgr.R fails with the following error:

pcgr.R E145__PRJ180035_E145-T01-D-somatic.pcgr_acmg.pass.tsv.gz None E145__PRJ180035_E145-T01-D-somatic pcgr_configuration_somatic.toml 0.6.2 grch37
...
2018-05-10 13:41:47 [INFO] Generating data for tiered cancer genome report - somatic calls tier model pcgr_acmg'
2018-05-10 13:41:47 [INFO] Number of protein-coding variants: 13
2018-05-10 13:41:48 [INFO] Looking up SNV/InDel biomarkers for precision oncology - any tumortype
2018-05-10 13:41:48 [INFO] 6 clinical evidence item(s) found .. (1 unique variant(s)), mapping = exact
2018-05-10 13:41:48 [INFO] Underlying variant(s):
2018-05-10 13:41:48 [INFO] VHL missense_variant missense_variant:ENST00000256474.2:c.482G>A:exon3:p.R161Q 3:g.10191489G>A
2018-05-10 13:41:48 [INFO] 0 clinical evidence item(s) found .. mapping = codon
2018-05-10 13:41:48 [INFO] 0 clinical evidence item(s) found .. mapping = exon
Error: `by` can't contain join column `GENOMIC_CHANGE` which is missing from RHS
Execution halted

Example VCF and TOML are attached below: test_saveliev.tar.gz

In fact, I only get this error when running pcgr.R outside of Docker. The dockerized run fails with "cannot allocate memory" for some reason, regardless of how much memory I allocate:

docker run -m 4G --memory-swap 4G --rm -t -u root -v=/Users/vsaveliev/git/pcgr:/data -v=/Users/vsaveliev/git/pcgr/data/grch37/.vep:/usr/local/share/vep/data -v=/Users/vsaveliev/git/umccr/umccrise_test_data/results/bcbio_test_project/cup__cup_tissue/pcgr/input:/workdir/input_vcf -v=/Users/vsaveliev/git/umccr/umccrise/umccrise/pcgr:/workdir/input_conf -v=/Users/vsaveliev/git/umccr/umccrise_test_data/results/bcbio_test_project/cup__cup_tissue/pcgr:/workdir/output -w=/workdir/output sigven/pcgr:0.6.2 sh -c "/pcgr.R /workdir/output /workdir/output/cup__cup_tissue-normal.pcgr_acmg.grch37.pass.tsv.gz None cup__cup_tissue-normal /workdir/input_conf/pcgr_configuration_normal.toml 0.6.2 grch37 /data/"
...
2018-05-10 03:24:54 [INFO] Assigning elements to PCGR value boxes
Error in system(paste(which, shQuote(names[i])), intern = TRUE, ignore.stderr = TRUE) :
  cannot popen '/usr/bin/which 'pandoc' 2>/dev/null', probable reason 'Cannot allocate memory'
Calls: <Anonymous> ... pandoc_available -> find_pandoc -> find_program -> Sys.which
Execution halted

So the error might in fact originate from an unexpected dependency version on my local machine versus the docker image, in which case it is just my fault. I will explore this a bit more, but if you can try out this example and share your thoughts, that would be great!

Vlad

vladsavelyev commented 6 years ago

Figured out the "cannot allocate memory" issue :) I should have re-created the docker machine with a higher memory limit: docker-machine create default --driver virtualbox --virtualbox-memory 4096

vladsavelyev commented 6 years ago

I guess I figured out the difference between the local and the dockerized environments: the code for the MSI calculation might be different, because for another sample the dockerized instance reports an indel fraction of 0.057:

2018-05-10 08:15:47 [INFO] Predicting microsatellite instability status
2018-05-10 08:15:47 [INFO] n = 43 coding variants used for MSI prediction
2018-05-10 08:15:53 [INFO] Predicted MSI status: MSS (Microsatellite stable)
2018-05-10 08:15:53 [INFO] MSI - Indel fraction: 0.057

Whereas the local one reports 0.165:

2018-05-10 18:27:42 [INFO] Predicting microsatellite instability status
2018-05-10 18:27:50 [INFO] Predicted MSI status: MSI.H (Microsatellite instability - high)
2018-05-10 18:27:50 [INFO] MSI - Indel fraction: 0.165
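
To compare the two environments more systematically, the indel fraction can be pulled out of each saved log and checked for agreement. This is only a minimal sketch: it assumes the pcgr.R output was saved to files and that each log contains a line ending in "MSI - Indel fraction: <value>" as shown above. The function names and log file names are hypothetical and not part of PCGR.

```shell
# Hypothetical helper (not part of PCGR): print the indel fraction found in a
# saved pcgr.R log. Assumes a line ending in "MSI - Indel fraction: <value>";
# if there are several, the last one is used.
indel_fraction() {
    grep 'MSI - Indel fraction:' "$1" | tail -n 1 | awk '{print $NF}'
}

# Hypothetical helper: report whether two logs agree on the indel fraction.
# Returns 0 when both values are present and identical, 1 otherwise.
compare_indel_fraction() {
    a=$(indel_fraction "$1")
    b=$(indel_fraction "$2")
    if [ -n "$a" ] && [ "$a" = "$b" ]; then
        echo "same: $a"
    else
        echo "differ: ${a:-missing} vs ${b:-missing}"
        return 1
    fi
}
```

With the two logs above saved as docker.log and local.log (assumed names), `compare_indel_fraction docker.log local.log` prints `differ: 0.057 vs 0.165` and returns a non-zero status, which makes it easy to re-check after changing the local setup.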

vladsavelyev commented 6 years ago

I'm not sure what I did, but now it seems to produce identical results and no longer fails with the original error. All good!