cancerit / cgpBattenberg

Battenberg algorithm and associated implementation script
GNU Affero General Public License v3.0

Option 'process' is not an expected process type: output . at /opt/wtsi-cgp/lib/perl5/PCAP/Cli.pm line 90. #123

Open jamesdalg opened 2 years ago

jamesdalg commented 2 years ago

I'm having issues running cgpBattenberg and the error is not informative:

bash-4.2$ module load cgpBattenberg
[-] Unloading cgpBattenberg  3.3.1  on cn0940
[+] Loading cgpBattenberg  3.5.3  on cn0940
[-] Unloading singularity  3.8.5  on cn0940
[+] Loading singularity  3.8.5  on cn0940

The following have been reloaded with a version change:
  1) cgpBattenberg/3.3.1 => cgpBattenberg/3.5.3

bash-4.2$ battenberg.pl -p output \
>     -r /data/CCRBioinfo/dalgleishjl/sv_mapping/hg38_ref/hg38.fa.fai \
>     -tb /data/CCRBioinfo/projects/TargetOsteo_WGS/bam_hg38/PAUTWB_T.bam \
>     -nb /data/CCRBioinfo/projects/TargetOsteo_WGS/bam_hg38/PAUTWB_N.bam \
>     -ge XY \
>     -impute-info /fdb/cancerit-wgs/cgpBattenberg/impute/impute_info.txt \
>     -thousand-genomes-loc /fdb/cancerit-wgs/cgpBattenberg/1000genomesloci \
>     -ignore-contigs-file /data/CCRBioinfo/dalgleishjl/sv_mapping/reconstruction/battenberg/GRCh38.exclude.contigs.txt \
>     -gc-correction-loc /fdb/cancerit-wgs/cgpBattenberg/battenberg_wgs_gc_correction_1000g_v3 \
>     -species Human -assembly 38 \
>     -t $SLURM_CPUS_PER_TASK \
>     -outdir /data/CCRBioinfo/dalgleishjl/sv_mapping/reconstruction/battenberg/PAUTWB/
Option 'process' is not an expected process type: output
. at /opt/wtsi-cgp/lib/perl5/PCAP/Cli.pm line 90.
bash-4.2$

Looking at the PCAP/Cli.pm code at line 90 (https://github.com/cancerit/PCAP-core/blob/develop/lib/PCAP/Cli.pm), as specified, it only says that it's failing because the value is not a valid process.

I've also checked the PCAP GitHub repository and there isn't an issue that resembles this on their end. This error doesn't point to any section of your code, but if someone with experience with the code can suggest where to start debugging, that would be wonderful and would enable me to move forward with subclonal reconstruction.

Thanks, James Dalgleish

keiranmraine commented 2 years ago

The error indicates that you have given the CLI option process an invalid value. In your command you use the abbreviated form -p:

battenberg.pl -p output

The valid values are provided in the command line extended help (battenberg.pl -m):

https://github.com/cancerit/cgpBattenberg/blob/ca97ccfd5cbddf1a7826e3d80e14c442bc5b6c46/perl/bin/battenberg.pl#L656-L675

jamesdalg commented 2 years ago

Yes, I noticed that too. An error in my local cluster documentation ended up in that command. I fixed it, but I still get an error related to the Battenberg (capital B) package:

bash-4.2$ battenberg.pl \
-r /data/CCRBioinfo/dalgleishjl/sv_mapping/hg38_ref/hg38.fa.fai \
-tb /data/CCRBioinfo/projects/TargetOsteo_WGS/bam_hg38/PAUTWB_T.bam \
-nb /data/CCRBioinfo/projects/TargetOsteo_WGS/bam_hg38/PAUTWB_N.bam \
-ge XY \
-impute-info /fdb/cancerit-wgs/cgpBattenberg/impute/impute_info.txt \
-thousand-genomes-loc /fdb/cancerit-wgs/cgpBattenberg/1000genomesloci \
-ignore-contigs-file /data/CCRBioinfo/dalgleishjl/sv_mapping/reconstruction/battenberg/GRCh38.exclude.contigs.txt \
-gc-correction-loc /fdb/cancerit-wgs/cgpBattenberg/battenberg_wgs_gc_correction_1000g_v3/ \
-species Human -assembly 38 \
--outdir /data/CCRBioinfo/dalgleishjl/sv_mapping/reconstruction/battenberg/PAUTWB/ \
-t $SLURM_CPUS_PER_TASK

Skipping Sanger_CGP_Battenberg_Implement_battenberg_splitlocifiles.0 as previously successful
Skipping Sanger_CGP_Battenberg_Implement_battenberg_allelecount.1 as previously successful
General output can be found in this file: /data/CCRBioinfo/dalgleishjl/sv_mapping/reconstruction/battenberg/PAUTWB/tmpBattenberg/logs/Sanger_CGP_Battenberg_Implement_battenberg_runbaflog.0.out
Errors can be found in this file: /data/CCRBioinfo/dalgleishjl/sv_mapping/reconstruction/battenberg/PAUTWB/tmpBattenberg/logs/Sanger_CGP_Battenberg_Implement_battenberg_runbaflog.0.err
....... (several hundred more lines that look just like the above)

Wrapper script message:
"/usr/bin/time bash /data/CCRBioinfo/dalgleishjl/sv_mapping/reconstruction/battenberg/PAUTWB/tmpBattenberg/logs/Sanger_CGP_Battenberg_Implement_battenberg_runbaflog.0.sh 1> /data/CCRBioinfo/dalgleishjl/sv_mapping/reconstruction/battenberg/PAUTWB/tmpBattenberg/logs/Sanger_CGP_Battenberg_Implement_battenberg_runbaflog.0.out 2> /data/CCRBioinfo/dalgleishjl/sv_mapping/reconstruction/battenberg/PAUTWB/tmpBattenberg/logs/Sanger_CGP_Battenberg_Implement_battenberg_runbaflog.0.err" unexpectedly returned exit value 1 at (eval 37) line 12.
 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 235
bash-4.2$ cat /data/CCRBioinfo/dalgleishjl/sv_mapping/reconstruction/battenberg/PAUTWB/tmpBattenberg/logs/Sanger_CGP_Battenberg_Implement_battenberg_runbaflog.0.err

R version 4.1.0 (2021-05-18) -- "Camp Pontanezen"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

library(Battenberg)
q()
Save workspace image? [y/n/c]: n
bash-4.2$

keiranmraine commented 2 years ago

You need to check that the reference files you are pointing at are the GRCh38 ones:

/fdb/cancerit-wgs/cgpBattenberg/impute/impute_info.txt
/fdb/cancerit-wgs/cgpBattenberg/1000genomesloci

It looks like you are using the pre-built Docker image via Singularity, so the reference files seem the most likely culprit, as they don't have a chr prefix:

chr_names=as.vector(c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","X")),
g1000file.prefix="/fdb/cancerit-wgs/cgpBattenberg/1000genomesloci/1000genomesAlleles2012_chr",
minCounts=10,

Although the g1000file.prefix indicates 1000genomesAlleles2012_chr, I see that chr appears again before the chromosome name in our local copies, e.g.

1000genomesAlleles2012_chrchr10.txt

All of the above points to the GRCh37 Battenberg references being used with a GRCh38 genome.fa.fai.
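A quick way to check for this kind of mismatch is to look at how the contigs are named in the .fai index and compare that with the unprefixed chr_names ("1", "2", ...) shown above. The following is a minimal sketch, not part of cgpBattenberg: the helper name naming_style and the demo file are made up for illustration, and the offsets in the demo index are placeholders. It relies only on the first tab-separated column of a .fai file being the sequence name.

```shell
#!/usr/bin/env bash
# Report whether the sequence names in a .fai index use a "chr" prefix.
# naming_style is a hypothetical helper, not a cgpBattenberg command.
naming_style() {
  local fai="$1"
  # Column 1 of a .fai file is the sequence name.
  if cut -f1 "$fai" | grep -q '^chr'; then
    echo "chr-prefixed"
  else
    echo "no-prefix"
  fi
}

# Demo on a tiny made-up index (lengths match the GRCh38 header in this
# thread; the offset and line-width columns are placeholder values).
demo_fai="$(mktemp)"
printf 'chr1\t248956422\t6\t60\t61\nchr2\t242193529\t253105708\t60\t61\n' > "$demo_fai"
naming_style "$demo_fai"
```

On a real system the argument would be the index passed to -r, for example naming_style /data/CCRBioinfo/dalgleishjl/sv_mapping/hg38_ref/hg38.fa.fai. A "chr-prefixed" result alongside unprefixed chr_names would be consistent with the mismatch described above.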

jamesdalg commented 2 years ago

I looked into the BAM files and they have chr prefixes. Should I therefore strip them from the BAM files and reindex? Or are you saying I should just use different reference files?

Here's the samtools view output of the header:

@HD     VN:1.4  SO:coordinate
@SQ     SN:chr1 LN:248956422
@SQ     SN:chr2 LN:242193529
@SQ     SN:chr3 LN:198295559
@SQ     SN:chr4 LN:190214555
@SQ     SN:chr5 LN:181538259
@SQ     SN:chr6 LN:170805979
@SQ     SN:chr7 LN:159345973
@SQ     SN:chr8 LN:145138636
@SQ     SN:chr9 LN:138394717
@SQ     SN:chr10        LN:133797422
@SQ     SN:chr11        LN:135086622
@SQ     SN:chr12        LN:133275309
@SQ     SN:chr13        LN:114364328
@SQ     SN:chr14        LN:107043718
@SQ     SN:chr15        LN:101991189
@SQ     SN:chr16        LN:90338345
@SQ     SN:chr17        LN:83257441
@SQ     SN:chr18        LN:80373285
@SQ     SN:chr19        LN:58617616
@SQ     SN:chr20        LN:64444167
@SQ     SN:chr21        LN:46709983
@SQ     SN:chr22        LN:50818468
@SQ     SN:chrX LN:156040895
@SQ     SN:chrY LN:57227415
@SQ     SN:chrM LN:16569
@SQ     SN:chr1_KI270706v1_random       LN:175055
@SQ     SN:chr1_KI270707v1_random       LN:32032
@SQ     SN:chr1_KI270708v1_random       LN:127682
@SQ     SN:chr1_KI270709v1_random       LN:66860
@SQ     SN:chr1_KI270710v1_random       LN:40176
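The same check can be scripted rather than read off by eye. This is a rough sketch that assumes the header text is piped in on stdin (for instance from samtools view -H); sq_names is a hypothetical helper name, and the demo input is just three lines copied from the header above, written with the tab separators that SAM uses.

```shell
#!/usr/bin/env bash
# Print the SN (sequence name) value of every @SQ line read from stdin.
# sq_names is a hypothetical helper used only for this illustration.
sq_names() {
  awk -F'\t' '$1 == "@SQ" {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SN:/) print substr($i, 4)   # drop the leading "SN:"
  }'
}

# Demo on header lines copied from the samtools output in this comment.
printf '@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422\n@SQ\tSN:chrX\tLN:156040895\n' | sq_names
```

With a real file the call would look like samtools view -H PAUTWB_T.bam | sq_names, and the resulting list can be compared directly against the chromosome names the reference set expects.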

keiranmraine commented 2 years ago

Okay, that possibly suggests the wrong reference set. I can't tell from the command you've provided, as the paths just show as:

/fdb/cancerit-wgs/cgpBattenberg/impute/impute_info.txt

Where did you get these from?