Closed pwaltman closed 5 years ago
checking the code of the jba rscript that's in the repo, it creates an rds file called cmd.args.rds that stores a list with the 2 elements "junctions" and 'coverage', but it doesn't use those in the call to Jabba, i.e. here's the relelvant code snippet from the jba rscript, below. Shouldn't it be passing opt$coverage
instead of opt$abu
and opt$junctions
instead of junctions$ra
?
...
jmessage('Located junction file ', opt$junctions)
jmessage('Located coverage file ', opt$coverage)
jmessage('Loading packages ...')
writeLines(paste(paste("--", names(opt), " ", sapply(opt, function(x) paste(x, collapse = ",")), sep = "", collapse = " "), sep = ""), paste(opt$outdir, "cmd.args", sep = "/"))
saveRDS(opt, paste(opt$outdir, "cmd.args.rds", sep = "/"))
jab = JaBbA(coverage = opt$abu,
blacklist = opt$blacklist,
junctions = opt$ra,
juncs.uf = opt$ra.supp,
seg = opt$seg,
cfield = opt$cfield,
tfield = opt$tfield,
nseg = opt$nseg,
hets = opt$hets,
outdir = opt$outdir,
subsample = opt$subsample,
name = opt$name,
field = opt$field,
slack = opt$slack,
loose.penalty.mode = ifelse(opt$boolean, "boolean", "linear"),
strict = as.logical(opt$strict),
tilim = opt$tilim,
ploidy = opt$ploidy,
purity = opt$purity,
pp.method = opt$ppmethod,
verbose = 2,
mc.cores = opt$cores,
reiterate = opt$iterate,
rescue.window = opt$window,
edgenudge = opt$edgenudge,
nudge.balanced = opt$nudgebalanced,
indel = opt$indel,
all.in = opt$allin,
use.gurobi = opt$gurobi,
epgap = opt$epgap,
max.na = opt$maxna)
In an attempt to fix the issue, I modified the jba rscript so that it set opt$abu and opt$ra to opt$coverage and opt$junctions, respectively, but now, I'm getting a different error - that the junctions file doesn't contain any junctions (see below). However, I checked, and the test junctions file definitely contains breakpoints, so I'm at a bit of a loss at this point.
waltman@waltman-workstation:~/modules/JaBbA_simplified$ JaBbA/jba JaBbA/inst/extdata/junctions.vcf JaBbA/inst/extdata/coverage.txt
_____ ___ _ _____
(___ ) ( _`\ ( ) ( _ )
| | _ _ | (_) )| |_ | (_) |
_ | | /'_` )| _ <'| '_`\ | _ |
( )_| |( (_| || (_) )| |_) )| | | |
`\___/'`\__,_)(____/'(_,__/'(_) (_)
(Junction Balance Analysis)
JaBbA 2019-01-20 11:20:30: Located junction file JaBbA/inst/extdata/junctions.vcf
JaBbA 2019-01-20 11:20:30: Located coverage file JaBbA/inst/extdata/coverage.txt
JaBbA 2019-01-20 11:20:30: Loading packages ...
JaBbA 2019-01-20 11:20:31: Starting analysis in /ifs/scratch/c2b2/ac_lab/pw2470/devel/mskilab/flows/modules/JaBbA_simplified
JaBbA 2019-01-20 11:20:31: Read in 0 total junctions
JaBbA 2019-01-20 11:20:31: Empty junction input. Will do just one round of optimization.
JaBbA 2019-01-20 11:20:31: Importing seg from UCSC format
Error: scanVcf: 'indexname' must be character(1)
path: JaBbA/inst/extdata/junctions.vcf
In addition: There were 16 warnings (use warnings() to see them)
Execution halted
my fix:
opt = NULL;
if (!is.null(args)) {
opt = args$options
opt$junctions = args$args[1]
opt$coverage = args$args[2]
if (!file.exists(opt$junctions))
{
message('Did not find junction file ', opt$junctions)
print_help(parseobj); stop()
}
if (!file.exists(opt$coverage))
{
message('Did not find coverage file ', opt$coverage)
print_help(parseobj); stop()
}
if ( is.null( opt$abu ) ) opt$abu <- opt$coverage
if ( is.null( opt$ra ) ) opt$ra <- opt$junctions
} else {
stop("See usage.")
}
UPDATE: after reading the code, I realized that I should set the DEFAULT_GENOME & DEFAULT_BSGENOME environment variables to a seq_lengths file, but am now getting the following error when I use the provided test data:
waltman@waltman-workstation:~/modules/JaBbA_simplified$ JaBbA/jba JaBbA/inst/extdata/junctions.vcf JaBbA/inst/extdata/coverage.txt
_____ ___ _ _____
(___ ) ( _`\ ( ) ( _ )
| | _ _ | (_) )| |_ | (_) |
_ | | /'_` )| _ <'| '_`\ | _ |
( )_| |( (_| || (_) )| |_) )| | | |
`\___/'`\__,_)(____/'(_,__/'(_) (_)
(Junction Balance Analysis)
JaBbA 2019-01-20 12:21:21: Located junction file JaBbA/inst/extdata/junctions.vcf
JaBbA 2019-01-20 12:21:21: Located coverage file JaBbA/inst/extdata/coverage.txt
JaBbA 2019-01-20 12:21:21: genome env var /ifs/scratch/c2b2/ac_lab/shares/resources/Homo_sapiens/GATK/b37/human_g1k_v37_decoy.seq_lengths
JaBbA 2019-01-20 12:21:21: bsgenome env var /ifs/scratch/c2b2/ac_lab/shares/resources/Homo_sapiens/GATK/b37/human_g1k_v37_decoy.seq_lengths
JaBbA 2019-01-20 12:21:21: Loading packages ...
JaBbA 2019-01-20 12:21:22: Starting analysis in /ifs/scratch/c2b2/ac_lab/pw2470/devel/mskilab/flows/modules/JaBbA_simplified
JaBbA 2019-01-20 12:21:24: ALT field format like BND
JaBbA 2019-01-20 12:21:24: Read in 83 total junctions
JaBbA 2019-01-20 12:21:24: Read in coverage data across 26,625 bins and 2 chromosomes
JaBbA 2019-01-20 12:21:24: No segmentation provided, so performing segmentation using CBS
JaBbA 2019-01-20 12:21:27: Segmentation finished
JaBbA 2019-01-20 12:21:28: 240 segments produced
JaBbA 2019-01-20 12:21:28: Found tier field enforcing >=1 CN at 0 junctions
JaBbA 2019-01-20 12:21:28: Removing 0 tier 3 junctions
JaBbA 2019-01-20 12:21:28: Did not find nseg file! Ignore.
JaBbA 2019-01-20 12:21:29: Definining coverage good quality nodes as 90% bases covered by non-NA and non-Inf values in +/-100KB region
JaBbA 2019-01-20 12:21:29: Hard setting 8.61001 Mb of the genome to NA that didn't pass our quality threshold
JaBbA 2019-01-20 12:21:29: Using loess to fit mean to variance relationship in segments with greater than 20 bins
JaBbA 2019-01-20 12:21:29: Using ppgrid to estimate purity ploidy
JaBbA 2019-01-20 12:21:29: setting up ppgrid matrices ..
.....................................................................................................
.....................................................................................................
JaBbA 2019-01-20 12:21:31: Built gGraph with 592 nodes, 570 edges, purity 0.95, and ploidy 3.82
JaBbA 2019-01-20 12:21:34: Brand new function for reciprocal junctions calling.
.
JaBbA 2019-01-20 12:21:34: Excluding 0 aberrant junctions whose both breakpoints are in NA coverage regions
JaBbA 2019-01-20 12:21:34: Cancel nudge for 12 aberrant junctions where one of the 2 breakpoint is in NA coverage regions
JaBbA 2019-01-20 12:21:34: In sum, we are forcing 0 junctions, excluding 0 junctions, and nudging 0 junctions
JaBbA 2019-01-20 12:21:35: Creating ILOG CPLEX PARAMETER FILE in /ifs/scratch/c2b2/ac_lab/pw2470/devel/mskilab/flows/modules/JaBbA_simplified/jabba.raw.rds.prm
JaBbA 2019-01-20 12:21:36: Adjusting the kag (naive solution) as mipstart (initial solution).
JaBbA 2019-01-20 12:21:36: Applying mipstarts from previous jabba solution
JaBbA 2019-01-20 12:21:36: Fixing 106 nodes that are unmovable by slack
JaBbA 2019-01-20 12:21:36: Partitioned graph into 78 connected components with the size of the highest 10 components being:
104,50,46,38,32,26,14,14,14,10
JaBbA 2019-01-20 12:21:37: Junction balancing subgraph 1 of 78 which has 102 nodes comprising 9.77 MB and 1 chromosomes, including chrs 19
JaBbA 2019-01-20 12:21:37: Starting initial run for subgraph 1, aiming epgap at 1e-04, within time limit of 360
JaBbA 2019-01-20 12:21:37: Applying L0 slack penalty
JaBbA 2019-01-20 12:21:37: Total mass on cn portion of objective function: 29639.027491637. Total mass on edge slack: 102000
JaBbA 2019-01-20 12:21:37: Running CPLEX with relative optimality gap threshold 1e-04
Error in names(x) <- c("xopt", "obj", "status", "extra", "epgap") :
'names' attribute [5] must be the same length as the vector [1]
Calls: suppressPackageStartupMessages ... FUN -> do.call -> jbaMIP -> Rcplex2 -> .canonicalize
In addition: There were 18 warnings (use warnings() to see them)
Error in .C("Rcplex_free", PACKAGE = "JaBbA") :
"Rcplex_free" not available for .C() for package "JaBbA"
Calls: suppressPackageStartupMessages ... lapply -> FUN -> do.call -> jbaMIP -> Rcplex2 -> .C
Execution halted
Hi Peter - thanks for pointing this out. indeed jba had drifted away from the updated JaBbA arguments ... please try now
I also updated so that empty DEFAULT_GENOME or DEFAULT_BSGENOME environment variable doesn't break ... it just infers from the provided vcf and coverage files.
If you want to set these variables to custom values - use a path or URL to a UCSC style .chrom.sizes file or related format
Let me know if that fixes
On Sun, Jan 20, 2019 at 11:27 AM Peter Waltman notifications@github.com wrote:
In an attempt to fix the issue, I modified the jba rscript so that it set opt$abu and opt$ra to opt$coverage and opt$junctions, respectively, but now, I'm getting a different error - that the junctions file doesn't contain any junctions (see below). However, I checked, and the test junctions file definitely contains breakpoints, so I'm at a bit of a loss at this point.
waltman@waltman-workstation:~/modules/JaBbA_simplified$ JaBbA/jba JaBbA/inst/extdata/junctions.vcf JaBbA/inst/extdata/coverage.txt
( ) (
\ ( ) ( _ ) | | _ _ | (_) )| |_ | (_) | _ | | /'_
)| <'| '\ | _ | ( )_| |( (_| || (_) )| |_) )| | | |
_/'`_,)(_/'(,/'() ()(Junction Balance Analysis)
JaBbA 2019-01-20 11:20:30: Located junction file JaBbA/inst/extdata/junctions.vcf JaBbA 2019-01-20 11:20:30: Located coverage file JaBbA/inst/extdata/coverage.txt JaBbA 2019-01-20 11:20:30: Loading packages ... JaBbA 2019-01-20 11:20:31: Starting analysis in /ifs/scratch/c2b2/ac_lab/pw2470/devel/mskilab/flows/modules/JaBbA_simplified JaBbA 2019-01-20 11:20:31: Read in 0 total junctions JaBbA 2019-01-20 11:20:31: Empty junction input. Will do just one round of optimization. JaBbA 2019-01-20 11:20:31: Importing seg from UCSC format Error: scanVcf: 'indexname' must be character(1) path: JaBbA/inst/extdata/junctions.vcf In addition: There were 16 warnings (use warnings() to see them) Execution halted
my fix:
opt = NULL; if (!is.null(args)) { opt = args$options opt$junctions = args$args[1] opt$coverage = args$args[2] if (!file.exists(opt$junctions)) { message('Did not find junction file ', opt$junctions) print_help(parseobj); stop() } if (!file.exists(opt$coverage)) { message('Did not find coverage file ', opt$coverage) print_help(parseobj); stop() } if ( is.null( opt$abu ) ) opt$abu <- opt$coverage if ( is.null( opt$ra ) ) opt$abu <- opt$junctions } else { stop("See usage.") }
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/mskilab/JaBbA/issues/18#issuecomment-455880557, or mute the thread https://github.com/notifications/unsubscribe-auth/AL-0kWMoCItxtE8b70O2xC8O9kBPWeceks5vFJj4gaJpZM4aJt5H .
@pwaltman Hi Peter, I've cleaned up the argument list in the jba
script. Please try again and the names should be slightly more intuitive now. Thanks for bringing this up.
Great, I'll try the new one shortly, but there was one more fix that I had to make to get it running. the vcf(s) that I'm passing in come from svaba, so they use the bam file as the sample identifier in the vcf, and this causes the call to JaBbA() to fail because the default argument for the sample name (opt$name) is set 'tumor'. I've come up with a workaround of editting the jba rscript to allow a 3rd argument to be passed to jba, which is the bam/sample name; and it seems to be working, for the moment at least (it's currently doing the CBS calculation).
module load R/4.2.1
jba ~/JaBbA-master/inst/extdata/junctions.vcf ~/JaBbA-master/inst/extdata/coverage.txt
I still got the same error when I just run test data, how should fix it? Thanks!
Starting optimization with CPLEX!
Error in names(x) <- c("xopt", "obj", "status", "extra", "epgap") :
'names' attribute [5] must be the same length as the vector [1]
Calls: suppressPackageStartupMessages ... ramip_stub -> jbaLP -> balance ->
Have you forgotten to check in the correct/updated version of the jba executable?
As a test of the install, I installed JaBbA into a new module that I've called "JaBbA_simplified", but it doesn't work with the test data that you provide. Here's the output from that test: