Closed pwaltman closed 5 years ago
Yes, this is a known issue. It originates from the previous default behaviour of gUtils::hg_seqlengths(). We've changed that so now it will respect the actual seqnames of the GRanges object, e.g. if your input coverage has "chr", it will keep it. Please check if the default value of the chr parameter of your gUtils::hg_seqlengths is TRUE.
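One way to run that check is with base R's formals(). This is only a sketch: default_arg and stand_in are hypothetical names made up for illustration, and stand_in merely mimics a function with a chr argument so the snippet runs without gUtils installed.

```r
# Hypothetical helper: return the default value of one argument of a function.
default_arg <- function(fn, arg) {
  formals(fn)[[arg]]
}

# Stand-in with a `chr` argument, used only so this example is self-contained.
stand_in <- function(genome = NULL, chr = TRUE) NULL

chr_default <- default_arg(stand_in, "chr")
print(chr_default)

# With gUtils installed, the same call on the real function would be
# (assuming its argument is named `chr`, as described above):
#   default_arg(gUtils::hg_seqlengths, "chr")
```

If the argument does not exist in your installed version, the helper returns NULL rather than failing.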
Meanwhile, Trent in the group is gonna publish a formal R package fragCounter. I'll redirect this issue to that repo.
Ok, the issue is that fragCount has already stripped the 'chr' from the names. I guess I can re-add them back in, and re-save the rds files - although that's a pain in the neck.
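For anyone taking the same workaround, a minimal sketch of re-adding the prefix could look like this. add_chr_prefix is a hypothetical helper, not part of fragCounter or JaBbA, and the file names in the comments are placeholders. It does not handle the mitochondrial naming difference (MT vs chrM).

```r
# Hypothetical helper: add "chr" to any chromosome name that lacks it.
add_chr_prefix <- function(x) {
  x <- as.character(x)
  needs_prefix <- !startsWith(x, "chr")
  x[needs_prefix] <- paste0("chr", x[needs_prefix])
  x
}

fixed <- add_chr_prefix(c("1", "2", "X", "chr3"))
print(fixed)

# Applied to a saved coverage object, assuming it is a GRanges and that
# GenomeInfoDb is installed (file names are placeholders):
#   cov <- readRDS("cov.rds")
#   GenomeInfoDb::seqlevels(cov) <- add_chr_prefix(GenomeInfoDb::seqlevels(cov))
#   saveRDS(cov, "cov.chr.rds")
```

Names that already carry the prefix are left untouched, so running it twice on the same file is harmless.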
Not sure if I should put this here or in the flows repo, but fragCounter automatically strips out 'chr' from the chromosome names of the regions in the final cov.rds file that it produces. This causes issues if one aligns their samples to Broad's hg38 genome, which does include the 'chr' in the names - as the vcf's produced by any breakpoint caller will use those chr-prefixed chromosomes.
As a result, JaBbA strips out all of the breakpoints because it 'thinks' that they fall in regions with NA coverage. If you're going to standardize on using the non-'chr'-prefixed chromosome names, maybe JaBbA should automatically make that adjustment to the breakpoints it reads in (?). Since it looks like the Broad has now standardized on using hg38 for all of their tools in GATK v.4.x, this will increasingly be an issue.
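The kind of automatic adjustment suggested here could be sketched roughly as follows. This is not how JaBbA actually handles it; uses_chr_prefix and match_chr_style are hypothetical helpers that work on plain character vectors of chromosome names, and mitochondrial naming (MT vs chrM) is ignored.

```r
# Hypothetical helper: does any name in this set carry the "chr" prefix?
uses_chr_prefix <- function(x) {
  any(startsWith(as.character(x), "chr"))
}

# Hypothetical helper: rewrite `x` to follow the naming style of `reference`.
match_chr_style <- function(x, reference) {
  x <- as.character(x)
  if (length(x) == 0) return(character(0))
  bare <- sub("^chr", "", x)  # drop the prefix where present
  if (uses_chr_prefix(reference)) paste0("chr", bare) else bare
}

coverage_names   <- c("1", "2", "X")            # as written to cov.rds
breakpoint_names <- c("chr1", "chr2", "chrX")   # as found in an hg38 VCF

harmonized <- match_chr_style(breakpoint_names, reference = coverage_names)
print(harmonized)
```

Taking the coverage as the reference means the breakpoints follow whichever convention the coverage file uses, so the same step works whether or not the prefix was stripped upstream.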