Then, the code in the line referenced above will only populate the from/to columns until chr22, afterwards, they will be populated with NA, as so:
chrom k bps N from to
1: chr21 7 49 7974 1 49
2: chr21 7 71 7974 50 71
3: chr21 7 74 7974 72 74
4: chr21 7 88 7974 75 88
5: chr21 7 91 7974 89 91
6: chr21 7 204 7974 92 204
7: chr21 7 233 7974 205 233
8: chr22 1 254 8208 NA NA
9: chrX 55 6 14385 NA NA
10: chrX 55 25 14385 NA NA
11: chrX 55 27 14385 NA NA
12: chrX 55 33 14385 NA NA
13: chrX 55 35 14385 NA NA
14: chrX 55 39 14385 NA NA
15: chrX 55 47 14385 NA NA
16: chrX 55 49 14385 NA NA
17: chrX 55 52 14385 NA NA
18: chrX 55 83 14385 NA NA
19: chrX 55 85 14385 NA NA
20: chrX 55 86 14385 NA NA
21: chrX 55 87 14385 NA NA
22: chrX 55 161 14385 NA NA
23: chrX 55 172 14385 NA NA
24: chrX 55 174 14385 NA NA
25: chrX 55 176 14385 NA NA
26: chrX 55 212 14385 NA NA
27: chrX 55 213 14385 NA NA
28: chrX 55 215 14385 NA NA
29: chrX 55 224 14385 NA NA
30: chrX 55 230 14385 NA NA
31: chrX 55 231 14385 NA NA
32: chrX 55 244 14385 NA NA
33: chrX 55 247 14385 NA NA
34: chrX 55 261 14385 NA NA
35: chrX 55 268 14385 NA NA
36: chrX 55 365 14385 NA NA
37: chrX 55 366 14385 NA NA
38: chrX 55 373 14385 NA NA
39: chrX 55 400 14385 NA NA
40: chrX 55 401 14385 NA NA
41: chrX 55 409 14385 NA NA
42: chrX 55 410 14385 NA NA
43: chrX 55 443 14385 NA NA
44: chrX 55 454 14385 NA NA
45: chrX 55 455 14385 NA NA
46: chrX 55 556 14385 NA NA
47: chrX 55 557 14385 NA NA
48: chrX 55 641 14385 NA NA
49: chrX 55 642 14385 NA NA
50: chrX 55 645 14385 NA NA
51: chrX 55 685 14385 NA NA
52: chrX 55 700 14385 NA NA
53: chrX 55 701 14385 NA NA
54: chrX 55 702 14385 NA NA
55: chrX 55 703 14385 NA NA
56: chrX 55 705 14385 NA NA
57: chrX 55 713 14385 NA NA
58: chrX 55 718 14385 NA NA
59: chrX 55 725 14385 NA NA
60: chrX 55 747 14385 NA NA
61: chrX 55 748 14385 NA NA
62: chrX 55 749 14385 NA NA
63: chrX 55 780 14385 NA NA
chrom k bps N from to
This is because bps[1:(.N - 1)] will return bps[1:0] for chr22, as so:
> seg[chrom=="chr22", .(from = c(1, bps[1:(.N-1)] + 1), to = bps)]
from to
1: 1 254
2: 255 254
Proposed fix is to replace bps[1:(.N - 1)] with bps[-length(bps)]:
> seg[chrom=="chr22", .(from = c(1, bps[-length(bps)] + 1), to = bps)]
from to
1: 1 254
Contact Details
petr.smirnov@embl.de
What happened?
In the following line: https://github.com/friendsofstrandseq/mosaicatcher-pipeline/blob/3422078530b95fb2b403bfa22d164f5892bedff1/workflow/scripts/plotting/plot-sv-calls_dev.R#L375
Suppose you have a seg object where a chromosome has a single segment, for example (truncated to remove the chromosomes not affected by issue):
Then, the code in the line referenced above will only populate the from/to columns until chr22, afterwards, they will be populated with NA, as so:
This is because
bps[1:(.N - 1)]
will returnbps[1:0]
for chr22, as so:Proposed fix is to replace
bps[1:(.N - 1)]
withbps[-length(bps)]
:Steps To Reproduce
I am using R 4.2.2
Load data.table package
Run this to create example object:
Mosaicatcher-pipeline Version
1.5.1 (Default)
Command used
How did you run the pipeline?
Conda only
What did you use to run the pipeline? (local execution, HPC, cloud)
EMBL HPC
Pipeline configuration file
Relevant log output
Anything else?
No response