lima1 / PureCN

Copy number calling and variant classification using targeted short read sequencing
https://bioconductor.org/packages/devel/bioc/html/PureCN.html
Artistic License 2.0

Error in correctCoverageBias #73

Closed kdkorthauer closed 4 years ago

kdkorthauer commented 5 years ago

Hi,

I'm getting this error (see below) when running the correctCoverageBias function. I tried modifying the correctCoverageBias.R source code in line 257 (in the internal function .correctCoverageBiasLoess, where the 'rough' loess prediction is carried out) to increase the span (from 0.03 to 0.10 for example) as the error message suggests and this fixes the issue.

Is this span something that the user should be able to modify through an argument of correctCoverageBias?
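
For illustration, a minimal base R sketch of a GC normalization step with the loess span exposed as an argument. The helper `normalize_by_gc` and the simulated data are hypothetical; this is not PureCN's `correctCoverageBias` implementation, only an indication of what a user-adjustable span could look like.

```r
# Hypothetical helper, not part of PureCN: divide coverage by a loess fit on
# GC content and rescale to the median coverage. The span is user-adjustable.
normalize_by_gc <- function(coverage, gc, span = 0.03) {
  fit <- loess(coverage ~ gc, span = span)
  expected <- predict(fit, newdata = data.frame(gc = gc))
  coverage / expected * median(coverage)
}

# Simulated intervals with a smooth GC-dependent coverage bias (assumed data).
set.seed(1)
gc <- runif(2000, min = 0.2, max = 0.8)
coverage <- 100 * (1 - (gc - 0.5)^2) + rnorm(2000, sd = 5)

# A wider span than the 0.03 mentioned above, as tried in the report.
normalized <- normalize_by_gc(coverage, gc, span = 0.1)
```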

I got the same results with both the Bioc release and Github master versions of PureCN.

Error in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)),  : 
  NA/NaN/Inf in foreign function call (arg 5)
In addition: Warning messages:
1: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  zero-width neighborhood. make span bigger
2: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  zero-width neighborhood. make span bigger
3: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  pseudoinverse used at 0.39735
4: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  neighborhood radius 8.4904e-05
5: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  reciprocal condition number  0
6: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  zero-width neighborhood. make span bigger
7: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  zero-width neighborhood. make span bigger
8: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  zero-width neighborhood. make span bigger
9: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  There are other near singularities as well. 4.8025e-09

Best, Keegan

lima1 commented 5 years ago

Hi Keegan,

Thanks for the report. Are you able to share the coverage file and the interval file for debugging purposes (by email if you prefer)? I've never seen this error before.

Thanks, Markus

kdkorthauer commented 5 years ago

Hi Markus,

I've placed example coverage and interval files on Dropbox - here is a reprex that shows the results I'm getting. Let me know if you need any more detail on how I generated the coverage and interval files by following the steps in the PureCN vignette.

Best, Keegan

library(PureCN)
#> Loading required package: DNAcopy
#> Loading required package: VariantAnnotation
#> Loading required package: BiocGenerics
#> Loading required package: parallel
#> 
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:parallel':
#> 
#>     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
#>     clusterExport, clusterMap, parApply, parCapply, parLapply,
#>     parLapplyLB, parRapply, parSapply, parSapplyLB
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#> 
#>     anyDuplicated, append, as.data.frame, basename, cbind,
#>     colMeans, colnames, colSums, dirname, do.call, duplicated,
#>     eval, evalq, Filter, Find, get, grep, grepl, intersect,
#>     is.unsorted, lapply, Map, mapply, match, mget, order, paste,
#>     pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce,
#>     rowMeans, rownames, rowSums, sapply, setdiff, sort, table,
#>     tapply, union, unique, unsplit, which, which.max, which.min
#> Loading required package: GenomeInfoDb
#> Loading required package: S4Vectors
#> Loading required package: stats4
#> 
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:base':
#> 
#>     expand.grid
#> Loading required package: IRanges
#> Loading required package: GenomicRanges
#> Loading required package: SummarizedExperiment
#> Loading required package: Biobase
#> Welcome to Bioconductor
#> 
#>     Vignettes contain introductory material; view with
#>     'browseVignettes()'. To cite Bioconductor, see
#>     'citation("Biobase")', and for packages 'citation("pkgname")'.
#> Loading required package: DelayedArray
#> Loading required package: matrixStats
#> 
#> Attaching package: 'matrixStats'
#> The following objects are masked from 'package:Biobase':
#> 
#>     anyMissing, rowMedians
#> Loading required package: BiocParallel
#> 
#> Attaching package: 'DelayedArray'
#> The following objects are masked from 'package:matrixStats':
#> 
#>     colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges
#> The following objects are masked from 'package:base':
#> 
#>     aperm, apply, rowsum
#> Loading required package: Rsamtools
#> Loading required package: Biostrings
#> Loading required package: XVector
#> 
#> Attaching package: 'Biostrings'
#> The following object is masked from 'package:DelayedArray':
#> 
#>     type
#> The following object is masked from 'package:base':
#> 
#>     strsplit
#> 
#> Attaching package: 'VariantAnnotation'
#> The following object is masked from 'package:base':
#> 
#>     tabulate
#> Registered S3 methods overwritten by 'ggplot2':
#>   method         from 
#>   [.quosures     rlang
#>   c.quosures     rlang
#>   print.quosures rlang

download.file(url = "https://www.dropbox.com/s/81sk8icrgb3wicg/coverage.txt?dl=1",
              destfile = "coverage.txt")

download.file(url = "https://www.dropbox.com/s/876k0a9bov733ty/intervals.txt?dl=1",
              destfile = "intervals.txt")

correctCoverageBias(coverage.file = "coverage.txt", 
                    interval.file = "intervals.txt")
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : zero-width neighborhood. make span bigger

#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : zero-width neighborhood. make span bigger
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : pseudoinverse used at 0.39735
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : neighborhood radius 8.4904e-05
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : reciprocal condition number 0
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : zero-width neighborhood. make span bigger

#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : zero-width neighborhood. make span bigger

#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : zero-width neighborhood. make span bigger
#> Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
#> parametric, : There are other near singularities as well. 4.8025e-09
#> Error in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)), : NA/NaN/Inf in foreign function call (arg 5)

sessionInfo()
#> R Under development (unstable) (2019-02-04 r76055)
#> Platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running under: macOS Mojave 10.14.2
#> 
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats4    parallel  stats     graphics  grDevices utils     datasets 
#> [8] methods   base     
#> 
#> other attached packages:
#>  [1] PureCN_1.13.18              VariantAnnotation_1.29.19  
#>  [3] Rsamtools_1.99.2            Biostrings_2.51.2          
#>  [5] XVector_0.23.0              SummarizedExperiment_1.13.0
#>  [7] DelayedArray_0.9.8          BiocParallel_1.17.14       
#>  [9] matrixStats_0.54.0          Biobase_2.43.1             
#> [11] GenomicRanges_1.35.1        GenomeInfoDb_1.19.2        
#> [13] IRanges_2.17.4              S4Vectors_0.21.10          
#> [15] BiocGenerics_0.29.1         DNAcopy_1.57.0             
#> 
#> loaded via a namespace (and not attached):
#>  [1] httr_1.4.0               VGAM_1.1-1              
#>  [3] bit64_0.9-7              splines_3.6.0           
#>  [5] assertthat_0.2.0         highr_0.7               
#>  [7] blob_1.1.1               BSgenome_1.51.0         
#>  [9] GenomeInfoDbData_1.2.0   yaml_2.2.0              
#> [11] progress_1.2.0           pillar_1.3.1            
#> [13] RSQLite_2.1.1            lattice_0.20-38         
#> [15] glue_1.3.0               digest_0.6.18           
#> [17] RColorBrewer_1.1-2       colorspace_1.4-0        
#> [19] htmltools_0.3.6          Matrix_1.2-15           
#> [21] plyr_1.8.4               XML_3.98-1.17           
#> [23] pkgconfig_2.0.2          biomaRt_2.39.2          
#> [25] zlibbioc_1.29.0          purrr_0.3.0             
#> [27] scales_1.0.0             tibble_2.0.1            
#> [29] ggplot2_3.1.0            GenomicFeatures_1.35.7  
#> [31] lazyeval_0.2.1           magrittr_1.5            
#> [33] crayon_1.3.4             memoise_1.1.0           
#> [35] evaluate_0.13            tools_3.6.0             
#> [37] data.table_1.12.0        prettyunits_1.0.2       
#> [39] hms_0.4.2                formatR_1.5             
#> [41] stringr_1.4.0            Rhdf5lib_1.5.1          
#> [43] munsell_0.5.0            AnnotationDbi_1.45.0    
#> [45] lambda.r_1.2.3           compiler_3.6.0          
#> [47] rlang_0.3.1              rhdf5_2.27.12           
#> [49] futile.logger_1.4.3      grid_3.6.0              
#> [51] RCurl_1.95-4.11          bitops_1.0-6            
#> [53] rmarkdown_1.11           gtable_0.2.0            
#> [55] DBI_1.0.0                R6_2.4.0                
#> [57] gridExtra_2.3            GenomicAlignments_1.19.1
#> [59] knitr_1.21               dplyr_0.8.0.1           
#> [61] rtracklayer_1.43.1       bit_1.1-14              
#> [63] futile.options_1.0.1     stringi_1.3.1           
#> [65] Rcpp_1.0.0               tidyselect_0.2.5        
#> [67] xfun_0.4
BiocManager::valid()
#> [1] TRUE

Created on 2019-02-20 by the reprex package (v0.2.1)

lima1 commented 5 years ago

Thanks, that's perfect. If you don't mind, can you try generating the interval file with the IntervalFile.R script from the latest GitHub version, as described in the Quick start vignette (https://bioconductor.org/packages/devel/bioc/vignettes/PureCN/inst/doc/Quick.pdf)? It probably won't change this particular problem, but there are a few more sanity checks that might point to the source of the problem.

kdkorthauer commented 5 years ago

Hi Markus,

I tried generating the interval file using the IntervalFile.R script instead as you suggest (taking care to try and use the same options). I still get the error in predLoess as before.

Best, Keegan

lima1 commented 5 years ago

Thanks Keegan. I'll try to figure out how to make this more robust. Is this happening with all or most of your samples? The one you sent me doesn't look like a catastrophic QC failure though.

kdkorthauer commented 5 years ago

Happy to help. It is happening on all 6 of my samples in this particular dataset.

Best, Keegan

lima1 commented 5 years ago

Hi Keegan,

I think this happens because thousands of regions have exactly the same GC content around 0.4. I haven't worked out whether this is expected with shorter baits of 78 bp. I don't see this in my bait files, but those are Agilent baits and significantly longer.
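
A rough way to check for this in an interval file is sketched below in base R. It assumes the GC values have already been read into a numeric vector; the helper names are hypothetical. Loess fits each local regression on roughly `span` times n nearest points, so if a single GC value is shared by at least that fraction of intervals, a neighborhood can collapse to zero width, which would be consistent with the warnings above. Treat this as a heuristic, not an exact reproduction of the loess internals.

```r
# Fraction of intervals that share the single most common GC value.
max_tie_fraction <- function(gc) {
  gc <- gc[!is.na(gc)]
  max(table(gc)) / length(gc)
}

# Heuristic: the span is at risk if one tied value fills a whole neighborhood.
span_is_too_small <- function(gc, span = 0.03) {
  max_tie_fraction(gc) >= span
}

# Toy data (assumed): 5% of intervals have a GC content of exactly 0.4.
gc <- c(rep(0.4, 50), seq(0.2, 0.7, length.out = 950))
risky_default <- span_is_too_small(gc, span = 0.03)
risky_wider <- span_is_too_small(gc, span = 0.10)
```

With this toy vector the check flags the 0.03 span but not the 0.10 span.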

I'm fairly sure it does, but does the genome version of the baits file match the reference FASTA file? Can you say more about the assay? Does it perhaps contain baits pre-selected for GC content?

If you add min.target.width = 100 or 120 to preprocessIntervals (or --mintargetwidth in IntervalFile.R), does it work? This will automatically increase the size of the intervals to that value.
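
A sketch of the suggested call, using the small example files that ship with PureCN rather than the files from this report. The block only runs the PureCN part when the package is installed; the output path is a temporary file chosen for illustration.

```r
# Minimum target width suggested above.
min_width <- 100
intervals <- NULL

if (requireNamespace("PureCN", quietly = TRUE)) {
  # Example reference and baits bundled with the package.
  reference.file <- system.file("extdata", "ex2_reference.fa",
                                package = "PureCN", mustWork = TRUE)
  bed.file <- system.file("extdata", "ex2_intervals.bed",
                          package = "PureCN", mustWork = TRUE)
  # Targets narrower than min_width are resized to that width.
  intervals <- PureCN::preprocessIntervals(bed.file, reference.file,
                                           output.file = tempfile(fileext = ".txt"),
                                           min.target.width = min_width)
}
```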

Best, Markus

kdkorthauer commented 5 years ago

Hi Markus,

Interesting. These are a set of Illumina baits for hybrid capture used by the Broad. The protocol is one of their standards for whole exome sequencing, and I'm not aware of the baits being pre-selected for GC content. Yes, I am also using a reference genome version that matches the baits.

Setting min.target.width = 100 in preprocessIntervals seems to have done the trick - I've only tested so far on one sample. I'll go ahead and run the rest to see if the error occurs.

Should increasing the targets to that minimum width impact the downstream analysis of CN/purity in any way?

Thanks for looking into this so quickly!

Best, Keegan

lima1 commented 5 years ago

Great, so the issue is that the grid of available GC content values is too coarse for the default span value. With longer baits, there are of course more possible GC contents, but there is also a higher chance that probes overlap and then are merged. I'll figure out how to deal with this properly, thanks again for the report!
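
The coarseness of the grid follows from simple counting: an interval of width w can only take w + 1 distinct GC fractions. A short base R sketch (the helper name is hypothetical):

```r
# All GC fractions an interval of the given width can take: 0/w, 1/w, ..., w/w.
possible_gc <- function(width) (0:width) / width

n78 <- length(possible_gc(78))    # 78 bp baits, as in this report
n100 <- length(possible_gc(100))  # after resizing to a minimum width of 100
step78 <- 1 / 78                  # spacing between neighboring GC values at 78 bp
```

When every bait has the same width, all intervals fall onto this small set of values, which is why many of them can share exactly the same GC content.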

There shouldn't be an issue at all if you set it to 100. The average coverage will be slightly lower than you expect, but the difference should be minimal. You can check the output file _variants.csv (or the VCF), which marks all variants falling in flanking regions. The default is to include all SNVs in the 50 bp flanking regions. If the coverage is low and you see lots of noisy SNPs in the flanking regions, you can decrease the value to 30 or 40.

Make sure to get as many normal samples as possible and remove the mitochondria baits.

Let me know in case you run into more issues.

Markus

kdkorthauer commented 5 years ago

Good to know. I haven't seen the _variants.csv output file. Which function will produce this?

I'm not sure what you mean by get as many normal samples as possible. Is that regarding a tumor-only analysis? I am actually using matched normals for all of my samples.

Thanks for the reminder to remove MT baits! I'll let you know if I run into any more issues with these baits.

Best, Keegan

lima1 commented 5 years ago

_variants.csv will be generated by PureCN.R as described in the Quick vignette.

It's better to normalize coverage against a pool of normals than against matched normals. Check out the GATK4 somatic CNA workflow; we do pretty much the same thing. They have some nice figures in their tutorial comparing normalization by matched normals vs. pools. So I would recommend using all the normals you have to generate the pool (following the Quick vignette). Don't provide the matched normal coverage as normal.

If you have matched normals and call variants with Mutect in matched normal mode, you don't necessarily need the mapping bias steps in the Quick vignette (--normal_panel in NormalDB.R).

lima1 commented 5 years ago

Closing now, will likely set default of min.target.width to 100. Version 1.15.3 fixes a bug in min.target.width: if probes overlap after resizing, they were not merged.
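
To illustrate why merging matters, here is a base R sketch of resizing followed by merging on a single chromosome. The helpers `resize_small` and `merge_overlaps` and the coordinates are hypothetical and are not PureCN's implementation; they only show how neighboring short baits can start to overlap once widened.

```r
# Widen intervals shorter than min_width around their midpoint
# (1-based, inclusive coordinates).
resize_small <- function(start, end, min_width = 100) {
  width <- end - start + 1
  small <- width < min_width
  pad_left <- floor((min_width - width) / 2)
  new_start <- ifelse(small, pmax(1, start - pad_left), start)
  new_end <- ifelse(small, new_start + min_width - 1, end)
  data.frame(start = new_start, end = new_end)
}

# Merge overlapping intervals on one chromosome.
merge_overlaps <- function(iv) {
  iv <- iv[order(iv$start, iv$end), ]
  # A new group starts when an interval begins after all earlier ones have ended.
  prev_max_end <- c(-Inf, head(cummax(iv$end), -1))
  group <- cumsum(iv$start > prev_max_end)
  data.frame(start = as.numeric(tapply(iv$start, group, min)),
             end = as.numeric(tapply(iv$end, group, max)))
}

# Three 78 bp baits; the first two are 12 bp apart before resizing.
baits <- data.frame(start = c(1000, 1090, 5000), end = c(1077, 1167, 5077))
resized <- resize_small(baits$start, baits$end, min_width = 100)
merged <- merge_overlaps(resized)
```

In this toy case the first two baits overlap after resizing, so merging leaves two intervals instead of three.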

Shenglai commented 5 years ago

Hi @lima1, sorry for digging this issue up, but I've observed that after setting min.target.width to 100, my final normalDB.rds and interval weight file are much larger. I'm wondering whether I should be concerned about this and whether it would affect downstream performance.

Shenglai commented 5 years ago

For example,

I have a normalDB without specifying the min.target.width, created by purecn 1.11.11

77.0 MiB normalDB_CPTAC-3-38_nexterarapidcapture_exome_targetedregions_v1.2_hg38.rds
7.5 MiB interval_weights_nexterarapidcapture_exome_targetedregions_v1.2_hg38.txt

However, when I use purecn 1.14.3, I somehow have to set the value to 100 to get the coverage. And now I have

1.3 GiB nexterarapidcaptureexomev1.2.cptac3.hg38.normalDB.rds
153.5 MiB nexterarapidcaptureexomev1.2.cptac3.hg38.interval_weights.txt
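
For scale, the fold increase implied by the file sizes quoted above can be computed directly (sizes copied from the listings, with 1 GiB = 1024 MiB):

```r
mib_per_gib <- 1024

# New size divided by old size, both in MiB.
fold_increase <- c(
  normalDB = 1.3 * mib_per_gib / 77.0,
  interval_weights = 153.5 / 7.5
)
rounded <- round(fold_increase, 1)
# normalDB: 17.3, interval_weights: 20.5
```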

lima1 commented 5 years ago

Hi @Shenglai,

Yes, you should be concerned about this.

Could there be a mix-up of PureCN versions? The change of default in min.target.width happened in version 1.16. There was a bug in versions < 1.14.2 that resulted in overlapping regions not being merged after the resizing. That will increase the number of regions in the coverage files.

Are you sure that --mintargetwidth 100 in 1.14.3 causes this increase? Is it maybe possible that some files were generated with 1.11.11? If yes, can you send the output of preprocessIntervals (or IntervalFile.R)?

Best, Markus

Shenglai commented 5 years ago

I'm pretty confident that all processing was done in 1.14.3. The interval output from IntervalFile.R is too large to upload here.

-rw-r--r--  1 Li  staff    18M Nov 12 16:38 interval_without_mintargetwidth.txt
-rw-r--r--  1 Li  staff   248M Nov 12 16:32 intervals_with_mintargetwidth.txt

Let me know how I could submit these intervals to you.

Here's the log from each run.

With --mintargetwidth 100:

INFO [2019-11-12 22:18:53] Loading /var/lib/cwl/stg10bc6071-1022-4bf9-842d-85872a2ef387/GRCh38.d1.vd1.100.bigWig...
INFO [2019-11-12 22:22:23] Loading PureCN 1.14.3...
INFO [2019-11-12 22:22:23] Processing /var/lib/cwl/stg28a6ab9b-6257-4f26-94d9-14998b8de05b/NexteraRapidCapture_Exome_Probes_v1.2.hg38.bed...
WARN [2019-11-12 22:22:32] Found small target regions (< 100bp). Will resize them.
INFO [2019-11-12 22:23:17] Splitting 19504 large targets to an average width of 400.
INFO [2019-11-12 22:23:36] Tiling off-target regions to an average width of 200000.
INFO [2019-11-12 22:23:37] Removing following contigs from off-target regions: chr4_GL000008v2_random,chr14_GL000009v2_random,chr14_GL000225v1_random,chr15_KI270727v1_random,chr16_KI270728v1_random,chr17_KI270729v1_random,chrUn_KI270442v1,chrUn_KI270743v1,CMV
INFO [2019-11-12 22:26:15] Removing 501105 intervals with low mappability score (<0.50).
WARN [2019-11-12 22:26:16] No reptiming scores provided.
INFO [2019-11-12 22:26:16] Calculating GC-content...
Loading required package: TxDb.Hsapiens.UCSC.hg38.knownGene
Loading required package: GenomicFeatures
Loading required package: AnnotationDbi
Loading required package: org.Hs.eg.db

WARN [2019-11-12 22:31:38] Attempted adding gene symbols to intervals. Heuristics have been used to pick symbols for overlapping genes.
Warning messages:
1: In .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr1_GL383518v1_alt, chr1_GL383519v1_alt, chr1_GL383520v2_alt, chr1_KI270759v1_alt, chr1_KI270760v1_alt, chr1_KI270761v1_alt, chr1_KI270762v1_alt, chr1_KI270763v1_alt, chr1_KI270764v1_alt, chr1_KI270765v1_alt, chr1_KI270766v1_alt, chr1_KI270892v1_alt, chr1_KN196472v1_fix, chr1_KN196473v1_fix, chr1_KN196474v1_fix, chr1_KN538360v1_fix, chr1_KN538361v1_fix, chr1_KQ031383v1_fix, chr1_KQ458382v1_alt, chr1_KQ458383v1_alt, chr1_KQ458384v1_alt, chr1_KQ983255v1_alt, chr1_KV880763v1_alt, chr1_KZ208904v1_alt, chr1_KZ208905v1_alt, chr1_KZ208906v1_fix, chr1_KZ559100v1_fix, chr2_GL383521v1_alt, chr2_GL383522v1_alt, chr2_GL582966v2_alt, chr2_KI270767v1_alt, chr2_KI270768v1_alt, chr2_KI270769v1_alt, chr2_KI270770v1_alt, chr2_KI270771v1_alt, chr2_KI270772v1_alt, chr2_KI270773v1_alt, chr2_KI270774v1_alt, chr2_KI270775v1_alt, chr2_KI270776v1_alt, chr2_KI270893v1_alt, chr2_KI270894v1_alt, chr2_KN538362v1_fix, chr2_KN538363v1_ [... truncated]
2: In .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr1_GL383518v1_alt, chr1_GL383519v1_alt, chr1_GL383520v2_alt, chr1_KI270759v1_alt, chr1_KI270760v1_alt, chr1_KI270761v1_alt, chr1_KI270762v1_alt, chr1_KI270763v1_alt, chr1_KI270764v1_alt, chr1_KI270765v1_alt, chr1_KI270766v1_alt, chr1_KI270892v1_alt, chr1_KN196472v1_fix, chr1_KN196473v1_fix, chr1_KN196474v1_fix, chr1_KN538360v1_fix, chr1_KN538361v1_fix, chr1_KQ031383v1_fix, chr1_KQ458382v1_alt, chr1_KQ458383v1_alt, chr1_KQ458384v1_alt, chr1_KQ983255v1_alt, chr1_KV880763v1_alt, chr1_KZ208904v1_alt, chr1_KZ208905v1_alt, chr1_KZ208906v1_fix, chr1_KZ559100v1_fix, chr2_GL383521v1_alt, chr2_GL383522v1_alt, chr2_GL582966v2_alt, chr2_KI270767v1_alt, chr2_KI270768v1_alt, chr2_KI270769v1_alt, chr2_KI270770v1_alt, chr2_KI270771v1_alt, chr2_KI270772v1_alt, chr2_KI270773v1_alt, chr2_KI270774v1_alt, chr2_KI270775v1_alt, chr2_KI270776v1_alt, chr2_KI270893v1_alt, chr2_KI270894v1_alt, chr2_KN538362v1_fix, chr2_KN538363v1_ [... truncated]
3: In .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': CMV, HBV, HCV-1, HCV-2, HIV-1, HIV-2, HPV-mCG2, HPV-mCG3, HPV-mCH2, HPV-mFD1, HPV-mFD2, HPV-mFS1, HPV-mFi864, HPV-mKC5, HPV-mKN1, HPV-mKN2, HPV-mKN3, HPV-mL55, HPV-mRTRX7, HPV-mSD2, HPV1, HPV10, HPV100, HPV101, HPV102, HPV103, HPV104, HPV105, HPV106, HPV107, HPV108, HPV109, HPV11, HPV110, HPV111, HPV112, HPV113, HPV114, HPV115, HPV116, HPV117, HPV118, HPV119, HPV12, HPV120, HPV121, HPV122, HPV123, HPV124, HPV125, HPV126, HPV127, HPV128, HPV129, HPV13, HPV130, HPV131, HPV132, HPV133, HPV134, HPV135, HPV136, HPV137, HPV138, HPV139, HPV14, HPV140, HPV141, HPV142, HPV143, HPV144, HPV145, HPV146, HPV147, HPV148, HPV149, HPV15, HPV150, HPV151, HPV152, HPV153, HPV154, HPV155, HPV156, HPV159, HPV16, HPV160, HPV161, HPV162, HPV163, HPV164, HPV165, HPV166, HPV167, HPV168, HPV169, HPV17, HPV170, HPV171, HPV172, HPV173, HPV174, HPV175, HPV178, HPV179, HPV18, HPV180, HPV184, HPV19, HPV197, HPV199, HPV2, HPV20, HPV21, H [... truncated]
4: In .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': CMV, HBV, HCV-1, HCV-2, HIV-1, HIV-2, HPV-mCG2, HPV-mCG3, HPV-mCH2, HPV-mFD1, HPV-mFD2, HPV-mFS1, HPV-mFi864, HPV-mKC5, HPV-mKN1, HPV-mKN2, HPV-mKN3, HPV-mL55, HPV-mRTRX7, HPV-mSD2, HPV1, HPV10, HPV100, HPV101, HPV102, HPV103, HPV104, HPV105, HPV106, HPV107, HPV108, HPV109, HPV11, HPV110, HPV111, HPV112, HPV113, HPV114, HPV115, HPV116, HPV117, HPV118, HPV119, HPV12, HPV120, HPV121, HPV122, HPV123, HPV124, HPV125, HPV126, HPV127, HPV128, HPV129, HPV13, HPV130, HPV131, HPV132, HPV133, HPV134, HPV135, HPV136, HPV137, HPV138, HPV139, HPV14, HPV140, HPV141, HPV142, HPV143, HPV144, HPV145, HPV146, HPV147, HPV148, HPV149, HPV15, HPV150, HPV151, HPV152, HPV153, HPV154, HPV155, HPV156, HPV159, HPV16, HPV160, HPV161, HPV162, HPV163, HPV164, HPV165, HPV166, HPV167, HPV168, HPV169, HPV17, HPV170, HPV171, HPV172, HPV173, HPV174, HPV175, HPV178, HPV179, HPV18, HPV180, HPV184, HPV19, HPV197, HPV199, HPV2, HPV20, HPV21, H [... truncated]

Without the --mintargetwidth parameter:

INFO [2019-11-12 22:35:52] Loading /var/lib/cwl/stg6a7184c9-de94-4e48-9e64-b8277dd03b8d/GRCh38.d1.vd1.100.bigWig...
INFO [2019-11-12 22:36:51] Loading PureCN 1.14.3...
INFO [2019-11-12 22:36:51] Processing /var/lib/cwl/stg3b969460-5542-49f1-a168-1b59b9231130/NexteraRapidCapture_Exome_Probes_v1.2.hg38.bed...
WARN [2019-11-12 22:37:03] Intervals contain off-target regions. Will not change intervals.
WARN [2019-11-12 22:37:30] No reptiming scores provided.
INFO [2019-11-12 22:37:30] Calculating GC-content...
Loading required package: TxDb.Hsapiens.UCSC.hg38.knownGene
Loading required package: GenomicFeatures
Loading required package: AnnotationDbi
Loading required package: org.Hs.eg.db

WARN [2019-11-12 22:38:47] Attempted adding gene symbols to intervals. Heuristics have been used to pick symbols for overlapping genes.

lima1 commented 5 years ago

Not good. Can you share NexteraRapidCapture_Exome_Probes_v1.2.hg38.bed by email? I should be able to reproduce. Sorry for the inconvenience.

Shenglai commented 5 years ago

Sure. Thanks for the help.

Shenglai

Shenglai commented 5 years ago

Hmm, it seems there might be something wrong with my input. I should have used the file that I just sent you, but I did not. Let me check on my end whether the error is reproducible with the correct baits file. Sorry for the confusion.

lima1 commented 4 years ago

Closing again; feel free to reopen in case this wasn't an input file mix-up. Also, if you think this should have been caught with a warning or error message by checking the input more thoroughly, feel free to open an issue suggesting that check.