broadinstitute / ichorCNA

Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing.
GNU General Public License v3.0

The style does not have a compatible entry for the species supported by Seqname #77

Open ury opened 4 years ago

ury commented 4 years ago

Hi, I'm trying to execute the following commands:

readCounter --chromosome chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX --window 1000000 --quality 20 test.bam > test.wig

Rscript /opt/bin/ichorCNA/scripts/runIchorCNA.R \
 --libdir /opt/bin/ichorCNA \
 --id test \
 --WIG test.wig \
 --gcWig /opt/bin/ichorCNA/inst/extdata/gc_hg38_1000kb.wig \
 --mapWig /opt/bin/ichorCNA/inst/extdata/map_hg38_1000kb.wig \
 --ploidy 2 \
 --normal "c(0.2,0.3,0.4,0.5,0.6)" \
 --normalPanel /opt/bin/ichorCNA/inst/extdata/HD_ULP_PoN_hg38_1Mb_median_normAutosome_median.rds \
 --maxCN 4 \
 --includeHOMD FALSE \
 --chrs "c(1:22)" \
 --chrTrain "c(1:22)" \
 --estimateNormal TRUE \
 --estimateScPrevalence TRUE \
 --scStates "c()" \
 --txnE 0.9999 \
 --txnStrength 10000 \
 --genomeStyle UCSC \
 --centromere /opt/bin/ichorCNA/inst/extdata/GRCh38.GCA_000001405.2_centromere_acen.txt \
 --outDir ./

I'm getting the following error:

Error in seqlevelsStyle(as.character(x)) :
  The style does not have a compatible entry for the species supported by
  Seqname. Please see genomeStyles() for supported species/style
Calls: loadReadCountsFromWig ... setGenomeStyle -> %in% -> seqlevelsStyle -> seqlevelsStyle

I'm using R version 3.6.3 and ichorCNA v0.2.0. The BAM file contains human DNA aligned to GRCh38 + decoy.

Any idea?

gavinha commented 4 years ago

Hi @ury

This is unexpected, and I wonder if it may be due to the inclusion of some alt contigs/decoys in the WIG file. If this is still a problem for you, can you please print out all the lines in the test.wig file that start with fixed?
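
One way to run that check is sketched below, assuming a POSIX shell with grep, sed, and sort available. The helper names `wig_headers` and `wig_chroms` are hypothetical and not part of ichorCNA or readCounter. The first prints every `fixedStep` declaration line, and the second reduces those lines to the distinct chromosome names so that any unexpected alt or decoy contig stands out.

```shell
# Hypothetical helper: print every fixedStep declaration line in a WIG file.
wig_headers() {
  grep '^fixedStep' "$1"
}

# Hypothetical helper: list the distinct chromosome names declared in those
# lines, so that unexpected contigs (alt, decoy, unplaced) are easy to spot.
wig_chroms() {
  wig_headers "$1" | sed -n 's/.*chrom=\([^ ]*\).*/\1/p' | LC_ALL=C sort -u
}

# Usage, with the file name from the readCounter command above:
#   wig_headers test.wig
#   wig_chroms test.wig
```

If `wig_chroms` reports anything beyond the chromosomes passed to `--chromosome`, that would point toward the alt contig/decoy explanation.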

Thanks, Gavin

ury commented 4 years ago

Hi Gavin,

Unfortunately, the Docker used for my test is long gone and the recipe I used previously to set up the software doesn't work anymore.

I would be happy to try to reproduce the above, if you have any "official" or "semi-official" guide providing step-by-step instructions for getting from a vanilla Ubuntu (bionic/xenial) to a working ichorCNA environment (with any R version you'd recommend). I must say that none of the various instructions I tried to use from around the Web actually worked.

Thanks, Ury

blackbeerd commented 3 years ago

Hi Gavin, I am having the same issue:

Error in seqlevelsStyle(as.character(x)) :
  The style does not have a compatible entry for the species supported by
  Seqname. Please see genomeStyles() for supported species/style
Calls: loadReadCountsFromWig ... setGenomeStyle -> %in% -> seqlevelsStyle -> seqlevelsStyle

I have used this software before with no issues, so I went back to a previous wig file that had worked and received the same error message. I am using a miniconda environment and I'm tempted to reinstall and try again. Maybe before I do that I'll grep my wig file for lines beginning with "fixed", as you suggested to @ury. Do you have any other recommendations?

Thanks, Chris

gavinha commented 3 years ago

Hi @ury @blackbeerd

I think the problem is that you are using a slightly older version of ichorCNA. Can you please install the latest version in the repo (last updated Dec 2019)?

The version you are using calls setGenomeStyle, which is deprecated: https://github.com/broadinstitute/ichorCNA/blob/5bfc03ed854f0e93fe5b624c97c1290fa0053837/R/utils.R#L70-L74

The latest version uses the built-in seqlevelsStyle functions in the GenomeInfoDb package: https://github.com/broadinstitute/ichorCNA/blob/5bfc03ed854f0e93fe5b624c97c1290fa0053837/R/utils.R#L118-L121

Hope this helps. It should also address Issue #82.

Best, Gavin

> x = wigToGRanges("CRPC_554.ctDNA_ULP.bin500000.wig")
Slurping: CRPC_554.ctDNA_ULP.bin500000.wig
Parsing: fixedStep chrom=1 start=1 step=500000 span=500000
Parsing: fixedStep chrom=2 start=1 step=500000 span=500000
Parsing: fixedStep chrom=3 start=1 step=500000 span=500000
Parsing: fixedStep chrom=4 start=1 step=500000 span=500000
Parsing: fixedStep chrom=5 start=1 step=500000 span=500000
Parsing: fixedStep chrom=6 start=1 step=500000 span=500000
Parsing: fixedStep chrom=7 start=1 step=500000 span=500000
Parsing: fixedStep chrom=8 start=1 step=500000 span=500000
Parsing: fixedStep chrom=9 start=1 step=500000 span=500000
Parsing: fixedStep chrom=10 start=1 step=500000 span=500000
Parsing: fixedStep chrom=11 start=1 step=500000 span=500000
Parsing: fixedStep chrom=12 start=1 step=500000 span=500000
Parsing: fixedStep chrom=13 start=1 step=500000 span=500000
Parsing: fixedStep chrom=14 start=1 step=500000 span=500000
Parsing: fixedStep chrom=15 start=1 step=500000 span=500000
Parsing: fixedStep chrom=16 start=1 step=500000 span=500000
Parsing: fixedStep chrom=17 start=1 step=500000 span=500000
Parsing: fixedStep chrom=18 start=1 step=500000 span=500000
Parsing: fixedStep chrom=19 start=1 step=500000 span=500000
Parsing: fixedStep chrom=20 start=1 step=500000 span=500000
Parsing: fixedStep chrom=21 start=1 step=500000 span=500000
Parsing: fixedStep chrom=22 start=1 step=500000 span=500000
Parsing: fixedStep chrom=X start=1 step=500000 span=500000
Parsing: fixedStep chrom=Y start=1 step=500000 span=500000
Sorting by decreasing chromosome size
> seqlevelsStyle(x) <- "UCSC"
> x
GRanges object with 6206 ranges and 1 metadata column:
         seqnames            ranges strand |     value
            <Rle>         <IRanges>  <Rle> | <numeric>
     [1]     chr1          1-500000      * |       120
     [2]     chr1    500001-1000000      * |       617
     [3]     chr1   1000001-1500000      * |       946
     [4]     chr1   1500001-2000000      * |      1006
     [5]     chr1   2000001-2500000      * |       997
     ...      ...               ...    ... .       ...
  [6202]    chr21 46000001-46500000      * |      1155
  [6203]    chr21 46500001-47000000      * |      1142
  [6204]    chr21 47000001-47500000      * |      1218
  [6205]    chr21 47500001-48000000      * |      1044
  [6206]    chr21 48000001-48500000      * |       246
  -------
  seqinfo: 24 sequences from an unspecified genome; no seqlengths
> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS/LAPACK: /app/software/OpenBLAS/0.3.7-GCC-8.3.0/lib/libopenblas_haswellp-r0.3.7.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] ichorCNA_0.3.2       devtools_2.3.2       usethis_1.6.3        GenomicRanges_1.38.0 GenomeInfoDb_1.22.1  IRanges_2.20.2       S4Vectors_0.24.4     BiocGenerics_0.32.0
 [9] HMMcopy_1.28.1       data.table_1.12.8

blackbeerd commented 3 years ago

Thanks, Gavin. Yes, I was able to get an older version of ichorCNA to run with R v3.4 in a miniconda env without the setGenomeStyle issues coming up. I'll try the links you provided for running the latest version in R v4.

Best, Chris