kharchenkolab / numbat

Haplotype-aware CNV analysis from single-cell RNA-seq
https://kharchenkolab.github.io/numbat/

Problem with pileup_and_phase.R on 10X Visium Spatial transcriptomics #95

Closed: mjakobs closed this issue 1 year ago

mjakobs commented 1 year ago

Hi team,

I started following your article on how to run numbat on spatial transcriptomics data but have encountered some issues in the pre-processing stage.

Code run:

ml apps/cellsnp-lite/.0.3.1
ml apps/eagle/.2.4.1
ml apps/samtools/1.15.1
ml apps/R/4.1.0

cd /scratch/wsspaces/gmjakobsdottir-numbat_testing-0/RB15_06_C006750T1PTa/

Rscript /data/tog/gmjakobsdottir/R_libraries/x86_64-pc-linux-gnu-library/4.1/numbat/bin/pileup_and_phase.R \
    --label S06 \
    --samples RB15_06_C006750T1PTa \
    --bams /data/tog/gmjakobsdottir/Jacks_visium_data/RB15_06_C006750T1PTa/outs/possorted_genome_bam.bam \
    --barcodes /data/tog/gmjakobsdottir/Jacks_visium_data/RB15_06_C006750T1PTa/outs/filtered_feature_bc_matrix/barcodes.tsv \
    --gmap /data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --eagle /lmod/apps/eagle/2.4.1/bin/eagle \
    --snpvcf /data/tog/gmjakobsdottir/numbat_references/genome1K.phase3.SNP_AF5e2.chr1toX.hg38.vcf.gz \
    --paneldir /data/tog/gmjakobsdottir/numbat_references/1000G_hg38 \
    --outdir . \
    --ncores 4

Error output:

Epilog:  nodes=1:ppn=4
Epilog:  mem=8gb
Epilog:  neednodes=1:ppn=4

The following have been reloaded with a version change:
  1) compilers/gcc/9.2.0 => compilers/gcc/11.1.0

Warning messages:
1: package 'stringr' was built under R version 4.2.0 
2: package 'numbat' was built under R version 4.2.0 
Using genome version: hg38
[I::main] start time: 2023-01-13 13:49:24
[I::main] loading the VCF file for given SNPs ...
[I::main] fetching 7352497 candidate variants ...
[I::main] mode 1: fetch given SNPs in 1863 single cells.
[W::hts_idx_load3] The index file is older than the data file: /data/tog/gmjakobsdottir/Jacks_visium_data/RB15_06_C006750T1PTa/outs/possorted_genome_bam.bam.bai
[W::hts_idx_load3] The index file is older than the data file: /data/tog/gmjakobsdottir/Jacks_visium_data/RB15_06_C006750T1PTa/outs/possorted_genome_bam.bam.bai
[W::hts_idx_load3] The index file is older than the data file: /data/tog/gmjakobsdottir/Jacks_visium_data/RB15_06_C006750T1PTa/outs/possorted_genome_bam.bam.bai
[W::hts_idx_load3] The index file is older than the data file: /data/tog/gmjakobsdottir/Jacks_visium_data/RB15_06_C006750T1PTa/outs/possorted_genome_bam.bam.bai
[I::pileup_positions_with_fetch][Thread-2] 2.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 2.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 4.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 2.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 4.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 4.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 6.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 6.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 6.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 8.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 8.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 2.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 10.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 8.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 10.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 12.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 12.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 14.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 10.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 14.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 16.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 4.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 16.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 18.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 18.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 20.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 20.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 6.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 12.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 22.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 24.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 22.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 8.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 10.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 14.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 26.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 24.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 12.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 26.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 14.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 28.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 16.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 30.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 16.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 28.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 32.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 18.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 34.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 36.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 30.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 18.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 20.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 32.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 20.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 34.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 38.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 40.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 22.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 36.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 22.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 38.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 24.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 24.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 40.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 42.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 26.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 42.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 44.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 26.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 28.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 46.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 44.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 30.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 48.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 28.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 46.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 32.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 34.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 30.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 50.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 36.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 32.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 52.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 38.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 54.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 34.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 56.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 40.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 48.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 36.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 42.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 38.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 50.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 44.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 40.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 58.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 46.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 60.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 48.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 62.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 52.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 50.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 64.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 42.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 54.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 52.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 56.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 54.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 44.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 58.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 66.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 56.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 68.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 60.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 58.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 62.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 46.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 60.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 64.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 66.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 70.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 62.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 72.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 68.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 74.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 64.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 66.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 48.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 70.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 68.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 50.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 52.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 76.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 54.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 72.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 78.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 70.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 56.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 80.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 74.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 72.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 74.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 82.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 76.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 58.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 78.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 76.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 78.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 84.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 80.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 80.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 82.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 60.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 84.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 86.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 82.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 88.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 86.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 90.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 84.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 62.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 88.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 86.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 90.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 88.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 92.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 94.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 64.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 92.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 90.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 96.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 94.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 96.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 92.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 94.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-2] 98.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 96.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-1] 98.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-0] 98.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 66.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 68.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 70.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 72.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 74.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 76.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 78.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 80.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 82.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 84.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 86.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 88.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 90.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 92.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 94.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 96.00% SNPs processed.
[I::pileup_positions_with_fetch][Thread-3] 98.00% SNPs processed.
[I::main] All Done!
[I::main] end time: 2023-01-13 14:40:44
[I::main] time spent: 3080 seconds.
No SNPs left for chr2!
No SNPs left for chr4!
No SNPs left for chr5!
No SNPs left for chr12!
No SNPs left for chr13!
No SNPs left for chr18!
No SNPs left for chr20!
              --> WARNING: Check REF/ALT agreement between target and ref? <--
WARNING: Sample 1 (1-indexed) has a het count of 1
ERROR: Unable to open file: ./phasing/S06_chr2.vcf.gz

ERROR: Target and ref have too few matching SNPs (M = 1)
ERROR: Unable to open file: ./phasing/S06_chr4.vcf.gz
ERROR: Unable to open file: ./phasing/S06_chr5.vcf.gz
              --> WARNING: Check REF/ALT agreement between target and ref? <--

ERROR: Target and ref have too few matching SNPs (M = 1)
              --> WARNING: Check REF/ALT agreement between target and ref? <--
              --> WARNING: Check REF/ALT agreement between target and ref? <--

ERROR: Target and ref have too few matching SNPs (M = 1)
              --> WARNING: Check REF/ALT agreement between target and ref? <--

ERROR: Target and ref have too few matching SNPs (M = 0)

ERROR: Target and ref have too few matching SNPs (M = 1)
              --> WARNING: Check REF/ALT agreement between target and ref? <--
WARNING: Sample 1 (1-indexed) has a het count of 1
ERROR: Unable to open file: ./phasing/S06_chr12.vcf.gz
ERROR: Unable to open file: ./phasing/S06_chr13.vcf.gz
              --> WARNING: Check REF/ALT agreement between target and ref? <--

ERROR: Target and ref have too few matching SNPs (M = 1)
              --> WARNING: Check REF/ALT agreement between target and ref? <--

ERROR: Target and ref have too few matching SNPs (M = 1)
              --> WARNING: Check REF/ALT agreement between target and ref? <--

ERROR: Target and ref have too few matching SNPs (M = 0)
              --> WARNING: Check REF/ALT agreement between target and ref? <--

ERROR: Target and ref have too few matching SNPs (M = 1)
ERROR: Unable to open file: ./phasing/S06_chr18.vcf.gz
              --> WARNING: Check REF/ALT agreement between target and ref? <--
ERROR: Unable to open file: ./phasing/S06_chr20.vcf.gz
              --> WARNING: Check REF/ALT agreement between target and ref? <--

ERROR: Target and ref have too few matching SNPs (M = 0)
              --> WARNING: Check REF/ALT agreement between target and ref? <--

ERROR: Target and ref have too few matching SNPs (M = 0)
Error in FUN(X[[i]], ...) : Phased VCF not found
Calls: %>% -> mutate -> Reduce -> lapply -> FUN
Execution halted
Epilog:  nodes=1:ppn=4
Epilog:  mem=8gb
Epilog:  neednodes=1:ppn=4
Epilog:  cput=02:32:47
Epilog:  vmem=3490028kb
Epilog:  walltime=01:00:20
Epilog:  mem=920312kb
Epilog:  energy_used=0

Based on the checks described in this comment, the script seems to run correctly for most chromosomes:

  1. The files in the output folder /pileup are not empty, and cellSNP.base.vcf is not empty.
  2. There are non-empty {sample}_chr*.vcf.gz files under /phasing, but the files for the chromosomes listed in the error message (2, 4, 5, 12, 13, 18, 20) are missing.
  3. There are non-empty {sample}_chr*.phased.vcf.gz files under /phasing, but the files for the chromosomes listed in the error message are missing as well.
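
For anyone repeating these checks, the missing chromosomes can be listed with a short shell helper. This is only a sketch: `missing_phased` is a hypothetical function, not part of numbat, and it assumes the `<label>_chr<N>.phased.vcf.gz` naming seen in the log above and that only chromosomes 1 to 22 are expected.

```shell
# Print every autosome whose phased VCF is missing or empty.
# Assumption: files are named <dir>/<label>_chr<N>.phased.vcf.gz, N = 1..22.
missing_phased() {
    dir=$1
    label=$2
    for chr in $(seq 1 22); do
        f="$dir/${label}_chr${chr}.phased.vcf.gz"
        # -s is true only if the file exists and has a size greater than zero
        [ -s "$f" ] || echo "chr${chr}"
    done
}
```

Calling `missing_phased ./phasing S06` from the output directory prints one chromosome name per line for each file that is absent or empty.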

I have tried this on a second sample from the same lot and received a similar error affecting chromosomes 2, 4, 5, 9, 10, 11, 12, 13, and 18.

Do you have any suggestions for how to proceed?

Thank you for your help! Maria

teng-gao commented 1 year ago

Hello,

What kind of Visium data is this (fresh frozen or FFPE probe-based chemistry)? How many SNPs do you have in cellSNP.base.vcf and in the chr-specific VCF files under /phasing? For the chromosomes where phasing worked, what was the phasing confidence (phasing.log)?
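
One possible way to collect these numbers is sketched below. Both helpers are hypothetical names, not numbat commands. `count_vcf_records` assumes that every non-header line of a VCF is one SNP record, and `phasing_overlap` assumes the Eagle log layout in which a `--vcfTarget` line precedes the matching `SNPs to analyze: M = ...` line.

```shell
# Count variant records (lines not starting with '#') in a plain or
# gzip/bgzip-compressed VCF.
count_vcf_records() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | grep -vc '^#' || true   # grep exits 1 on a count of 0; still prints 0
}

# Print "<target VCF><TAB><M>" for each Eagle run in a phasing log that
# reached the overlap report. M is the number of SNPs found in both the
# target and the reference panel.
phasing_overlap() {
    awk '/--vcfTarget/ { target = $2 }
         /^SNPs to analyze: M =/ { print target "\t" $6 }' "$1"
}
```

For example, `count_vcf_records` can be run on cellSNP.base.vcf and on each chr-specific file under /phasing, and `phasing_overlap phasing.log` gives the per-chromosome overlap. Runs that stopped before the overlap report produce no line.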

mjakobs commented 1 year ago

Hi Teng,

Thanks for following up on this. The data is based on FFPE, unfortunately.

cellSNP.base.vcf has 1670 SNPs, with all chromosomes represented.

The numbers of SNPs in the chr-specific VCF files are as follows:

The contents of the phasing.log file are:

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr1.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr1.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr1.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 0 SNPs in both target and reference

SNPs ignored: 1 SNPs in target but not reference
              5795045 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: -nan

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr2.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr2.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr2.phased 

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr3.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr3.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr3.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 2 SNPs in both target and reference

SNPs ignored: 0 SNPs in target but not reference
              5280534 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: 0

Filling in genetic map coordinates using reference file:
  /data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz
Physical distance range: 89280766 base pairs
Genetic distance range:  84.7166 cM
Average # SNPs per cM:   0
Number of <=(64-SNP, 1cM) segments: 1
Average # SNPs per segment: 2

Time for reading input: 62.1244 sec

Fraction of heterozygous genotypes: 0.5
Typical span of default 100-het history length: 8471.66 cM
Setting --histFactor=1.00

Auto-selecting number of phasing iterations: setting --pbwtIters to 1

BEGINNING PHASING

PHASING ITER 1 OF 1

Phasing target samples
................................................................................
Time for phasing iter 1: 0.00697017
Writing vcf.gz output to ./phasing/S05_chr3.phased.vcf.gz
Time for writing output: 0.0200469
Total elapsed time for analysis = 62.1521 sec

Mean phase confidence of each target individual:
ID  PHASE_CONFIDENCE
S05 -nan
                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr4.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr4.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr4.phased 

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr5.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr5.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr5.phased 

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr6.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr6.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr6.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 1 SNPs in both target and reference

SNPs ignored: 0 SNPs in target but not reference
              4539754 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: 0

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr7.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr7.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr7.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 1 SNPs in both target and reference

SNPs ignored: 3 SNPs in target but not reference
              4222929 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: 0

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr8.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr8.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr8.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 1 SNPs in both target and reference

SNPs ignored: 1 SNPs in target but not reference
              4162375 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: 0

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr9.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr9.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr9.phased 

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr10.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr10.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr10.phased 

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr11.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr11.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr11.phased 

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr12.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr12.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr12.phased 

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr13.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr13.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr13.phased 

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr14.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr14.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr14.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 0 SNPs in both target and reference

SNPs ignored: 4 SNPs in target but not reference
              2383125 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: -nan

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr15.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr15.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr15.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 1 SNPs in both target and reference

SNPs ignored: 1 SNPs in target but not reference
              2153932 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: 0

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr16.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr16.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr16.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 0 SNPs in both target and reference

SNPs ignored: 1 SNPs in target but not reference
              2410531 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: -nan

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr17.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr17.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr17.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 0 SNPs in both target and reference

SNPs ignored: 2 SNPs in target but not reference
              2066683 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: -nan

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr18.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr18.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr18.phased 

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr19.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr19.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr19.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 0 SNPs in both target and reference

SNPs ignored: 2 SNPs in target but not reference
              1625698 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: -nan

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr20.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr20.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr20.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 1 SNPs in both target and reference

SNPs ignored: 0 SNPs in target but not reference
              1706441 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: 0

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr21.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr21.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr21.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 0 SNPs in both target and reference

SNPs ignored: 1 SNPs in target but not reference
              976599 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: -nan

                      +-----------------------------+
                      |                             |
                      |   Eagle v2.4.1              |
                      |   November 18, 2018         |
                      |   Po-Ru Loh                 |
                      |                             |
                      +-----------------------------+

Copyright (C) 2015-2018 Harvard University.
Distributed under the GNU GPLv3+ open source license.

Command line options:

/lmod/apps/eagle/2.4.1/bin/eagle \
    --numThreads 4 \
    --vcfTarget ./phasing/S05_chr22.vcf.gz \
    --vcfRef /data/tog/gmjakobsdottir/numbat_references/1000G_hg38/chr22.genotypes.bcf \
    --geneticMapFile=/data/tog/gmjakobsdottir/numbat_references/genetic_map_hg38_withX.txt.gz \
    --outPrefix ./phasing/S05_chr22.phased 

Setting number of threads to 4

Reference samples: Nref = 2548
Target samples: Ntarget = 1
SNPs to analyze: M = 0 SNPs in both target and reference

SNPs ignored: 1 SNPs in target but not reference
              993881 SNPs in reference but not target
              0 multi-allelic SNPs in target

Missing rate in target genotypes: -nan

Thanks for taking the time to look into this! Maria

teng-gao commented 1 year ago

Yeah, unfortunately Visium for FFPE is probe-based, so it doesn't actually sequence the transcripts (and no SNP information is captured).
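For anyone hitting the same thing, a quick way to confirm it from the phasing log is to pull out the `SNPs to analyze: M = ...` lines that Eagle prints and see whether the overlap with the reference panel is essentially zero, as it is above. This is only a rough sketch and not part of numbat: the function name `overlap_counts` and the sample text are made up for illustration, and because the per-chromosome runs appear interleaved in the log, it does not try to attribute each count to a chromosome.

```python
import re

# Matches the overlap line that Eagle prints, e.g.
# "SNPs to analyze: M = 0 SNPs in both target and reference"
OVERLAP_RE = re.compile(r"SNPs to analyze: M = (\d+)")


def overlap_counts(log_text):
    """Return every 'M = N' overlap count found in the log, in order of appearance.

    Hypothetical helper for illustration only; not a numbat or Eagle function.
    """
    return [int(m.group(1)) for m in OVERLAP_RE.finditer(log_text)]


# Two lines copied from the log in this issue.
sample = """\
SNPs to analyze: M = 0 SNPs in both target and reference
SNPs to analyze: M = 1 SNPs in both target and reference
"""

counts = overlap_counts(sample)
print("runs reporting an overlap:", len(counts))
print("total SNPs shared with the reference panel:", sum(counts))
```

If the total stays near zero across all chromosomes, the pileup found almost no genotyped SNPs, which is what you would expect from probe-based data.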

mjakobs commented 1 year ago

Ah, of course. Thank you for your time!