marbl / CHM13

The complete sequence of a human genome

HG38 to CHM13v2 VCF liftover using chain #61

Closed jamesdalg closed 1 year ago

jamesdalg commented 2 years ago

I'm having some issues lifting over a VCF using the provided chain files. GATK complains that it can't find the reference sequence, and I'm not sure why. If you know of a workaround, let me know. I'm using GATK version 4.2.4.1, Java 17.0.2, and Picard 2.26.9.

(base) [dalgleishjl@cn0848 ~]$ ls -l /data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/chm13v2.0.fa
-rw-r-----. 1 dalgleishjl CCRBioinfo 26337875 May 10 12:09 /data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/chm13v2.0.fa

(base) [dalgleishjl@cn0848 ~]$  java -jar $PICARD_JAR LiftoverVcf      I=/fdb/GATK_resource_bundle/hg38/Homo_sapiens_assembly38.known_indels.vcf.gz      O=/data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/CHM13v2.0.known_indels.vcf CHAIN=/data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/hg38-chm13v2.over.chain      R=/data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/chm13v2.0.fa      REJECT=/data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/CHM13v2.0.known_indels.NOT_LIFTED_OVER.vcf;
INFO    2022-05-19 18:59:59     LiftoverVcf

********** NOTE: Picard's command line syntax is changing.
**********
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
**********    LiftoverVcf -I /fdb/GATK_resource_bundle/hg38/Homo_sapiens_assembly38.known_indels.vcf.gz -O /data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/CHM13v2.0.known_indels.vcf -CHAIN /data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/hg38-chm13v2.over.chain -R /data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/chm13v2.0.fa -REJECT /data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/CHM13v2.0.known_indels.NOT_LIFTED_OVER.vcf
**********

19:00:00.005 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/apps/picard/2.26.9/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Thu May 19 19:00:00 EDT 2022] LiftoverVcf INPUT=/fdb/GATK_resource_bundle/hg38/Homo_sapiens_assembly38.known_indels.vcf.gz OUTPUT=/data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/CHM13v2.0.known_indels.vcf CHAIN=/data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/hg38-chm13v2.over.chain REJECT=/data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/CHM13v2.0.known_indels.NOT_LIFTED_OVER.vcf REFERENCE_SEQUENCE=/data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/chm13v2.0.fa    WARN_ON_MISSING_CONTIG=false LOG_FAILED_INTERVALS=true WRITE_ORIGINAL_POSITION=false WRITE_ORIGINAL_ALLELES=false LIFTOVER_MIN_MATCH=1.0 ALLOW_MISSING_FIELDS_IN_HEADER=false RECOVER_SWAPPED_REF_ALT=false TAGS_TO_REVERSE=[AF] TAGS_TO_DROP=[MAX_AF] DISABLE_SORT=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Thu May 19 19:00:00 EDT 2022] Executing as dalgleishjl@cn0848 on Linux 3.10.0-862.14.4.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 17.0.2+8-LTS-86; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.26.9
INFO    2022-05-19 19:00:01     LiftoverVcf     Loading up the target reference genome.
[Thu May 19 19:00:02 EDT 2022] picard.vcf.LiftoverVcf done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=754974720
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMException: Reference sequence (1) not found in /data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/chm13v2.0.fa
        at htsjdk.samtools.reference.ReferenceSequenceFileWalker.get(ReferenceSequenceFileWalker.java:104)
        at picard.vcf.LiftoverVcf.doWork(LiftoverVcf.java:334)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:308)
        at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
        at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
(base) [dalgleishjl@cn0848 ~]$ ls -l /data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/chm13v2.0.fa
-rw-r-----. 1 dalgleishjl CCRBioinfo 26337875 May 10 12:09 /data/CCRBioinfo/dalgleishjl/sv_mapping/chm13_ref/chm13v2.0.fa

jamesdalg commented 2 years ago

Also, if there is an alternative set of tools that will do the same job, I'm open to that as well.

diekhans commented 2 years ago

You might try CrossMap: http://crossmap.sourceforge.net/

I haven't used it, but I have heard good things.
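
For anyone trying this route, here is a minimal sketch of calling CrossMap's `vcf` subcommand from Python. The argument order (chain file, input VCF, target reference FASTA, output VCF) follows the CrossMap documentation. The executable name `CrossMap.py` is an assumption (some installs expose it as `CrossMap`), and the helper names are hypothetical.

```python
import subprocess


def crossmap_vcf_command(chain, in_vcf, target_fasta, out_vcf, exe="CrossMap.py"):
    """Build the argument list for CrossMap's `vcf` subcommand.

    Nothing is executed here. `exe` is an assumption about how CrossMap
    is installed; adjust it to match your environment.
    """
    return [exe, "vcf", str(chain), str(in_vcf), str(target_fasta), str(out_vcf)]


def run_crossmap_vcf(chain, in_vcf, target_fasta, out_vcf, exe="CrossMap.py"):
    """Run CrossMap and raise CalledProcessError if it exits non-zero."""
    cmd = crossmap_vcf_command(chain, in_vcf, target_fasta, out_vcf, exe=exe)
    subprocess.run(cmd, check=True)
```

With the files from this issue, the call would be along the lines of `run_crossmap_vcf("hg38-chm13v2.over.chain", "Homo_sapiens_assembly38.known_indels.vcf.gz", "chm13v2.0.fa", "CHM13v2.0.known_indels.vcf")`.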

Manoswini-02 commented 1 year ago

@jamesdalg have you checked the chromosome notation in all of the given files? The names have to be the same for proper mapping. It looks like one of your input files might use numerals as chromosome names. I tried running the same command, and in my case it ran without any error, but nothing was lifted over and the reject file is huge.
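
One way to check the notation across all three inputs is sketched below in Python. It assumes the UCSC chain format, in which each header line reads `chain score tName tSize tStrand tStart tEnd qName ...`, with `tName` naming the source assembly (hg38 here) and `qName` the destination (CHM13v2). The function names are hypothetical, and the whole VCF and FASTA are scanned, so this can take a while on full-size files.

```python
import gzip


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def vcf_contigs(path):
    """Collect the distinct CHROM values from the records of a VCF."""
    names = set()
    with _open_text(path) as fh:
        for line in fh:
            if line.strip() and not line.startswith("#"):
                names.add(line.split(None, 1)[0])
    return names


def fasta_contigs(path):
    """Collect sequence names (first token after '>') from a FASTA."""
    names = set()
    with _open_text(path) as fh:
        for line in fh:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def chain_contigs(path):
    """Return (source names, destination names) from chain header lines."""
    source, dest = set(), set()
    with _open_text(path) as fh:
        for line in fh:
            if line.startswith("chain "):
                fields = line.split()
                source.add(fields[2])  # tName: assembly being lifted from
                dest.add(fields[7])    # qName: assembly being lifted to
    return source, dest


def check_liftover_inputs(vcf, chain, fasta):
    """Report contig names that cannot be matched between the inputs."""
    source, dest = chain_contigs(chain)
    return {
        "vcf_not_in_chain": sorted(vcf_contigs(vcf) - source),
        "chain_not_in_fasta": sorted(dest - fasta_contigs(fasta)),
    }
```

A non-empty `chain_not_in_fasta` list means the chain points at sequence names that the target FASTA does not contain, which would be consistent with an error such as `Reference sequence (1) not found`. A non-empty `vcf_not_in_chain` list means those VCF records have no chain to follow and would end up in the reject file.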

jamesdalg commented 1 year ago

@Manoswini-02 Maybe you have a different version of that file. I don't see that issue with the file I have on our HPC cluster. See below:

-bash-4.2$ gzcat /fdb/GATK_resource_bundle/hg38/Homo_sapiens_assembly38.known_indels.vcf.gz | head -n 1000000 | tail -n 100
chr8    110098448   1369907 C   CGT .   PASS    set=variant2
chr8    110098724   .   ATAT    A   131.63  PASS    set=variant2
chr8    110100058   rs112172286 ATTAT   A   12833.07    PASS    AC=198;AF=0.09;AFR_AF=0.28;AMR_AF=0.04;AN=2184;ASN_AF=0.07;AVGPOST=0.9966;ERATE=0.0005;EUR_AF=0.01;LDAF=0.0900;RSQ=0.9867;THETA=0.0003;VT=INDEL;set=Intersection
chr8    110101681   .   CAAAT   C   168 PASS    AC=20;AF=0.01;AFR_AF=0.04;AMR_AF=0.0028;AN=2184;AVGPOST=0.9978;ERATE=0.0003;LDAF=0.0098;RSQ=0.9192;THETA=0.0002;VT=INDEL;set=variant
chr8    110102567   .   AG  A   21  PASS    AC=83;AF=0.04;AFR_AF=0.09;AMR_AF=0.02;AN=2184;ASN_AF=0.03;AVGPOST=0.9334;ERATE=0.0096;EUR_AF=0.02;LDAF=0.0574;RSQ=0.5363;THETA=0.0003;VT=INDEL;set=variant
chr8    110102568   rs67482162,661578   GT  G   19273.73    PASS    AC=851;AF=0.39;AFR_AF=0.52;AMR_AF=0.31;AN=2184;ASN_AF=0.37;AVGPOST=0.9823;ERATE=0.0032;EUR_AF=0.36;LDAF=0.3907;RSQ=0.9727;THETA=0.0003;VT=INDEL;set=Intersection
chr8    110105330   .   A   AG  435 PASS    AC=893;AF=0.41;AFR_AF=0.63;AMR_AF=0.33;AN=2184;ASN_AF=0.36;AVGPOST=0.9550;ERATE=0.0070;EUR_AF=0.34;LDAF=0.4066;RSQ=0.9346;THETA=0.0005;VT=INDEL;set=variant
chr8    110106700   rs36082058  AC  A   8238.66 PASS    AC=223;AF=0.10;AFR_AF=0.13;AMR_AF=0.07;AN=2184;ASN_AF=0.07;AVGPOST=0.9939;ERATE=0.0004;EUR_AF=0.12;LDAF=0.1036;RSQ=0.9766;THETA=0.0003;VT=INDEL;set=Intersection
chr8    110107119   .   T   TA  7026.05 PASS    AC=218;AF=0.10;AFR_AF=0.13;AMR_AF=0.07;AN=2184;ASN_AF=0.07;AVGPOST=0.9966;ERATE=0.0006;EUR_AF=0.12;LDAF=0.1010;RSQ=0.9859;THETA=0.0003;VT=INDEL;set=Intersection
chr8    110108673   1128681,rs35750301  GATAA   G   8771.01 PASS    AC=222;AF=0.10;AFR_AF=0.13;AMR_AF=0.07;AN=2184;ASN_AF=0.08;AVGPOST=0.9945;ERATE=0.0005;EUR_AF=0.12;LDAF=0.1020;RSQ=0.9774;THETA=0.0005;VT=INDEL;set=Intersection
chr8    110114056   rs35662462  GA  G   6866.94 PASS    AC=218;AF=0.10;AFR_AF=0.13;AMR_AF=0.07;AN=2184;ASN_AF=0.07;AVGPOST=0.9973;ERATE=0.0005;EUR_AF=0.12;LDAF=0.1003;RSQ=0.9875;THETA=0.0005;VT=INDEL;set=Intersection
chr8    110114070   .   GA  G   252 PASS    AC=19;AF=0.01;AFR_AF=0.03;AMR_AF=0.01;AN=2184;ASN_AF=0.0035;AVGPOST=0.9968;ERATE=0.0007;LDAF=0.0090;RSQ=0.8543;THETA=0.0003;VT=INDEL;set=variant
chr8    110117987   .   CATA    C   19.76   PASS    set=variant2
chr8    110118318   .   AATAT   A   222 PASS    AC=17;AF=0.01;AMR_AF=0.02;AN=2184;AVGPOST=0.9952;ERATE=0.0006;EUR_AF=0.01;LDAF=0.0096;RSQ=0.8156;THETA=0.0005;VT=INDEL;set=variant
chr8    110118687   rs34199770  G   GTCT    7136.95 PASS    AC=66;AF=0.03;AFR_AF=0.01;AMR_AF=0.02;AN=2184;AVGPOST=0.9937;ERATE=0.0004;EUR_AF=0.07;LDAF=0.0317;RSQ=0.9279;THETA=0.0004;VT=INDEL;set=Intersection
chr8    110120839   .   A   AC  740.32  PASS    AC=40;AF=0.02;AFR_AF=0.04;AMR_AF=0.02;AN=2184;ASN_AF=0.0035;AVGPOST=0.9720;ERATE=0.0019;EUR_AF=0.01;LDAF=0.0285;RSQ=0.6255;THETA=0.0004;VT=INDEL;set=Intersection
chr8    110125952   .   AT  A   8   PASS    AC=36;AF=0.02;AFR_AF=0.01;AMR_AF=0.02;AN=2184;ASN_AF=0.03;AVGPOST=0.9574;ERATE=0.0088;EUR_AF=0.01;LDAF=0.0321;RSQ=0.4829;THETA=0.0002;VT=INDEL;set=variant
chr8    110127029   1227906,rs5893975   AT  A   17832.88    PASS    AC=911;AF=0.42;AFR_AF=0.34;AMR_AF=0.40;AN=2184;ASN_AF=0.65;AVGPOST=0.9785;ERATE=0.0025;EUR_AF=0.29;LDAF=0.4168;RSQ=0.9673;THETA=0.0001;VT=INDEL;set=Intersection
chr8    110127566   rs112898888 T   TC  16971.33    PASS    AC=292;AF=0.13;AFR_AF=0.33;AMR_AF=0.15;AN=2184;ASN_AF=0.10;AVGPOST=0.9951;ERATE=0.0010;EUR_AF=0.02;LDAF=0.1356;RSQ=0.9852;THETA=0.0002;VT=INDEL;set=Intersection
chr8    110132120   1274441 C   CT  .   PASS    set=variant2
chr8    110132971   .   TA  T   454 PASS    AC=28;AF=0.01;AFR_AF=0.05;AMR_AF=0.0028;AN=2184;ASN_AF=0.0035;AVGPOST=0.9984;ERATE=0.0004;LDAF=0.0134;RSQ=0.9507;THETA=0.0003;VT=INDEL;set=variant
chr8    110133722   1193121,rs35672083  CAGGA   C   29815.27    PASS    AC=699;AF=0.32;AFR_AF=0.27;AMR_AF=0.30;AN=2184;ASN_AF=0.45;AVGPOST=0.9880;ERATE=0.0021;EUR_AF=0.26;LDAF=0.3223;RSQ=0.9777;THETA=0.0002;VT=INDEL;set=Intersection
chr8    110134760   .   C   CA  444.27  PASS    AC=13;AF=0.01;AFR_AF=0.02;AN=2184;AVGPOST=0.9926;ERATE=0.0018;EUR_AF=0.0026;LDAF=0.0085;RSQ=0.6369;THETA=0.0003;VT=INDEL;set=Intersection
chr8    110138012   .   GTA G   201 PASS    AC=152;AF=0.07;AFR_AF=0.23;AMR_AF=0.02;AN=2184;ASN_AF=0.02;AVGPOST=0.9043;ERATE=0.0129;EUR_AF=0.03;LDAF=0.1054;RSQ=0.6186;THETA=0.0006;VT=INDEL;set=variant
chr8    110138036   .   GTA G   250 PASS    AC=159;AF=0.07;AFR_AF=0.07;AMR_AF=0.14;AN=2184;ASN_AF=0.11;AVGPOST=0.9757;ERATE=0.0015;EUR_AF=0.01;LDAF=0.0793;RSQ=0.8760;THETA=0.0002;VT=INDEL;set=variant
chr8    110140524   .   GA  G   88  PASS    AC=19;AF=0.01;AFR_AF=0.04;AMR_AF=0.0028;AN=2184;AVGPOST=0.9980;ERATE=0.0005;LDAF=0.0097;RSQ=0.9180;THETA=0.0009;VT=INDEL;set=variant
chr8    110144443   .   TTGAG   T   367 PASS    AC=20;AF=0.01;AFR_AF=0.0041;AMR_AF=0.04;AN=2184;ASN_AF=0.01;AVGPOST=0.9954;ERATE=0.0013;LDAF=0.0110;RSQ=0.8121;THETA=0.0007;VT=INDEL;set=variant
chr8    110144828   .   CAGATAAT    C   1801.08 PASS    AC=17;AF=0.01;AFR_AF=0.03;AN=2184;AVGPOST=0.9990;ERATE=0.0003;LDAF=0.0082;RSQ=0.9469;THETA=0.0006;VT=INDEL;set=Intersection
chr8    110147656   .   T   TC  269 PASS    AC=18;AF=0.01;AFR_AF=0.04;AN=2184;AVGPOST=0.9994;ERATE=0.0004;LDAF=0.0085;RSQ=0.9657;THETA=0.0005;VT=INDEL;set=variant
chr8    110148751   rs34297710,1127151  C   CTG 61139.92    PASS    AC=1104;AF=0.51;AFR_AF=0.81;AMR_AF=0.48;AN=2184;ASN_AF=0.55;AVGPOST=0.9836;ERATE=0.0003;EUR_AF=0.29;LDAF=0.5052;RSQ=0.9796;THETA=0.0005;VT=INDEL;set=Intersection
chr8    110148752   .   T   TGG 490 PASS    AC=1099;AF=0.50;AFR_AF=0.80;AMR_AF=0.48;AN=2184;ASN_AF=0.55;AVGPOST=0.9768;ERATE=0.0010;EUR_AF=0.28;LDAF=0.5009;RSQ=0.9710;THETA=0.0005;VT=INDEL;set=variant
chr8    110148753   .   G   GT  161 PASS    AC=1034;AF=0.47;AFR_AF=0.72;AMR_AF=0.47;AN=2184;ASN_AF=0.53;AVGPOST=0.9204;ERATE=0.0170;EUR_AF=0.27;LDAF=0.4590;RSQ=0.8924;THETA=0.0005;VT=INDEL;set=variant
chr8    110152972   .   A   AAGC    474 PASS    AC=90;AF=0.04;AFR_AF=0.17;AMR_AF=0.0028;AN=2184;AVGPOST=0.9910;ERATE=0.0006;EUR_AF=0.0040;LDAF=0.0443;RSQ=0.9203;THETA=0.0004;VT=INDEL;set=variant
chr8    110155683   rs35101014  AT  A   2563.64 PASS    AC=61;AF=0.03;AFR_AF=0.11;AMR_AF=0.01;AN=2184;AVGPOST=0.9970;ERATE=0.0004;LDAF=0.0293;RSQ=0.9660;THETA=0.0003;VT=INDEL;set=Intersection
chr8    110159346   1597651,rs34432806  AT  A   12911.12    PASS    AC=788;AF=0.36;AFR_AF=0.41;AMR_AF=0.31;AN=2184;ASN_AF=0.46;AVGPOST=0.9879;ERATE=0.0020;EUR_AF=0.28;LDAF=0.3605;RSQ=0.9794;THETA=0.0011;VT=INDEL;set=Intersection
chr8    110159715   .   A   AT  158.66  PASS    set=variant2
chr8    110162734   .   CAT C   222.73  PASS    set=variant2
chr8    110162988   .   AAC A   251 PASS    AC=29;AF=0.01;AFR_AF=0.0020;AMR_AF=0.02;AN=2184;AVGPOST=0.9829;ERATE=0.0006;EUR_AF=0.03;LDAF=0.0178;RSQ=0.6422;THETA=0.0004;VT=INDEL;set=variant
chr8    110164145   .   TATTTATTC   T   311 PASS    AC=415;AF=0.19;AFR_AF=0.43;AMR_AF=0.18;AN=2184;ASN_AF=0.14;AVGPOST=0.8112;ERATE=0.0452;EUR_AF=0.08;LDAF=0.2412;RSQ=0.5769;THETA=0.0003;VT=INDEL;set=variant
chr8    110164149   .   TATTC   T   430 PASS    AC=434;AF=0.20;AFR_AF=0.47;AMR_AF=0.21;AN=2184;ASN_AF=0.11;AVGPOST=0.8430;ERATE=0.0217;EUR_AF=0.08;LDAF=0.2349;RSQ=0.6815;THETA=0.0007;VT=INDEL;set=variant
chr8    110174424   rs113388839 TC  T   2738.96 PASS    AC=86;AF=0.04;AFR_AF=0.10;AMR_AF=0.02;AN=2184;ASN_AF=0.01;AVGPOST=0.9966;ERATE=0.0007;EUR_AF=0.03;LDAF=0.0402;RSQ=0.9626;THETA=0.0004;VT=INDEL;set=Intersection
chr8    110177292   .   A   AT  85.23   PASS    set=variant2
chr8    110177965   rs35012924  A   AT  9907.11 PASS    AC=309;AF=0.14;AFR_AF=0.07;AMR_AF=0.21;AN=2184;ASN_AF=0.10;AVGPOST=0.9177;ERATE=0.0172;EUR_AF=0.19;LDAF=0.1642;RSQ=0.7571;THETA=0.0041;VT=INDEL;set=Intersection
chr8    110182294   .   ATG A   245 PASS    AC=44;AF=0.02;AFR_AF=0.09;AMR_AF=0.0028;AN=2184;AVGPOST=0.9966;ERATE=0.0006;LDAF=0.0212;RSQ=0.9307;THETA=0.0010;VT=INDEL;set=variant
chr8    110184025   .   G   GT  81  PASS    AC=18;AF=0.01;AFR_AF=0.01;AMR_AF=0.01;AN=2184;ASN_AF=0.01;AVGPOST=0.9794;ERATE=0.0047;EUR_AF=0.01;LDAF=0.0171;RSQ=0.4794;THETA=0.0003;VT=INDEL;set=variant
chr8    110192203   .   G   GA  686.48  PASS    AC=22;AF=0.01;AFR_AF=0.04;AN=2184;ASN_AF=0.0017;AVGPOST=0.9949;ERATE=0.0018;LDAF=0.0123;RSQ=0.8337;THETA=0.0009;VT=INDEL;set=Intersection
chr8    110194169   1142787 GTA G   .   PASS    set=variant2
chr8    110197198   .   AACTT   A   309.79  PASS    set=variant2
chr8    110198229   647641  GA  G   .   PASS    set=variant2
chr8    110201861   .   G   GT  271 PASS    set=variant2
chr8    110202147   .   A   AC  528.04  PASS    set=variant2
chr8    110203490   1078990 CAT C   .   PASS    set=variant2
chr8    110204558   rs72441511  CCAAA   C   61707.21    PASS    AC=1161;AF=0.53;AFR_AF=0.30;AMR_AF=0.48;AN=2184;ASN_AF=0.63;AVGPOST=0.9847;ERATE=0.0040;EUR_AF=0.64;LDAF=0.5287;RSQ=0.9772;THETA=0.0003;VT=INDEL;set=Intersection
chr8    110206791   .   G   GTGCCCAACATC    9980.74 PASS    AC=48;AF=0.02;AFR_AF=0.09;AMR_AF=0.01;AN=2184;AVGPOST=0.9986;ERATE=0.0003;LDAF=0.0217;RSQ=0.9723;THETA=0.0002;VT=INDEL;set=Intersection
chr8    110209216   .   TTTAG   T   816.15  PASS    set=variant2
chr8    110209773   rs58189285  GTATAAACTCATTCT G   2429.78 PASS    AC=28;AF=0.01;AFR_AF=0.05;AMR_AF=0.0028;AN=2184;AVGPOST=0.9940;ERATE=0.0004;LDAF=0.0135;RSQ=0.8381;THETA=0.0008;VT=INDEL;set=Intersection
chr8    110213818   rs112811473 G   GT  20491.12    PASS    AC=405;AF=0.19;AFR_AF=0.40;AMR_AF=0.11;AN=2184;ASN_AF=0.21;AVGPOST=0.9949;ERATE=0.0006;EUR_AF=0.07;LDAF=0.1853;RSQ=0.9875;THETA=0.0002;VT=INDEL;set=Intersection
chr8    110215236   .   T   TC  217 PASS    AC=35;AF=0.02;AN=2184;ASN_AF=0.06;AVGPOST=0.9944;ERATE=0.0005;EUR_AF=0.0040;LDAF=0.0178;RSQ=0.8745;THETA=0.0007;VT=INDEL;set=variant
chr8    110215285   .   CTA C   141.73  PASS    set=variant2
chr8    110216060   .   CAAATT  C   1428.37 PASS    AC=30;AF=0.01;AFR_AF=0.06;AMR_AF=0.0028;AN=2184;AVGPOST=0.9938;ERATE=0.0005;LDAF=0.0146;RSQ=0.8417;THETA=0.0017;VT=INDEL;set=Intersection
chr8    110216081   .   GA  G   1557.49 PASS    AC=27;AF=0.01;AFR_AF=0.0041;AMR_AF=0.01;AN=2184;AVGPOST=0.9958;ERATE=0.0006;EUR_AF=0.03;LDAF=0.0139;RSQ=0.8706;THETA=0.0007;VT=INDEL;set=Intersection
chr8    110217315   .   G   GA  2280.52 PASS    AC=33;AF=0.02;AMR_AF=0.07;AN=2184;ASN_AF=0.01;AVGPOST=0.9957;ERATE=0.0005;LDAF=0.0168;RSQ=0.8931;THETA=0.0004;VT=INDEL;set=Intersection
chr8    110218286   .   A   AAT 140 PASS    AC=12;AF=0.01;AFR_AF=0.02;AMR_AF=0.0028;AN=2184;AVGPOST=0.9932;ERATE=0.0020;EUR_AF=0.0013;LDAF=0.0085;RSQ=0.6537;THETA=0.0003;VT=INDEL;set=variant
chr8    110219024   1156547 ATGTGTGTG   A,ATG,ATGTGTGTGTG   .   PASS    set=variant2
chr8    110220439   214372  AGT A   .   PASS    set=variant2
chr8    110221301   1196048 A   AT  .   PASS    set=variant2
chr8    110222110   .   ACT A   479.59  PASS    set=variant2
chr8    110223185   .   G   GT  1216.41 PASS    AC=32;AF=0.01;AFR_AF=0.06;AMR_AF=0.01;AN=2184;AVGPOST=0.9989;ERATE=0.0005;LDAF=0.0152;RSQ=0.9671;THETA=0.0006;VT=INDEL;set=Intersection
chr8    110223475   .   G   GCTAT   4171.39 PASS    AC=46;AF=0.02;AFR_AF=0.09;AMR_AF=0.01;AN=2184;ASN_AF=0.0017;AVGPOST=0.9977;ERATE=0.0004;LDAF=0.0218;RSQ=0.9523;THETA=0.0006;VT=INDEL;set=Intersection
chr8    110226345   rs3070867,1208383   AAAAT   A   39201.35    PASS    AC=1157;AF=0.53;AFR_AF=0.29;AMR_AF=0.48;AN=2184;ASN_AF=0.62;AVGPOST=0.9789;ERATE=0.0026;EUR_AF=0.64;LDAF=0.5290;RSQ=0.9673;THETA=0.0015;VT=INDEL;set=Intersection
chr8    110227220   .   TA  T   6949.27 PASS    AC=104;AF=0.05;AFR_AF=0.21;AMR_AF=0.01;AN=2184;AVGPOST=0.9983;ERATE=0.0004;LDAF=0.0478;RSQ=0.9860;THETA=0.0010;VT=INDEL;set=Intersection
chr8    110227880   .   T   TA  1091.34 PASS    AC=15;AF=0.01;AN=2184;ASN_AF=0.03;AVGPOST=0.9986;ERATE=0.0004;LDAF=0.0074;RSQ=0.9274;THETA=0.0007;VT=INDEL;set=Intersection
chr8    110229788   .   CA  C   2041.22 PASS    AC=29;AF=0.01;AFR_AF=0.05;AMR_AF=0.01;AN=2184;AVGPOST=0.9984;ERATE=0.0004;LDAF=0.0137;RSQ=0.9544;THETA=0.0008;VT=INDEL;set=Intersection
chr8    110234279   .   T   TA  186.97  PASS    set=variant2
chr8    110234387   134382  CTATCTA C,CTATCTATATCTA .   PASS    set=variant2
chr8    110234427   155920  C   CTA .   PASS    set=variant2
chr8    110241287   .   CAAT    C   177.50  PASS    set=variant2
chr8    110243202   rs10554903  ATTTG   A   391.74  PASS    AC=11;AF=0.01;AFR_AF=0.02;AN=2184;AVGPOST=0.9995;ERATE=0.0003;LDAF=0.0051;RSQ=0.9585;THETA=0.0008;VT=INDEL;set=Intersection
chr8    110243269   .   CTGTT   C   305 PASS    AC=11;AF=0.01;AFR_AF=0.01;AMR_AF=0.02;AN=2184;AVGPOST=0.9995;ERATE=0.0004;EUR_AF=0.0013;LDAF=0.0053;RSQ=0.9543;THETA=0.0003;VT=INDEL;set=variant
chr8    110244271   .   T   TA  565.50  PASS    set=variant2
chr8    110244857   .   ATAGT   A   213.25  PASS    set=variant2
chr8    110246344   .   C   CT  362.86  PASS    set=variant2
chr8    110247756   rs35903315  TTAGA   T   23042.33    PASS    AC=282;AF=0.13;AFR_AF=0.02;AMR_AF=0.22;AN=2184;ASN_AF=0.09;AVGPOST=0.9858;ERATE=0.0005;EUR_AF=0.18;LDAF=0.1312;RSQ=0.9541;THETA=0.0006;VT=INDEL;set=Intersection
chr8    110248217   .   CATCATATATAT    C   645 PASS    AC=143;AF=0.07;AFR_AF=0.0041;AMR_AF=0.09;AN=2184;AVGPOST=0.9771;ERATE=0.0010;EUR_AF=0.14;LDAF=0.0702;RSQ=0.8636;THETA=0.0014;VT=INDEL;set=variant
chr8    110249494   1613673 G   GA  .   PASS    set=variant2
chr8    110252848   .   T   TATATGTTG   198 PASS    AC=22;AF=0.01;AMR_AF=0.01;AN=2184;AVGPOST=0.9949;ERATE=0.0004;EUR_AF=0.02;LDAF=0.0108;RSQ=0.8180;THETA=0.0010;VT=INDEL;set=variant
chr8    110252862   .   C   CAT 381 PASS    AC=1613;AF=0.74;AFR_AF=0.94;AMR_AF=0.59;AN=2184;ASN_AF=0.78;AVGPOST=0.9384;ERATE=0.0018;EUR_AF=0.64;LDAF=0.7279;RSQ=0.8989;THETA=0.0008;VT=INDEL;set=variant
chr8    110252863   1286860 A   ATAT    .   PASS    set=variant2
chr8    110252865   .   A   ATT 470 PASS    AC=1561;AF=0.71;AFR_AF=0.88;AMR_AF=0.57;AN=2184;ASN_AF=0.77;AVGPOST=0.9095;ERATE=0.0094;EUR_AF=0.63;LDAF=0.7008;RSQ=0.8541;THETA=0.0010;VT=INDEL;set=variant
chr8    110257510   .   AAT A   1226.81 PASS    AC=23;AF=0.01;AFR_AF=0.05;AN=2184;AVGPOST=0.9980;ERATE=0.0004;LDAF=0.0107;RSQ=0.9191;THETA=0.0004;VT=INDEL;set=Intersection
chr8    110258735   .   TTAA    T   334 PASS    AC=26;AF=0.01;AMR_AF=0.01;AN=2184;AVGPOST=0.9974;ERATE=0.0003;EUR_AF=0.03;LDAF=0.0125;RSQ=0.9170;THETA=0.0008;VT=INDEL;set=variant
chr8    110263236   1286846,rs3044032   T   TTTCC   72090.63    PASS    AC=1610;AF=0.74;AFR_AF=0.88;AMR_AF=0.59;AN=2184;ASN_AF=0.82;AVGPOST=0.9869;ERATE=0.0011;EUR_AF=0.65;LDAF=0.7353;RSQ=0.9750;THETA=0.0007;VT=INDEL;set=Intersection
chr8    110264233   rs67360064  TTATATC T   62997.03    PASS    AC=1111;AF=0.51;AFR_AF=0.32;AMR_AF=0.46;AN=2184;ASN_AF=0.60;AVGPOST=0.9906;ERATE=0.0009;EUR_AF=0.59;LDAF=0.5082;RSQ=0.9854;THETA=0.0004;VT=INDEL;set=Intersection
chr8    110266836   rs3044031   G   GACCA   21807.13    PASS    AC=242;AF=0.11;AFR_AF=0.03;AMR_AF=0.19;AN=2184;ASN_AF=0.09;AVGPOST=0.9864;ERATE=0.0006;EUR_AF=0.14;LDAF=0.1097;RSQ=0.9482;THETA=0.0008;VT=INDEL;set=Intersection
chr8    110266956   .   TC  T   172 PASS    AC=14;AF=0.01;AN=2184;ASN_AF=0.02;AVGPOST=0.9987;ERATE=0.0003;LDAF=0.0067;RSQ=0.9215;THETA=0.0006;VT=INDEL;set=variant
chr8    110269140   .   TATAAACC    T   460 PASS    AC=28;AF=0.01;AFR_AF=0.05;AMR_AF=0.0028;AN=2184;AVGPOST=0.9981;ERATE=0.0003;LDAF=0.0133;RSQ=0.9449;THETA=0.0004;VT=INDEL;set=variant
chr8    110269149   .   AAAAACCTT   A   478 PASS    AC=28;AF=0.01;AFR_AF=0.05;AMR_AF=0.0028;AN=2184;AVGPOST=0.9980;ERATE=0.0004;LDAF=0.0133;RSQ=0.9440;THETA=0.0007;VT=INDEL;set=variant
chr8    110272074   .   TA  T   2317.84 PASS    AC=62;AF=0.03;AFR_AF=0.12;AMR_AF=0.01;AN=2184;AVGPOST=0.9924;ERATE=0.0005;EUR_AF=0.0013;LDAF=0.0301;RSQ=0.9082;THETA=0.0005;VT=INDEL;set=Intersection
chr8    110277147   2750650 AGT A   .   PASS    set=variant2

jamesdalg commented 1 year ago

Looks like this is fixed.

Download the latest chains here: https://hgwdev.gi.ucsc.edu/~markd/t2t/CHM13-fixed-chains/

diekhans commented 1 year ago

This should still be fixed in S3. Sorry for wasting your time!