Open Wennie-s opened 2 years ago
I see you are providing a .gz file, so first make sure that you compiled the code with the install-RAiSD-ZLIB.sh script, because the standard RAiSD build does not support gz files. If that is already the case, the first VCF lines you provided look fine. Could you post a few more lines of the VCF file here (the header line and the first data line) so we can check that the header is present and correct?
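For pulling out those two lines from a compressed VCF, a minimal sketch is shown here. It assumes the file is ordinary gzip or bgzip output and that the column header starts with `#CHROM`; the function name `vcf_header_and_first_record` is made up for this example and is not part of RAiSD.

```python
import gzip
import sys


def vcf_header_and_first_record(path):
    """Return (column_header_line, first_data_line) from a gzip-compressed VCF.

    Either element is None if the corresponding line is not found.
    """
    column_header = None
    first_record = None
    # Text-mode gzip.open also reads bgzip output, which is a series of gzip members.
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # The single column-header line that names the samples.
                column_header = line.rstrip("\n")
            elif line.startswith("#"):
                # Meta-information lines ("##...") and any other comment lines.
                continue
            elif line.strip():
                # First non-empty line that is not a comment: the first data record.
                first_record = line.rstrip("\n")
                break
    return column_header, first_record


if __name__ == "__main__" and len(sys.argv) > 1:
    header, record = vcf_header_and_first_record(sys.argv[1])
    print(header if header is not None else "no #CHROM line found")
    print(record if record is not None else "no data line found")
```

Saved under any name and run with the path to the .vcf.gz file as its only argument, it prints the two lines requested above, or a short notice if either one is missing.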
On Tue, Nov 2, 2021 at 4:12 AM Wennie-s @.***> wrote:
Hi, I just ran RAiSD and got the error: unexpected error during parser initialization. The command I ran is /data/user003/soft/RAiSD/raisd-master/bin/release/RAiSD -n Cugi_run -I /wtmp/user003/Reseq/SNP_filter/Chr_vcftools_filter/correct_site/Chr_revised/Chr1.final.vcf.gz -S Cugi_subgroup.txt -y 2 -w 100000 -f. The VCF header is:
fileformat=VCFv4.2
ALT=<ID=NON_REF,Description="Represents any possible alternative allele not already represented at this location by REF and ALT">
FILTER=<ID=Filter,Description="DP < 5 || QD < 2.0 || MQ < 40.0 || FS > 60.0 || SOR > 3.0 || MQRankSum < - 12.5 || ReadPosRankSum < -8.0 || QUAL < 30">
FILTER=
FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
FORMAT=
FORMAT=
FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another; will always be heterozygous and is not intended to describe called alleles">
FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phasing set (typically the position of the first variant in the set)">
FORMAT=<ID=RGQ,Number=1,Type=Integer,Description="Unconditional reference genotype confidence, encoded as a phred quality -10*log10 p(genotype call is wrong)">
FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
GATKCommandLine=<ID=GenotypeGVCFs,CommandLine="GenotypeGVCFs --output o.vcf.gz --variant KS-7.g.vcf.gz --intervals Chr1:1-10000 --reference SplitBig.Cugi.all.chr.fasta --include-non-variant-sites false --merge-input-intervals false --input-is-somatic false --tumor-lod-to-emit 3.5 --allele-fraction-error 0.001 --keep-combined-raw-annotations false --use-posteriors-to-calculate-qual false --dont-use-dragstr-priors false --use-new-qual-calculator true --annotate-with-num-discovered-alleles false --heterozygosity 0.001 --indel-heterozygosity 1.25E-4 --heterozygosity-stdev 0.01 --standard-min-confidence-threshold-for-calling 30.0 --max-alternate-alleles 6 --max-genotype-count 1024 --sample-ploidy 2 --num-reference-samples-if-no-call 0 --genotype-assignment-method USE_PLS_TO_ASSIGN --genomicsdb-use-bcf-codec false --genomicsdb-shared-posixfs-optimizations false --only-output-calls-starting-in-intervals false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false --disable-tool-default-annotations false --enable-all-annotations false --allow-old-rms-mapping-quality-annotation-data false",Version="4.2.0.0",Date="September 21, 2021 10:17:53 AM CST">
GATKCommandLine=<ID=SelectVariants,CommandLine="SelectVariants --output /wtmp/user003/Reseq/Chr_gatk_filter_extract/Chr2_2.extract.snp.vcf.gz --exclude-filtered true --variant Chr2_2.gatkfilter.vcf.gz --invertSelect false --exclude-non-variants false --preserve-alleles false --remove-unused-alternates false --restrict-alleles-to ALL --keep-original-ac false --keep-original-dp false --mendelian-violation false --invert-mendelian-violation false --mendelian-violation-qual-threshold 0.0 --select-random-fraction 0.0 --remove-fraction-genotypes 0.0 --fully-decode false --max-indel-size 2147483647 --min-indel-size 0 --max-filtered-genotypes 2147483647 --min-filtered-genotypes 0 --max-fraction-filtered-genotypes 1.0 --min-fraction-filtered-genotypes 0.0 --max-nocall-number 2147483647 --max-nocall-fraction 1.0 --set-filtered-gt-to-nocall false --allow-nonoverlapping-command-line-samples false --suppress-reference-path false --genomicsdb-use-bcf-codec false --genomicsdb-shared-posixfs-optimizations false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false",Version="4.2.0.0",Date="October 8, 2021 9:55:48 AM CST">
INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
INFO=<ID=ExcessHet,Number=1,Type=Float,Description="Phred-scaled p-value for exact test of excess heterozygosity">
INFO=<ID=FS,Number=1,Type=Float,Description="Phred-scaled p-value using Fisher's exact test to detect strand bias">
INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed">
INFO=
INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
INFO=<ID=QD,Number=1,Type=Float,Description="Variant Confidence/Quality by Depth">
INFO=<ID=RAW_MQandDP,Number=2,Type=Integer,Description="Raw data (sum of squared MQ and total depth) for improved RMS Mapping Quality calculation. Incompatible with deprecated RAW_MQ formulation.">
INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
INFO=<ID=SOR,Number=1,Type=Float,Description="Symmetric Odds Ratio of 2x2 contingency table to detect strand bias">
contig=
contig=
contig=
contig=
contig=
...... I have checked my VCF header, but I can't find any problem. Can you help me?
-- Nikolaos Alachiotis