edg1983 / GREEN-VARAN

Annotate non-coding regulatory vars using our GREEN-DB, prediction scores, conservation and pop AF
MIT License

workflow green_varan failed #8

Closed moon9319 closed 4 months ago

moon9319 commented 2 years ago

Dear Edoardo, I want to use the GREEN-VARAN workflow to prioritize variants. I tried running the GREEN-VARAN workflow on the test VCF file "GRCh38.test.smallvars.tmp.vcf.gz".

/home/moon9319/nextflow /home/moon9319/GREEN-VARAN/workflow/main.nf \
  -profile local \
  --input $DataPath$VCF_FILE \
  --build GRCh38 \
  --out /home/moon9319/SNV/02.WORKFLOW/ \
  --scores best \
  --regions best \
  --AF \
  --greenvaran_config /home/moon9319/GREEN-VARAN/config/prioritize_smallvars.json \
  --greenvaran_dbschema /home/moon9319/GREEN-VARAN/config/greendb_schema_v2.5.json

############################

executor > local (11)
[99/1943af] process > WRITE_SCORE_TOML (2) [100%] 3 of 3 ✔
[f7/37a2bf] process > WRITE_REGION_TOML (2) [100%] 3 of 3 ✔
[69/207667] process > WRITE_AF_TOML (1) [100%] 1 of 1 ✔
[c6/ec0a86] process > concat_toml [100%] 1 of 1 ✔
[ea/e32acb] process > ANNOTATE:annotate_vcf (1) [100%] 1 of 1 ✔
[5f/8128e9] process > ANNOTATE:index_vcf (1) [100%] 1 of 1 ✔
[0e/609915] process > green_varan (1) [100%] 1 of 1, failed: 1 ✘

[2022-06-13T10:38:41] - INFO: Reading config from file: prioritize_smallvars.json
[2022-06-13T10:38:41] - INFO: N selected chromosomes: 25
[2022-06-13T10:38:41] - INFO: N selected genes: 0
[2022-06-13T10:38:41] - INFO: Update existing gene annotations: true
[2022-06-13T10:38:41] - INFO: Filter mode active: false
[2022-06-13T10:38:41] - INFO: === Start processing VCF ===

Command error:
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
/project/alfredo/GAU_tools/GREEN-VARAN/src/greenvaran.nim(40) greenvaran
/project/alfredo/GAU_tools/GREEN-VARAN/src/greenvaran.nim(37) main
/project/alfredo/GAU_tools/GREEN-VARAN/src/greenvaran/smallvars.nim(99) main
/project/alfredo/software/nim_packages/pkgs/hts-0.3.21/hts/vcf.nim(238) open
Error: unhandled exception: [hts-nim/vcf] error reading VCF header from 'GRCh38.test.smallvars.tmp.vcf.gz' [OSError]

########################################

But I think there is a problem in the annotate step that does not produce any error message: the tmp VCF file written by ANNOTATE:annotate_vcf (the input of green_varan) is empty. What should I do to fix it?

Thank you!

edg1983 commented 2 years ago

Hi,

Usually this happens when there is something wrong with one of the annotation sources provided, like the file is incomplete or corrupted.
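A quick first check for incomplete or corrupted annotation sources can be sketched as below. This is only an illustration, not part of GREEN-VARAN: the helper name check_gz_files is hypothetical, and it assumes the annotation sources are gzip/bgzip-compressed files sitting in a single folder. Since bgzip output is valid gzip, `gzip -t` can test it; the check catches files that are not gzip at all or that are cut off mid-stream, but it does not validate the file content or its index.

```shell
# Hypothetical helper (not part of GREEN-VARAN): test every *.gz file in a
# folder with `gzip -t` and report the ones that fail the integrity check.
check_gz_files() {
    bad=0
    for f in "$1"/*.gz; do
        # Skip the unexpanded pattern when the folder has no .gz files
        [ -e "$f" ] || continue
        if ! gzip -t "$f" 2>/dev/null; then
            echo "CORRUPTED: $f"
            bad=1
        fi
    done
    # Exit status 0 only when every file passed
    [ "$bad" -eq 0 ]
}
```

It would be called on the folder holding the annotation files, for example `check_gz_files /path/to/annotation/sources` (a placeholder path); any file reported as corrupted is a candidate for re-download.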

You can try this command (it should work from the main GREEN-VARAN folder) to manually run the annotation step and see if there is any error. I assume there is something unexpected going on at this stage so that the intermediate annotated VCF ends up empty.

GOGC=2000 IRELATE_MAX_CHUNK=10000 IRELATE_MAX_GAP=1000 \
workflow/bin/vcfanno -p 10 tomlfile.toml test/VCF/GRCh38.test.smallvars.vcf.gz \
| bgzip -c > annotated.tmp.vcf.gz

By default the local command runs with 10 parallel threads (-p 10). If your local system has fewer than this, remember to lower this value accordingly.
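One way to pick the thread count automatically is sketched here. The helper name pick_threads is hypothetical, and the sketch assumes `getconf _NPROCESSORS_ONLN` is available to report the number of online CPUs (it falls back to 1 otherwise).

```shell
# Hypothetical helper: print the requested thread count, capped at the
# number of CPUs currently online on this machine.
pick_threads() {
    wanted="$1"
    cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    if [ "$cpus" -lt "$wanted" ]; then
        echo "$cpus"
    else
        echo "$wanted"
    fi
}
```

In the command above, `-p 10` could then be written as `-p "$(pick_threads 10)"`.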

This will generate an annotated VCF (annotated.tmp.vcf.gz) which should contain all the variants seen in the input.

As tomlfile.toml, use the .toml file you have in the results folder. The vcfanno executable is in workflow/bin in the GREEN-VARAN tool folder.

I suspect you will see an error here... Unfortunately, some of these errors do not return a proper error code, so the pipeline erroneously assumes the annotation completed successfully and tries to go on with the GREEN-VARAN step.
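Because the exit code alone cannot be trusted here, a manual sanity check is to compare the number of variant records before and after annotation. The sketch below is an illustration rather than something the pipeline does; the function names count_variants and same_variant_count are hypothetical. It relies only on gzip and grep, and treats every line that does not start with `#` as one variant record.

```shell
# Hypothetical helpers (not part of GREEN-VARAN).

# Print the number of non-header lines in a gzip/bgzip-compressed VCF.
# An empty or unreadable file yields 0.
count_variants() {
    # grep -c exits non-zero when the count is 0, so its status is ignored
    gzip -dc "$1" 2>/dev/null | grep -vc '^#' || true
}

# Succeed only when input and annotated VCF hold the same number of records.
same_variant_count() {
    n_in=$(count_variants "$1")
    n_out=$(count_variants "$2")
    if [ "$n_in" -ne "$n_out" ]; then
        echo "MISMATCH: input=$n_in annotated=$n_out" >&2
        return 1
    fi
}
```

For the manual run above this would be `same_variant_count test/VCF/GRCh38.test.smallvars.vcf.gz annotated.tmp.vcf.gz`; a mismatch, or an annotated count of 0, points to the annotation step failing silently.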