EichlerLab / pav

Phased assembly variant caller

no result found in the analysis dir #1

Closed Chenglin20170390 closed 3 years ago

Chenglin20170390 commented 3 years ago

Hi, I used config.json as input. After running snakemake -s ../Snakefile, I got nothing in my directory (and no error). Can you tell me how to run this analysis, or can you provide some example data, such as short test data, config.json, and assemblies.tsv? It would make it easier for others to learn. Many thanks~

config.json:

{
  "reference": "/public/agis/huangsanwen_group/chenglin/softwares/syri/syri/bin/test/ref.fa.gz",
  "asm_pattern": "/public/agis/huangsanwen_group/chenglin/softwares/pav/analysis/{hap}.fa.gz"
}

paudano commented 3 years ago

The default rule only works if you have assemblies.tsv with one assembly per row. It's a tab-delimited file with three fields: NAME, HAP1, and HAP2 (a header is required as the first line). NAME is a sample name, and HAP1 and HAP2 are paths to the haplotype 1 and haplotype 2 FASTA files. Once that file is in place, the default rule will work.
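As a rough illustration of the file layout described above (not an official example from the PAV repository), the following Python sketch writes a tab-delimited assemblies.tsv with the NAME, HAP1, HAP2 header. The sample names and FASTA paths are hypothetical placeholders.

```python
import csv
from pathlib import Path

# Hypothetical sample names and FASTA paths, for illustration only.
ASSEMBLIES = [
    ("sampleA", "/path/to/sampleA.h1.fa.gz", "/path/to/sampleA.h2.fa.gz"),
    ("sampleB", "/path/to/sampleB.h1.fa.gz", "/path/to/sampleB.h2.fa.gz"),
]


def write_assemblies_tsv(path, rows):
    """Write one assembly per row, preceded by the required header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["NAME", "HAP1", "HAP2"])
        writer.writerows(rows)


write_assemblies_tsv("assemblies.tsv", ASSEMBLIES)
print(Path("assemblies.tsv").read_text())
```

The file is written to the current directory here. Where PAV expects to find it is not stated in this thread, so place it wherever your run directory requires.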

The "asm_pattern" needs a {sample} wildcard in the pattern if you use the config with "asm_pattern" (this is not needed if you use assemblies.tsv).
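A minimal sketch of what such a config might look like, assuming the pattern keeps the {hap} wildcard from the original post and adds {sample}. The paths and the `missing_wildcards` helper are hypothetical and are not part of PAV.

```python
import json

# Hypothetical paths; only the wildcard names come from this thread.
config = {
    "reference": "/path/to/ref.fa.gz",
    "asm_pattern": "/path/to/assemblies/{sample}_{hap}.fa.gz",
}


def missing_wildcards(pattern, required=("sample",)):
    """Return the required wildcard names that do not appear in the pattern."""
    return [name for name in required if "{" + name + "}" not in pattern]


missing = missing_wildcards(config["asm_pattern"])
if missing:
    raise ValueError(f"asm_pattern is missing wildcards: {missing}")

# Render the text that would go into config.json.
config_text = json.dumps(config, indent=2)
print(config_text)
```

Running the same check on the pattern from the original post, which contains only {hap}, would report that {sample} is missing.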

I just added a VCF writer that will make things easier (pull the latest commit). You can run one sample by having Snakemake generate pav_NAME.vcf.gz (where NAME is your assembly name, i.e. "NAME" in assemblies.tsv or "asm_pattern" in config.json). If you have "assemblies.tsv", a VCF will be generated for each assembly in the file.
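To make the single-sample invocation concrete, here is a small sketch that builds the Snakemake command line for one target. The sample name "sampleA" is a placeholder, and the Snakefile path is taken from the command in the original post. Depending on your Snakemake version, an extra `--cores` argument may also be required.

```python
import shlex


def pav_vcf_command(name, snakefile="../Snakefile"):
    """Build the Snakemake command that requests pav_NAME.vcf.gz as a target."""
    target = f"pav_{name}.vcf.gz"
    return ["snakemake", "-s", snakefile, target]


cmd = pav_vcf_command("sampleA")
print(shlex.join(cmd))
# prints: snakemake -s ../Snakefile pav_sampleA.vcf.gz

# To actually launch the run from Python, one option is:
#   import subprocess
#   subprocess.run(cmd, check=True)
```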

Let me know if this helps or if you need a full example.

Chenglin20170390 commented 3 years ago

okay, many thanks for your detailed explanation~