tforest / popsize

Snakemake module for the snpArcher pipeline for pop. size change inference.

Pipeline error and input file questions #1

Open epifaniarango opened 2 months ago

epifaniarango commented 2 months ago

I am trying to run the popsize module. However, I didn't create my VCF files with snpArcher, which is making this quite challenging. When I run the software, I get the following error:

No rule to produce popsize (if you use input functions make sure that they don't raise unexpected exceptions).

The VCF file is located in snpArcher/results/hs37d5/yoruba_raw.vcf.gz, and the sample.csv file is the following:

BioSample,refGenome,refPath
HGDP00920,hs37d5,/scratch/earang/snpArcher/workflow/modules/popsize/data/hs37d5.fna
HGDP00924,hs37d5,/scratch/earang/snpArcher/workflow/modules/popsize/data/hs37d5.fna
HGDP00925,hs37d5,/scratch/earang/snpArcher/workflow/modules/popsize/data/hs37d5.fna
HGDP00926,hs37d5,/scratch/earang/snpArcher/workflow/modules/popsize/data/hs37d5.fna
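For anyone debugging a similar setup, a quick structural check of the sample sheet can rule out simple formatting problems before looking at the workflow itself. The sketch below is not part of snpArcher or the popsize module. The function name `check_sample_sheet` is hypothetical, and the list of expected columns is an assumption taken from the sheet in this report, not from the module's documentation.

```python
import csv
import io

# Assumption: these are the columns used in the sample sheet from this report.
# The module may require additional or different columns.
EXPECTED_COLUMNS = ["BioSample", "refGenome", "refPath"]


def check_sample_sheet(text, expected=EXPECTED_COLUMNS):
    """Return a list of human-readable problems found in a CSV sample sheet."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    problems = [f"missing column: {name}" for name in expected if name not in header]
    present = [name for name in expected if name in header]
    seen = set()
    for row in reader:
        line_number = reader.line_num  # line of the input the row was read from
        # Report empty cells in the columns that do exist.
        for name in present:
            if not (row.get(name) or "").strip():
                problems.append(f"line {line_number}: empty {name}")
        # Report sample identifiers that appear more than once.
        sample = (row.get("BioSample") or "").strip()
        if sample and sample in seen:
            problems.append(f"line {line_number}: duplicate BioSample {sample}")
        seen.add(sample)
    return problems
```

Calling `check_sample_sheet(open("sample.csv").read())` returns an empty list when the sheet has the expected header, no empty cells, and no repeated sample names. An empty list does not mean the module will accept the sheet, only that these basic checks passed.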

I am quite confused about the pipeline in general. I also couldn't find information about whether the data needs phasing (which I assume it does, as it is required by msmc2) or how to input population information for each sample.

Thanks in advance for your time

tforest commented 2 months ago

Hello! Thanks for trying the pipeline; this is very useful feedback. Could you please provide the snakemake command you used to run the pipeline, along with a more detailed error log? Regarding your msmc2 question: the data will be treated as unphased by default. This is still a work in progress, but it would be nice to handle phased data. Many thanks!
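As a side note on the phasing question, the VCF format marks phased genotypes with `|` and unphased genotypes with `/` in the GT field, so it is possible to check what a given file contains before running anything. The following is a minimal sketch, independent of the pipeline. The function names `count_phasing` and `count_phasing_in_file` and the `max_records` parameter are hypothetical, and the code reads the VCF as plain text rather than through a dedicated VCF library.

```python
import gzip
from collections import Counter


def count_phasing(lines, max_records=1000):
    """Count phased ('|') and unphased ('/') genotype calls in VCF text lines."""
    counts = Counter()
    records = 0
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and the column header
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT or sample columns on this line
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue
        gt_index = format_keys.index("GT")
        for sample in fields[9:]:
            values = sample.split(":")
            if gt_index >= len(values):
                continue
            genotype = values[gt_index]
            if "|" in genotype:
                counts["phased"] += 1
            elif "/" in genotype:
                counts["unphased"] += 1
            # Haploid calls have no separator and are not counted.
        records += 1
        if records >= max_records:
            break  # a sample of records is enough for a quick check
    return counts


def count_phasing_in_file(path, max_records=1000):
    """Open a plain or gzip/bgzip-compressed VCF and count its genotype calls."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_phasing(handle, max_records)
```

For the file mentioned in this issue, `count_phasing_in_file("snpArcher/results/hs37d5/yoruba_raw.vcf.gz")` would return a counter with `phased` and `unphased` totals for the first 1000 records. Only the first records are inspected, so a mixed file could still be missed.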