HighlanderLab / tree_seq_pipeline

Pipeline to infer tree sequences with different datasets
MIT License

improve testing #70

Open hannesbecher opened 11 months ago

hannesbecher commented 11 months ago

Thorough testing will simplify development in the long run. It will help us pick up mistakes sooner and make sure the pipeline runs on various datasets, not only our personal favourites.

I think it would be good to (1) add a test in which the pipeline starts from multiple VCF files (one per chromosome) and (2) add the code to run this test to README.md.
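A minimal sketch, in Python, of the kind of check the test in (1) could start from: it writes a few tiny one-chromosome VCF files and verifies that each file holds exactly one chromosome and that no chromosome appears twice. All function names and the toy VCF content are hypothetical and not part of the repository; a real test would use the pipeline's own test data and run the Snakemake workflow itself.

```python
import tempfile
from pathlib import Path

# Hypothetical toy header; real test data would come from the repository.
VCF_HEADER = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n"
)


def write_test_vcf(directory, chromosome, positions):
    """Write a tiny single-chromosome VCF and return its path."""
    path = Path(directory) / f"chr{chromosome}.vcf"
    with open(path, "w") as handle:
        handle.write(VCF_HEADER)
        for pos in positions:
            handle.write(f"{chromosome}\t{pos}\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\n")
    return path


def chromosomes_in_vcf(path):
    """Return the set of chromosome names used in the record lines of a VCF."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and the column header
            names.add(line.split("\t", 1)[0])
    return names


def check_split_input(paths):
    """Map chromosome name to file; fail on mixed or duplicated chromosomes."""
    by_chromosome = {}
    for path in paths:
        names = chromosomes_in_vcf(path)
        if len(names) != 1:
            raise ValueError(f"{path}: expected one chromosome, found {sorted(names)}")
        (name,) = names
        if name in by_chromosome:
            raise ValueError(f"chromosome {name} appears in more than one file")
        by_chromosome[name] = path
    return by_chromosome


def test_split_input():
    """One VCF per chromosome should map cleanly to one file per chromosome."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = [write_test_vcf(tmp, chrom, [100, 200]) for chrom in ("1", "2", "3")]
        mapping = check_split_input(paths)
        assert sorted(mapping) == ["1", "2", "3"]
```

Written this way, `test_split_input` can be collected by pytest or called directly, and the same check could guard the start of the pipeline when it is given split input.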

Specifics:

I know I am supposed to work on the chromosome names... coming soon.

janaobsteter commented 11 months ago

@hannesbecher, there are already additional files there. The previous test (from when we didn't have ancestral inference in the pipeline) started with split and combined files. BUT the problem is that the ancestral allele inference is not adapted to start from multiple chromosomes. We discussed moving the ancestral inference to after the merge (I've already started working on this).

However, if you already have an ancestral alleles file, you can test this on the files here: "HighlanderLab/share/Snakemake/Data/RawVCF/TestChromosomeData"

janaobsteter commented 11 months ago

As mentioned, the ancestral allele inference will move as planned in #61

hannesbecher commented 11 months ago

Thanks for clarifying!