nf-core / phaseimpute

MIT License

Fix test profiles #6

Closed nschcolnicov closed 6 months ago

nschcolnicov commented 7 months ago

Description of the bug

Test profiles currently contain local paths, e.g. test_full.config


The test profiles that need correcting are: test_full.config, test_panelprep.config, test_sim.config, and test.config.

Command used and terminal output

No response

Relevant files

No response

System information

dev

atrigila commented 7 months ago

Example of the issue: the CSVs point to local files and cannot be used for testing, e.g. /groups/dog/llenezet/test-datasets/data/panel/21/panel_2020-08-05_chr21.phased.vcf.gz
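A check along these lines could catch such paths before a test profile is committed. This is only a minimal sketch, not part of the pipeline: the function name `find_local_paths`, the column names, and the `https://example.org/...` URL are hypothetical, and it assumes that any cell starting with `/` is a local absolute path.

```python
import csv
import io


def find_local_paths(csv_text):
    """Return (row_number, column, value) for each cell holding an absolute local path."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    header = rows[0]
    hits = []
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(rows[1:], start=2):
        for index, value in enumerate(row):
            value = value.strip()
            # Assumption: a leading "/" marks a local path; URLs such as
            # https:// or s3:// never start with "/".
            if value.startswith("/"):
                column = header[index] if index < len(header) else f"column_{index + 1}"
                hits.append((row_number, column, value))
    return hits


# Hypothetical samplesheet: the column names and the remote URL are made up,
# the local path is the one quoted in this issue.
sample_csv = (
    "panel,chr,vcf\n"
    "1000G,21,/groups/dog/llenezet/test-datasets/data/panel/21/panel_2020-08-05_chr21.phased.vcf.gz\n"
    "1000G,22,https://example.org/panel_chr22.vcf.gz\n"
)

for row_number, column, value in find_local_paths(sample_csv):
    print(f"row {row_number}, column '{column}': {value}")
```

Only the first data row is reported here, since the second one already points to a remote location.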

LouisLeNezet commented 7 months ago

Sorry, I haven't implemented a large test yet, as I first needed a reliable test dataset. We should look at how other pipelines handle this to find out where large files are stored.

LouisLeNezet commented 7 months ago

Hi, the command nextflow run main.nf -profile test,singularity --outdir results should now work without any problem.

atrigila commented 7 months ago

Hi @LouisLeNezet, here are some ideas for full-sized datasets. I implemented the 1000G s3 in the quilt pipeline.

LouisLeNezet commented 7 months ago

The fasta is ready, and so is the reference panel with PR #18. For the sample, NA12878 is easily accessible, but the problem is that this individual, as well as its parents, is present in the reference panel. A full test would require duplicating the huge panel files and removing these individuals from the copy, so as not to overestimate the imputation performance. The best option would be an unrelated high-coverage BAM file from outside the 1000 Genomes Project. The GATK resources seem interesting, but only the NA12878 individual is available...
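The exclusion step described above can be sketched as building the list of panel samples to keep. This is an illustration under stated assumptions, not the pipeline's method: the function name `unrelated_samples` and the example sample list are hypothetical, and the parent IDs (NA12891, NA12892) are assumed to be the ones present in the panel. The actual subsetting of the VCF would still be done by a dedicated tool.

```python
# Assumption: NA12878's parents appear in the panel as NA12891 and NA12892.
RELATED_TO_TARGET = frozenset({"NA12878", "NA12891", "NA12892"})


def unrelated_samples(panel_samples, excluded=RELATED_TO_TARGET):
    """Return the panel sample IDs, in their original order, that are not excluded."""
    return [sample for sample in panel_samples if sample not in excluded]


# Hypothetical excerpt of a panel's sample list.
panel = ["NA12877", "NA12878", "NA12889", "NA12891", "NA12892"]
print(unrelated_samples(panel))
```

The resulting list could then be passed to whichever tool subsets the panel, so that the test sample and its relatives do not inflate the measured accuracy.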