Closed vicfabienne closed 1 year ago
There's a distinction between the files tests/setup_test_settings.yaml and tests/settings.yaml.
Both are supposed to use the same data (i.e. reads) and only check if everything runs. The test read dataset only completes with an unrealistically low mutation coverage threshold.
tests/settings.yaml is the file used for running tests during development, e.g. during make distcheck. It has the modified download paths pointing to the downsampled datasets that we use for testing. I will write up later what exactly is downsampled and how that affects results.
tests/setup_test_settings.yaml is used to check whether the pipeline has been set up correctly by the user (see also the Quick Start section in the PiGx-Docs). The idea was that a user installing the pipeline may not have access to the default database dir.
About the downloads: tests/setup_test_settings.yaml (unless modified) downloads the official datasets, as specified in the default etc/settings.yaml.in file.
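To see concretely where the two settings files diverge, one could compare them key by key. Below is a minimal sketch, assuming both files have already been parsed into nested dicts (e.g. with a YAML parser). The function name `diff_settings` and all keys and values shown are made-up placeholders for illustration, not the actual contents of either file.

```python
def diff_settings(a, b, prefix=""):
    """Return the dotted key paths whose values differ between two nested dicts."""
    diffs = []
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}{key}"
        va, vb = a.get(key), b.get(key)
        if isinstance(va, dict) and isinstance(vb, dict):
            # Recurse into sub-sections present in both files.
            diffs.extend(diff_settings(va, vb, prefix=path + "."))
        elif va != vb:
            diffs.append(path)
    return diffs


# Hypothetical stand-ins for the parsed files; the real keys may differ.
dev_settings = {
    "locations": {"reads-dir": "tests/sample_data/reads"},
    "databases": {"kraken": {"archive-url": "https://example.org/downsampled.tar.gz"}},
}
setup_settings = {
    "locations": {"reads-dir": "tests/sample_data/reads"},
    "databases": {"kraken": {"archive-url": "https://example.org/official.tar.gz"}},
}

for path in diff_settings(dev_settings, setup_settings):
    print(path)
```

With these placeholder dicts, only the database download entry is reported, which mirrors the situation described above: same reads, different database sources.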
Sorry, I may not have been specific enough.
I understand the test dataset stuff with the reads.
My question was about the databases.
When the pipeline is installed using Guix, are the databases shipped with the package or do they still have to be installed manually?
Also, what is the current way to run a quick test example when the package is installed via Guix? Is there some built-in functionality for it, or would the user need to download the test directory?
The test databases are not installed in the Guix package. They are unpacked in /tmp/.local/share/pigx/databases, which is destroyed once the build is complete.
Ok, so all database downloads are done either by the user or by pipeline scripts; Guix doesn't figure into that. To quickly run something, you do actually have to download the test dir if you don't have it already. ~That isn't mentioned yet in the docs actually, so good point ^^~ That is actually the whole point of the Quick Start section. (I was in a bit of a hurry when I wrote the previous stuff.)
Closing, the last explanation should be clear enough.
Hey, for the latest version it is not quite clear whether the databases (vep-db, kraken-db, etc.) still have to be downloaded manually.
I see that guix automatically downloads something and I see here https://github.com/BIMSBbioinfo/pigx_sars-cov-2/blob/main/tests/setup_test_settings.yaml that it can be used for running tests. But based on the comment in the yaml: are those the full databases now? And if so which version? Or can it only be used for testing and it would not give meaningful results when used with real data?
@jonasfreimuth please clarify and then I can add it to the README and docs