Closed by aryarm 4 years ago
WASP can read sample names from the VCF. STAR should be able to do that too. Do we really need a samples file for these pipelines?
Note that the sample files usually request a 1000 Genomes ID and a sample ID. The "1000 Genomes ID" must match the IDs in the provided VCFs, but the sample ID can refer to whichever wildcards.sample you are using.
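For reference, here is a rough sketch of how a samples file like that could be turned into a lookup from sample ID to VCF ID. The column order (VCF ID first, sample ID second), the whitespace delimiter, and the example IDs are all assumptions made for illustration, not the pipeline's actual format.

```python
import io


def read_samples_file(handle):
    """Map each pipeline sample ID to the ID used in the VCF.

    Assumes two whitespace-separated columns per line:
    <VCF ID> <sample ID>. Blank lines and '#' comments are skipped.
    """
    mapping = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        vcf_id, sample_id = line.split()[:2]
        mapping[sample_id] = vcf_id
    return mapping


# Hypothetical file contents, used here instead of a real path.
example = io.StringIO("NA12878\tliver_rep1\nNA12891\tliver_rep2\n")
samples = read_samples_file(example)
print(samples["liver_rep1"])  # NA12878
```

With a mapping like this, the key would line up with wildcards.sample and the value with the ID in the VCF.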
You can also pass --samples a comma-delimited string, which may be good if we want to pass it in using Python.
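A possible way to build that string in Python would be to read the sample names straight from the VCF header, since in VCF the sample columns follow the FORMAT column on the #CHROM line. This is only a sketch: the function name and the example header are made up, and it reads from an in-memory handle rather than a real file.

```python
import io


def vcf_sample_names(handle):
    """Return the sample names listed on the #CHROM header line of a VCF."""
    for line in handle:
        if line.startswith("#CHROM"):
            # The first 9 columns are fixed (CHROM..FORMAT); samples follow.
            return line.rstrip("\n").split("\t")[9:]
        if not line.startswith("#"):
            break  # reached the records without seeing a header line
    raise ValueError("no #CHROM header line found")


# Hypothetical header, standing in for an opened VCF file.
header = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA12878\tNA12891\n"
)
names = vcf_sample_names(header)
samples_arg = ",".join(names)
print(samples_arg)  # NA12878,NA12891
```

For a bgzipped VCF, the handle could come from gzip.open(path, "rt") instead.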
Can we store the VCF_SAMP_ID in a better way than is currently being done with Snakefile-counts?
Perhaps using a custom Python function?
That way, by default, you wouldn't have to specify VCF_SAMP_ID.
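Something along these lines might work as that function. It is a minimal sketch, not what Snakefile-counts does today: the function name and the overrides dictionary are hypothetical, and the default simply assumes the sample name is already the ID used in the VCF.

```python
def vcf_samp_id(sample, overrides=None):
    """Return the VCF sample ID to use for a pipeline sample.

    If the sample has an entry in `overrides`, use it; otherwise fall
    back to the sample name itself, so nothing has to be specified when
    the two IDs already match.
    """
    overrides = overrides or {}
    return overrides.get(sample, sample)


print(vcf_samp_id("NA12878"))                              # NA12878
print(vcf_samp_id("liver_rep1", {"liver_rep1": "NA12878"}))  # NA12878
```

In a rule, this could presumably be called from a params function that receives wildcards, passing wildcards.sample and whatever override mapping the config provides.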
To be fair, the sample file itself could be replaced by a python function. Maybe look into this? It could be an option?
Requiring that users know how to write Python functions is a bit much.
The current requirement of providing samples files may be overly verbose, but it's simple and straightforward.
So I'm going to leave the current behavior as is and close this issue. I will reopen it later if I have a change of heart.
There are a lot of inputs in the config files. Find out whether you can get rid of some of them (like config-WASP in Snakefile-counts) or get the info you need from a smaller number of files.