provide flexibility for specifying samples

aryarm / as_analysis

A complete Snakemake pipeline for detecting allele specific expression in RNA-seq

MIT License

provide flexibility for specifying samples #18

Closed aryarm closed 4 years ago

aryarm commented 6 years ago

There are a lot of inputs in the config files. Find out whether you can get rid of some of them (like config-WASP in Snakefile-counts) or get the info you need from a smaller number of files.

aryarm commented 6 years ago

WASP can read sample names from the VCF. STAR should be able to do that too. Do we really need a samples file for these pipelines?

Note that the samples files usually request a 1000 Genomes ID and a sample ID. The "1000 Genomes ID" must match the IDs in the provided VCFs, but the sample ID can refer to whichever wildcards.sample you are using.
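For reference, a minimal sketch of pulling the sample names straight out of a VCF header, which is the information the "1000 genomes ID" column would otherwise duplicate. It relies only on the VCF layout (nine fixed columns on the `#CHROM` line, then one column per sample). The function names `samples_from_header` and `read_vcf_samples` are hypothetical and not part of this pipeline.

```python
import gzip


def samples_from_header(lines):
    """Return the sample names from the '#CHROM' line of an iterable of VCF lines."""
    for line in lines:
        if line.startswith("#CHROM"):
            # VCF fixes 9 leading columns (CHROM..FORMAT); the rest are samples.
            return line.rstrip("\r\n").split("\t")[9:]
        if not line.startswith("#"):
            break  # reached the records without seeing the header line
    raise ValueError("no #CHROM header line found")


def read_vcf_samples(path):
    """Open a plain or gzip-compressed VCF and return its sample names."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as vcf:
        return samples_from_header(vcf)
```

A sites-only VCF (no FORMAT or sample columns) yields an empty list here, so a caller would still need some fallback in that case.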

aryarm commented 6 years ago

You can also pass a comma-delimited string to --samples, which may be useful if we want to pass it in from Python.
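A rough sketch of building that comma-delimited value in Python before handing it to the command line. The helper name `samples_argument` is made up for illustration; the only thing taken from the comment above is that `--samples` accepts a comma-delimited string.

```python
def samples_argument(names):
    """Join sample names into a single comma-delimited string."""
    names = [str(name).strip() for name in names]
    # An empty name or a name containing a comma would corrupt the list.
    bad = [name for name in names if not name or "," in name]
    if bad:
        raise ValueError(f"invalid sample names: {bad!r}")
    return ",".join(names)
```

In a Snakemake rule this could be called from a `params` function and interpolated into the shell command, though how that fits the existing rules is an assumption.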

aryarm commented 6 years ago

can we store the VCF_SAMP_ID in a better way than is currently being done with Snakefile-counts?

perhaps using a custom python function?

this way, the default could be that you wouldn't have to specify VCF_SAMP_ID
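One possible shape for such a function, as a sketch only: fall back to the sample wildcard itself unless an explicit override is given. The name `vcf_sample_id` and the `overrides` dict are hypothetical and do not reflect how Snakefile-counts currently stores VCF_SAMP_ID.

```python
def vcf_sample_id(sample, overrides=None):
    """Return the VCF sample ID for a pipeline sample name.

    `overrides` is an optional dict mapping pipeline sample names to VCF IDs
    (an assumed structure). Without an entry, the sample name is used as-is.
    """
    return (overrides or {}).get(sample, sample)
```

With this default, a user whose sample names already match the VCF would not need to specify anything.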

aryarm commented 6 years ago

To be fair, the sample file itself could be replaced by a python function. Maybe look into this? It could be an option?
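A sketch of what "could be an option" might look like: accept either a path to a samples file or a Python callable that returns the same mapping. Everything here is an assumption, including the names `parse_samples_lines` and `resolve_samples` and the file layout (whitespace-separated, VCF ID first, pipeline sample ID second), which the thread does not spell out.

```python
def parse_samples_lines(lines):
    """Parse samples-file lines into {pipeline sample ID: VCF sample ID}.

    Assumed layout: two whitespace-separated columns, VCF ID then sample ID.
    Blank lines and lines starting with '#' are skipped.
    """
    mapping = {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        if len(fields) < 2:
            raise ValueError(f"expected two columns, got: {line!r}")
        vcf_id, sample = fields[0], fields[1]
        mapping[sample] = vcf_id
    return mapping


def resolve_samples(source):
    """Accept either a callable returning the mapping or a path to a samples file."""
    if callable(source):
        return dict(source())
    with open(source) as handle:
        return parse_samples_lines(handle)
```

Users who prefer the file keep the current behavior, while the callable path stays optional.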

aryarm commented 4 years ago

requiring that users know/write python functions is a bit much

the current requirement of providing samples files may be overly verbose, but it's simple and straightforward

so I'm going to leave the current behavior as is and close this issue. I will reopen this issue later if I have a change of heart