Closed pvanheus closed 3 years ago
--partition answers "what is a sample?": should we use the SM field (the default) in the RG header, or another field in the RG header such as LB (library), etc.? --samples limits the result above in cases where you have more than one result (e.g. multiple LB but you only want to display one).
So if --samples is a comma-separated list of identifiers (e.g. several libraries or several sample names), multiple figures will be generated within a single plot file?
No, it's just for filtering: only reads matching --samples will be used.
But I think you can ignore both options for a simple wrapper.
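To make the filtering behaviour described above concrete, here is a minimal Python sketch. It is not jvarkit's code: the function names, the sample read groups, and the choice that an empty --samples value means "keep everything" are all assumptions made for illustration. It only shows the idea that --samples selects which values of the partition field are kept.

```python
def parse_samples(option):
    """Split a comma-separated --samples value into a set of identifiers."""
    return {name.strip() for name in option.split(",") if name.strip()}


def keep_read_groups(read_groups, field, samples_option):
    """Return the IDs of read groups whose `field` value is listed in samples_option.

    `field` stands for the RG tag selected by --partition (e.g. "SM" or "LB").
    An empty option means "no filtering" in this sketch (an assumption).
    """
    wanted = parse_samples(samples_option)
    if not wanted:
        return [rg["ID"] for rg in read_groups]
    return [rg["ID"] for rg in read_groups if rg.get(field) in wanted]


# Hypothetical @RG header lines, already parsed into dictionaries.
read_groups = [
    {"ID": "rg1", "SM": "patientA", "LB": "lib1"},
    {"ID": "rg2", "SM": "patientA", "LB": "lib2"},
    {"ID": "rg3", "SM": "patientB", "LB": "lib3"},
]

# Partition on SM, keep one sample.
kept_by_sample = keep_read_groups(read_groups, "SM", "patientA")
# Partition on LB, keep two libraries.
kept_by_library = keep_read_groups(read_groups, "LB", "lib1,lib3")
```

Reads belonging to read groups that are not returned would simply be ignored; no extra figure is produced per identifier.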
Agreed, users could also prefilter their data to the read groups they want with e.g. samtools view. Thanks for the clarification!
Thanks @lindenb - so in terms of partition options: samples means match on SM, library means match on LB, platform means match on PL.
What do sample_by_platform, sample_by_platform_by_center, any and readgroup mean?
same logic as GATK: https://gatk.broadinstitute.org/hc/en-us/articles/360051307491-DepthOfCoverage-BETA-#--partition-type
sample_by_platform: RG/SM + RG/PL
sample_by_platform_by_center: RG/SM + RG/PL + RG/CN
readgroup: RG/ID
any: everything
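The mapping above can be sketched in Python as a lookup from partition name to the RG tags that form the grouping key. This is an illustration only, not how jvarkit or GATK implement it: the option spellings are copied from this thread and should be checked against the tool's help, the read groups are made up, and treating "any" as a single shared group is an assumption about what "everything" means.

```python
# RG tags combined for each partition type, as listed in this thread.
PARTITION_TAGS = {
    "samples": ("SM",),
    "library": ("LB",),
    "platform": ("PL",),
    "sample_by_platform": ("SM", "PL"),
    "sample_by_platform_by_center": ("SM", "PL", "CN"),
    "readgroup": ("ID",),
    "any": (),  # assumption: no tags, so every read group shares one key
}


def partition_key(read_group, partition):
    """Build the grouping key for one parsed @RG line."""
    tags = PARTITION_TAGS[partition]
    # Missing tags become None rather than raising (a choice made for this sketch).
    return tuple(read_group.get(tag) for tag in tags)


def group_read_groups(read_groups, partition):
    """Map each partition key to the list of read group IDs that share it."""
    groups = {}
    for rg in read_groups:
        groups.setdefault(partition_key(rg, partition), []).append(rg["ID"])
    return groups


# Hypothetical read groups: one sample sequenced on two platforms at two centers.
read_groups = [
    {"ID": "rg1", "SM": "s1", "PL": "ILLUMINA", "CN": "c1"},
    {"ID": "rg2", "SM": "s1", "PL": "ONT", "CN": "c1"},
    {"ID": "rg3", "SM": "s1", "PL": "ILLUMINA", "CN": "c2"},
]

by_sample_platform = group_read_groups(read_groups, "sample_by_platform")
by_sample_platform_center = group_read_groups(read_groups, "sample_by_platform_by_center")
everything = group_read_groups(read_groups, "any")
```

With a single sample per BAM, as in the pathogen genomics use case, most of these partitions collapse to one group, which is why the default is usually enough.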
Subject of the issue
I am writing a Galaxy wrapper for jvarkit WGSCoveragePlotter but I am confused about how to use --samples and --partition. Do you perhaps have some sample data for which these flags are appropriate? My main use case is in pathogen genomics, where I typically have a single sample in a BAM, so I am holding back on including support for these flags for now, but would like to include them in the future.
Your environment
${JAVA_HOME}
(not used)