Open TCLamnidis opened 1 week ago
Something like:
x = "potato.bed;banana.bed;tomato.bed"
y = Channel.of(x)
    .flatMap { x ->
        def y = x.split(';')
        y
    }
    .view()
potato.bed
banana.bed
tomato.bed
These could then be fed into genotyping separately, each producing its own genotypes, or be concatenated to produce one superset?
It might be nice to be able to genotype on multiple SNP sets in a single run. I'm specifically thinking of pileupcaller here, not sure how it would apply to other genotypers, but:
Currently, the reference sheet takes one `pileupcaller_{bed,snp}` per reference. That means that if one wanted to genotype on two sets of positions, they would need to run the entire pipeline twice, or duplicate a row in the reference sheet just for that additional genotyping. Now, since the latter option will not fly with the ref-sheet validation, one would have to "fake" an entire new reference, thus duplicating all the processing, just for the extra genotypes.

Solution: Maybe we can turn the `pileupcaller_bed`/`pileupcaller_snp` columns into list columns, e.g. multiple files separated by `;`, that would then get split into separate channel elements with the same meta, and thus only duplicate the genotyping step?
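
To make the "same meta" part concrete, here is a rough sketch of how the split could carry the reference meta along. The channel names (`ch_reference`, `ch_bed_per_snpset`) and the `[id: 'ref1']` meta map are made up for illustration and are not taken from the pipeline; it also uses `tokenize(';')` rather than `split(';')` only so that the closure returns a List.

```groovy
// Hypothetical reference-sheet row: a meta map plus a ';'-separated list of BED files
ch_reference = Channel.of(
    [ [id: 'ref1'], 'potato.bed;banana.bed;tomato.bed' ]
)

ch_bed_per_snpset = ch_reference
    .flatMap { meta, beds ->
        // Build one [meta, bed] pair per file; flatMap then emits each pair
        // as its own channel element, all sharing the same meta
        beds.tokenize(';').collect { bed -> [ meta, bed ] }
    }

ch_bed_per_snpset.view()
```

Only the genotyping process would consume `ch_bed_per_snpset`, so everything upstream would still run once per reference. The `.snp` column would presumably need the same treatment, paired up with its matching `.bed`.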