Open ctb opened 3 years ago
it would also be nice to support explicit naming from filename, and/or basename, and/or maybe even accession from a CSV of some sort.
I REALLY like adding a `name` option in the param string!
template vars seem very handy, but maybe also dangerous?
name from csv is what I end up doing via snakemake, so doing it natively would be neat :)
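for reference, a rough sketch of what CSV-based naming could look like in Python. This assumes a hypothetical two-column CSV with `filename` and `name` headers; the column names and the `load_names` helper are made up for illustration and are not an existing sourmash interface.

```python
import csv
import io
import os


def load_names(csv_file):
    """Map each file's basename to the signature name listed in the CSV.

    Assumes (hypothetically) a header row with 'filename' and 'name' columns.
    """
    names = {}
    for row in csv.DictReader(csv_file):
        names[os.path.basename(row["filename"])] = row["name"]
    return names


# Illustrative input only; a real run would open a CSV file on disk instead.
example = io.StringIO(
    "filename,name\n"
    "genomes/sample1.fa,sample one\n"
    "genomes/sample2.fa,sample two\n"
)
names = load_names(example)
```

keying on the basename means the lookup works no matter which directory the sequence files are sketched from, at the cost of requiring basenames to be unique.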
also, see @taylorreiter comment in https://github.com/dib-lab/sourmash/pull/1283/files#r572495952 -
docs say:
> You can also stream any of these formats into `sourmash sketch` via stdin by using `-` as the input filename.

@taylorreiter -
> Yes, that's true, but then the name of the sig is recorded as `-`, which is really confusing when you compare a bunch of files.
> Also, should there be an example for how to do this?
also, see @taylorreiter comment in https://github.com/dib-lab/sourmash/pull/1283/files#r572495952 -
docs say:
> You can also stream any of these formats into `sourmash sketch` via stdin by using `-` as the input filename.

@taylorreiter -
> Yes, that's true, but then the name of the sig is recorded as `-`, which is really confusing when you compare a bunch of files.

fixed in #1347 - name/filename is now empty.

> Also, should there be an example for how to do this?

added in 2ac0b967!
we could allow `sourmash sketch` to take `name=` in param strings, e.g. `sourmash sketch dna -p k=31,name='cool name, luke'`
rationale: when writing up the docs for `sourmash sketch` per https://github.com/dib-lab/sourmash/pull/1283#pullrequestreview-586095844, I realized that I had done signature naming the way I had because of limitations imposed by `sourmash compute`: to wit, that we could only specify one name on the command line for all the signatures being created.
However, with `sourmash sketch`, we create different signatures for each param string.
In a major scope expansion of this issue, we could also allow template variables like `{header}` and `{len}` to be used, to be interpreted by Python for each sequence...
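a minimal sketch of how the template idea might work, assuming `{header}` means the sequence's FASTA header and `{len}` its length. The `render_name` helper and the allowlist are hypothetical, not part of sourmash. Only allowlisted fields are substituted, so a template cannot reach into object attributes the way an unrestricted `str.format` call could.

```python
import string

# Hypothetical allowlist: the two fields mentioned in this issue.
ALLOWED_FIELDS = {"header", "len"}


def render_name(template, header, sequence):
    """Expand {header} and {len} in a name template for one sequence."""
    values = {"header": header, "len": len(sequence)}
    parts = []
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    for literal, field, spec, conv in string.Formatter().parse(template):
        parts.append(literal)
        if field is None:
            continue  # trailing literal text with no replacement field
        # Reject unknown fields, attribute/index access, specs and conversions.
        if field not in ALLOWED_FIELDS or spec or conv:
            raise ValueError(f"unsupported template field: {field!r}")
        parts.append(str(values[field]))
    return "".join(parts)


name = render_name("{header} ({len} bp)", "seq1 some organism", "ACGTACGT")
```

with this approach, a template such as `{header.__class__}` raises `ValueError` instead of being evaluated.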