Closed jkbonfield closed 3 months ago
Thank you for this! Can you update the documentation as well please, doc/bcftools.txt?
So this change would, for example, allow -W=tbi
to work as a short option. It's not really the way short options are meant to be used with optional arguments however, but I feel it's less likely to be a source of confusion and questions than going with the official nomenclature of -Wtbi
(which still works).
It simplifies the usage too, as we wouldn't need to split it into -W[FMT], --write-index[=FMT]
.
It appears the frequent strategy though, if we want both long and short opts, is normally not to make them have optional args and have two options. Eg -W, --write-index
plus --index-fmt FMT
.
Thoughts?
diff --git a/vcfview.c b/vcfview.c
index c53ca1d0..43845af7 100644
--- a/vcfview.c
+++ b/vcfview.c
@@ -550,7 +550,7 @@ static void usage(args_t *args)
fprintf(stderr, " -u/U, --uncalled/--exclude-uncalled Select/exclude sites without a called genotype\n");
fprintf(stderr, " -v/V, --types/--exclude-types LIST Select/exclude comma-separated list of variant types: snps,indels,mnps,ref,bnd,other [null]\n");
fprintf(stderr, " -x/X, --private/--exclude-private Select/exclude sites where the non-reference alleles are exclusive (private) to the subset samples\n");
- fprintf(stderr, " --write-index[=FMT] Automatically index the output files [off]\n");
+ fprintf(stderr, " -W, --write-index[=FMT] Automatically index the output files [off]\n");
fprintf(stderr, "\n");
exit(1);
}
@@ -617,11 +617,11 @@ int main_vcfview(int argc, char *argv[])
{"phased",no_argument,NULL,'p'},
{"exclude-phased",no_argument,NULL,'P'},
{"no-version",no_argument,NULL,8},
- {"write-index",optional_argument,NULL,10},
+ {"write-index",optional_argument,NULL,'W'},
{NULL,0,NULL,0}
};
char *tmp;
- while ((c = getopt_long(argc, argv, "l:t:T:r:R:o:O:s:S:Gf:knv:V:m:M:aAuUhHc:C:Ii:e:xXpPq:Q:g:",loptions,NULL)) >= 0)
+ while ((c = getopt_long(argc, argv, "l:t:T:r:R:o:O:s:S:Gf:knv:V:m:M:aAuUhHc:C:Ii:e:xXpPq:Q:g:W::",loptions,NULL)) >= 0)
{
char allele_type[9] = "nref";
switch (c)
@@ -750,7 +750,7 @@ int main_vcfview(int argc, char *argv[])
break;
case 9 : args->n_threads = strtol(optarg, 0, 0); break;
case 8 : args->record_cmd_line = 0; break;
- case 10 :
+ case 'W' :
if (!(args->write_index = write_index_parse(optarg)))
error("Unsupported index format '%s'\n", optarg);
break;
diff --git a/version.c b/version.c
index ea112f60..cf2c6d51 100644
--- a/version.c
+++ b/version.c
@@ -119,9 +119,9 @@ int write_index_parse(char *arg) {
int fmt = HTS_FMT_CSI;
if (arg) {
- if (strcmp(arg, "csi") == 0)
+ if (strcmp(arg, "csi") == 0 || strcmp(arg, "=csi") == 0)
fmt = HTS_FMT_CSI;
- else if (strcmp(arg, "tbi") == 0)
+ else if (strcmp(arg, "tbi") == 0 || strcmp(arg, "=tbi") == 0)
fmt = HTS_FMT_TBI;
else
return 0;
Ok I added -W in there too.
I'm unsure if we want to support -W=tbi
as well as -Wtbi
, but personally I dislike the latter and prefer the former as it shows consistency with long options and is distinct from the mandatory option versions. Either way, it's a trivial couple lines to tweak in version.c so I'll let you decide (and a whole raft of search and replace in the docs).
Also fixed a couple other doc issues in the process.
Note I didn't change the usage statements to be more verbose. IMO it doesn't need it, and I dislike huge long lines (some of them are 160 characters, which breaks most unix conventions). I don't think you'd like me line-wrapping them either, as we've had that discussion in the past. So again, if you want that doing, I'll let you make that editorial decision at merge.
It appears the frequent strategy though, if we want both long and short opts, is normally not to make them have optional args and have two options. Eg -W, --write-index plus --index-fmt FMT.
My two cents would be to prefer this approach. But I appreciate my two cents should have very low market value. Just providing a user perspective...
It appears the frequent strategy though, if we want both long and short opts, is normally not to make them have optional args and have two options. Eg
-W, --write-index
plus--index-fmt FMT
.
Optional arguments are unexplored territory in bcftools and I don't know how likely they are to cause problems in practice. Adding the new --index-fmt
option seems like a more conservative and safer approach, but I don't insist.
This adds a new function
init_index2
with an additional index format field, which is set by another new functionwrite_index_parse
. I could have modified the existinginit_index
function, but I was unsure if this is meant to be a public API or not. Eg do people write their own plugins and use this? If not, we could just rename it back.The PR makes
--write-index
take an optional--write-index[=FMT]
syntax, eg--write-index=tbi
. As this is a long option, the optional argument is easy to define. (They don't work well on short opts, being space concious.) The default of not specified =FMT is whatever the tool did before. This is almost always CSI, except for isec which is TBI if VCF. I don't know why that single command has a different default, but I didn't want to second guess the logic so I left it unchanged.I tested all subcommands capable of writing indices, including plugins (although just fill-tags for the generic writing in plugin.c as it's shared by many). I also fixed a number of bugs, not all index related.
bcftools +scatter -n 100 --write-index
only generated an index for ther last file.bcftools isec --write-index
only worked when outputting a single file. It now honours the write-index when running in directory mode too, but as mentioned above for some reason single-file isec was always tbi so I kept this behaviour intact.bcftools filter -i 'type=INDEL'
no longer core dumps. (The correct syntax is type="INDEL")Fixes #2008