Open magicDGS opened 6 years ago
@magicDGS In addition to plugin framework changes, I think this would require parser changes. Currently the parser assumes all command line argument names are unique and independent (in the sense that command line order doesn't matter). This would require some kind of name qualification or grouping mechanism so that `--int-filter-tag NM --int-filter-tag-value 2` would target one instance of a class, and `--int-filter-tag AS --int-filter-tag-value 3` would target a different instance.
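To make the grouping idea concrete, here is a minimal sketch of one possible order-sensitive scheme, where each occurrence of the plugin argument opens a new group and the arguments that follow belong to it. This is not how the current parser works; `ArgumentGrouper`, `Group`, and `group` are hypothetical names, and the sketch assumes every argument is a simple `--name value` pair.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of GATK or its parser. */
final class ArgumentGrouper {

    /** One occurrence of the plugin argument plus the arguments that followed it. */
    static final class Group {
        final String pluginName;
        final Map<String, String> arguments = new LinkedHashMap<>();

        Group(String pluginName) {
            this.pluginName = pluginName;
        }
    }

    /**
     * Splits the command line into groups. Each occurrence of pluginArg starts a
     * new group, so argument order matters (unlike the current parser).
     * Assumption: args is a flat sequence of "--name value" pairs.
     */
    static List<Group> group(String pluginArg, String[] args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("Expected name/value pairs");
        }
        List<Group> groups = new ArrayList<>();
        Group current = null;
        for (int i = 0; i < args.length; i += 2) {
            String name = args[i];
            String value = args[i + 1];
            if (name.equals(pluginArg)) {
                // A new plugin occurrence opens a new instance scope.
                current = new Group(value);
                groups.add(current);
            } else if (current == null) {
                throw new IllegalArgumentException(name + " appears before any " + pluginArg);
            } else {
                // Arguments bind to the most recently named plugin instance.
                current.arguments.put(name, value);
            }
        }
        return groups;
    }
}
```

With the command line from this issue, `group("--myReadFilter", args)` would yield two groups for `IntegerTagReadFilter`, one holding the `NM`/`2` pair and one holding the `AS`/`3` pair.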
I see the problem with that, and I don't have any concrete solution. One idea is to use plugins as tagged arguments or something similar. For example, `--myReadFilter IntegerTagReadFilter:NM=2 --myReadFilter IntegerTagReadFilter:AS=3`. This might require changing how plugins are handled and providing a way to populate `@Argument` fields from tag-like strings and to show them that way in the CLI help.
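As a rough illustration of the tagged form, the sketch below splits a value such as `IntegerTagReadFilter:NM=2` into a plugin name and its key/value tags. `TaggedPluginSpec` is a hypothetical class, not an existing GATK API, and the comma as a separator between multiple tags is my own assumption; mapping the parsed tags onto `@Argument` fields is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for "PluginName:key=value[,key=value...]"; illustration only. */
final class TaggedPluginSpec {
    final String pluginName;
    final Map<String, String> tags;

    private TaggedPluginSpec(String pluginName, Map<String, String> tags) {
        this.pluginName = pluginName;
        this.tags = tags;
    }

    static TaggedPluginSpec parse(String spec) {
        int colon = spec.indexOf(':');
        if (colon < 0) {
            // No tags: behaves like a plain plugin name.
            return new TaggedPluginSpec(spec, new LinkedHashMap<>());
        }
        if (colon == 0) {
            throw new IllegalArgumentException("Missing plugin name in '" + spec + "'");
        }
        String name = spec.substring(0, colon);
        Map<String, String> tags = new LinkedHashMap<>();
        // Assumption: multiple tags are comma-separated.
        for (String pair : spec.substring(colon + 1).split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0 || eq == pair.length() - 1) {
                throw new IllegalArgumentException("Malformed tag '" + pair + "' in '" + spec + "'");
            }
            tags.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return new TaggedPluginSpec(name, tags);
    }
}
```

Because each `--myReadFilter` value carries its own tags, two values naming the same class parse into two independent specs, which is what would let the descriptor create one instance per occurrence.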
If it's mostly about the summary counts, it might be easier to explore ways to allow read filters to have custom counting and summary display behavior.
@cmnbroad - I gave that as an example, but I am implementing other plugins to compute statistics from reads. As a simple example, counting separately the number of reads with NM=2, NM=3 and AS=4 per window; as I said, I can always implement this with `List` arguments, but I would rather have an instance of each of them and a common implementation of a simple "counter" for a tag-value pair.
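What I mean by a common "counter" implementation is roughly the following sketch. To keep it self-contained, a read is modeled here as a plain `Map<String, Integer>` of tag values rather than a real GATK read, and `TagValueCounter` is a hypothetical name.

```java
import java.util.Map;

/** Hypothetical per-instance counter for one tag-value pair; illustration only. */
final class TagValueCounter {
    private final String tag;
    private final int value;
    private long count = 0;

    TagValueCounter(String tag, int value) {
        this.tag = tag;
        this.value = value;
    }

    /**
     * Returns true, and increments this instance's count, if the read carries
     * the configured tag with the configured value.
     * Assumption: a read is represented only by its integer tags.
     */
    boolean test(Map<String, Integer> readTags) {
        Integer observed = readTags.get(tag);
        boolean matches = observed != null && observed == value;
        if (matches) {
            count++;
        }
        return matches;
    }

    long getCount() {
        return count;
    }

    /** One summary line per instance, e.g. for a per-window report. */
    String summary() {
        return tag + "=" + value + ": " + count;
    }
}
```

With one instance per pair (NM=2, NM=3, AS=4), each instance keeps its own count, whereas a single class configured through `List` arguments has to do that bookkeeping internally.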
The normal use case for plugin descriptors in GATK is to have a single instance of each class to apply to the data, and providing the same class twice is not allowed (e.g., read/variant filters). Nevertheless, there are some cases where it makes sense to have two different instances of the same class applied to the data with different arguments, which I cannot find a way to implement with the current system.
As an example, let's say that we would like to implement a `ReadFilter` plugin different from GATK's, and a filter called `IntegerTagReadFilter` with two arguments: `--int-filter-tag` (String) and `--int-filter-tag-value` (Integer). Thus, the user should be able to provide the following command line: `--myReadFilter IntegerTagReadFilter --int-filter-tag NM --int-filter-tag-value 2 --myReadFilter IntegerTagReadFilter --int-filter-tag AS --int-filter-tag-value 3`.

This can be implemented as a single filter with the arguments specified as `List<String>` and `List<Integer>`, following the same implementation as GATK's `ReadFilter`. Nevertheless, this is not desirable in this case because we would like to keep every instance separate, to be able to count the number of reads in each filter.

This is just a toy example, but it does not look like the plugin system allows this kind of implementation. Maybe I am missing something about it...