Closed Bernadetadad closed 7 months ago
@Bernadetadad, the output CSVs for the summaries (note there is also a per-sera output) have the times_seen and min_frac_models filters applied, as well as any other filters in the le_filters specification. These are construed as filters on data quality, and are applied to all data before writing the CSVs and making any plots (at least, that should be what happens, and looking at the code I think it is; raise an issue if not).
But the output does not apply things like the init_min_value sliders for various properties: those sliders set what is shown by default, but do not filter what is in the CSV, since the plot can be re-adjusted to show those values.
So right now there is a distinction between hard filters applied to the data before it even goes into the plot, and the sliders.
It may be possible to apply filters like the functional scores one as hard filters using le_filters. Do you want me to look into that and post an example?
Note also (possibly related) that I have an issue open to create more than one summary plot so you can have multiple differently configured ones. If that is a priority that would help you, make a comment in that issue and I will try to prioritize it.
I'm going to close this issue. I think the current behavior is correct, and not really a bug. That is because there is a conceptual difference between filters that remove low-quality measurements, and mutations that are deleterious to cell entry. We show them differently on the plots: the former are shown as missing, the latter are grayed out.
I think if you want to remove them from the CSV, write a separate rule to do that.
Re-open and explain if you disagree, @Bernadetadad
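For anyone who does want the CSV itself filtered, the "separate rule" suggested above could call a small script along these lines. This is only a sketch, not part of the pipeline: the function name filter_summary, the column name entry_293T, and the -3 cutoff are illustrative assumptions, so check the actual column headers in your summary.csv. Rows with a blank functional score are kept here, on the assumption that a missing value is not evidence of a deleterious mutation.

```python
import csv
import io


def filter_summary(in_handle, out_handle, effect_col, min_effect):
    """Copy CSV rows whose value in `effect_col` is >= `min_effect`.

    Rows where `effect_col` is blank are kept (assumption: a missing
    functional score should not cause a row to be dropped).
    Returns the number of rows written.
    """
    reader = csv.DictReader(in_handle)
    writer = csv.DictWriter(
        out_handle, fieldnames=reader.fieldnames, lineterminator="\n"
    )
    writer.writeheader()
    kept = 0
    for row in reader:
        value = row[effect_col].strip()
        if value == "" or float(value) >= min_effect:
            writer.writerow(row)
            kept += 1
    return kept


if __name__ == "__main__":
    # Hypothetical data; the column names are placeholders.
    src = io.StringIO(
        "mutation,entry_293T,escape\n"
        "A1C,-0.5,1.2\n"
        "A1D,-4.0,0.8\n"
    )
    dst = io.StringIO()
    filter_summary(src, dst, effect_col="entry_293T", min_effect=-3.0)
    print(dst.getvalue())
```

With real files, the two handles would be opened with open(path, newline="") for reading and writing, and the script would run as a downstream step on the existing summary CSV so the unfiltered output stays available.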
I thought summary.csv was supposed to have filtered data based on the params set in summaries_config.yml.
But, for example, here a summary.csv file from the flu repo still has some sera selection values for mutations that have a functional score lower than -3 in 293T cells (even though in the plots those mutations are greyed out). This is the case for summaries in other repos as well. Am I misunderstanding what's supposed to be in the summary.csv file?