zavolanlab / mirflowz

Snakemake workflow for the mapping and quantification of miRNAs and isomiRs from miRNA-Seq libraries.
MIT License

feat: modify ASCII-style alignment pileups aesthetics #145

Open deliaBlue opened 2 months ago

deliaBlue commented 2 months ago

Allow the aesthetics of the output ASCII-style alignment pileups to be modified through the following parameters:

--sort-by: specify the sort type, either "position" or "counts". Default: "position".

--reverse-sort: reverse the sort order (from "from-left-to-right" to "from-right-to-left" for sort type "position" and from "descending" to "ascending" for sort type "counts")

--min-count: minimum count for a sequence to be kept. Default: 0

--max-sequences: maximum number of sequences to be displayed, keeping those with the highest counts. Default: 30.

--keep-all: keep a pileup even if it has no mapped sequences.
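As a rough illustration of the options listed above, here is a minimal sketch of how they could be declared with Python's `argparse`. The function name `build_parser`, the help texts and the use of `argparse` itself are assumptions made for this example, not the actual interface of the MIRFLOWZ pileup script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the proposed options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Adjust the aesthetics of ASCII-style alignment pileups."
    )
    parser.add_argument(
        "--sort-by",
        choices=["position", "counts"],
        default="position",
        help="Sort type. Default: %(default)s",
    )
    parser.add_argument(
        "--reverse-sort",
        action="store_true",
        help="Reverse the sort order.",
    )
    parser.add_argument(
        "--min-count",
        type=int,
        default=0,
        help="Minimum count for a sequence to be kept. Default: %(default)s",
    )
    parser.add_argument(
        "--max-sequences",
        type=int,
        default=30,
        help="Maximum number of sequences displayed. Default: %(default)s",
    )
    parser.add_argument(
        "--keep-all",
        action="store_true",
        help="Keep a pileup even if it has no mapped sequences.",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--sort-by", "counts", "--min-count", "5"])
```

Options that are not passed fall back to the defaults proposed in the list, so `args.max_sequences` stays at 30 here.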

Right now, MIRFLOWZ runs the pileup script up to three times: once per library, once per run and (if specified) once per group. So that the user can set custom thresholds for each of these runs, the configuration file will include two new parameters in the form of dictionaries: min_counts and max_sequence. Both dictionaries will have the keys per_lib, per_run and per_group, and their default values will be the ones given above for --min-count and --max-sequences, respectively.
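The two dictionaries could look roughly as follows once the configuration is loaded. This is only a sketch: the dictionary names and keys come from the proposal above, while the helper `pileup_thresholds`, the `LEVELS` tuple and the fallback behaviour for missing keys are assumptions made for illustration.

```python
from typing import Dict, Tuple

# The three levels at which the pileup script is run.
LEVELS = ("per_lib", "per_run", "per_group")

# Hypothetical configuration content, using the proposed defaults.
config: Dict[str, Dict[str, int]] = {
    "min_counts": {"per_lib": 0, "per_run": 0, "per_group": 0},
    "max_sequence": {"per_lib": 30, "per_run": 30, "per_group": 30},
}


def pileup_thresholds(
    cfg: Dict[str, Dict[str, int]], level: str
) -> Tuple[int, int]:
    """Return (min_count, max_sequences) for one pileup level.

    Missing entries fall back to the proposed defaults (0 and 30).
    """
    if level not in LEVELS:
        raise ValueError(f"Unknown pileup level: {level!r}")
    min_count = cfg.get("min_counts", {}).get(level, 0)
    max_sequences = cfg.get("max_sequence", {}).get(level, 30)
    return min_count, max_sequences
```

Each of the three pileup rules could then look up its own pair of values, for example `pileup_thresholds(config, "per_run")`, and pass them on as --min-count and --max-sequences.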

uniqueg commented 2 months ago

Given that you have two separate scripts for the ASCII and HTML generation, I tend to think that the ASCII script should always produce complete, unfiltered outputs. Instead, filtering could be applied during the HTML generation. That way, we separate concerns: one script generates the pileups, the other creates publication-style visualizations from them.
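To make the suggested separation concrete, here is a hypothetical sketch of a filtering and sorting step that the visualization side could apply to records read from a complete pileup. The `Read` record, the function `select_reads`, the inclusive `>=` threshold and the order of operations (filter, keep the top counts, then sort) are all assumptions; neither script's real data model is shown in this thread.

```python
from typing import List, NamedTuple


class Read(NamedTuple):
    """One aligned sequence of a pileup (hypothetical record layout)."""
    start: int
    sequence: str
    count: int


def select_reads(
    reads: List[Read],
    sort_by: str = "position",
    reverse_sort: bool = False,
    min_count: int = 0,
    max_sequences: int = 30,
) -> List[Read]:
    """Filter and order reads for display without touching the pileup file."""
    # Keep sequences that reach the minimum count (assumed inclusive).
    kept = [read for read in reads if read.count >= min_count]
    # Keep only the sequences with the highest counts.
    kept = sorted(kept, key=lambda r: r.count, reverse=True)[:max_sequences]
    if sort_by == "counts":
        # Descending by default, ascending when reversed.
        kept.sort(key=lambda r: r.count, reverse=not reverse_sort)
    else:
        # Left to right by default, right to left when reversed.
        kept.sort(key=lambda r: r.start, reverse=reverse_sort)
    return kept


reads = [Read(3, "ACGT", 10), Read(1, "TTGA", 2), Read(2, "GGCA", 7)]
by_position = select_reads(reads, min_count=5)
by_counts = select_reads(reads, sort_by="counts", max_sequences=2)
```

With this split, the ASCII script would write every sequence, and a selection step like this one would run only when the visualization is built.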