38 / d4-format

The D4 Quantitative Data Format
MIT License

d4tools stats - Show statistics across multiple regions #69

Open pontushojer opened 1 year ago

pontushojer commented 1 year ago

Hi,

I want to look at the mean expression over a number of regions defined in a BED file. Currently, d4tools stat with --region outputs the mean for each region separately. It would be nice to also get an additional line with the mean expression across all regions defined in the BED file. Something like:

chr1    10000   11014   98.38560157790927
chr1    11031   11058   0
chr1    11079   11453   33.86631016042781
chr1    15792   15854   10053.435483870968
chr1    16707   16749   1592.952380952381
chr1    19300   19448   2727.9189189189187
chr1    20823   20868   4395.555555555556
chr1    26448   26470   31438
chr1    29739   29797   5912.258620689655
total                                  103.0920139090

This does, however, make the output a bit inconsistent. Another option would be to add a separate flag that outputs only the total, something like --across-regions. What do you think?
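
For reference, the value I am after can be computed from the current per-region output with a small post-processing step. The following is only a minimal sketch under a few assumptions: the output has whitespace-separated columns (chrom, start, end, mean), the "total" is meant as the per-base mean, i.e. each region's mean weighted by its length (end - start), and the function name `mean_across_regions` is hypothetical and not part of d4tools.

```python
def mean_across_regions(lines):
    """Length-weighted mean over lines of the form 'chrom start end mean'.

    Assumption: regions are half-open BED intervals, so length = end - start.
    """
    weighted_sum = 0.0
    total_length = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        start, end, mean = int(fields[1]), int(fields[2]), float(fields[3])
        length = end - start
        weighted_sum += mean * length  # recover the per-region sum of values
        total_length += length
    # No bases covered: there is no meaningful mean to report
    return weighted_sum / total_length if total_length else float("nan")
```

It could be called on the saved per-region output, e.g. `mean_across_regions(open("per_region.tsv"))`, where `per_region.tsv` is a placeholder file name. If an unweighted mean of the per-region means is wanted instead, the weighting by `length` would need to be dropped, which is one more reason a built-in option with clear semantics would be useful.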