Ensembl / WiggleTools

Basic operations on the space of numerical functions defined on the genome using lazy evaluators for flexibility and efficiency
Apache License 2.0

gt not working properly after bin and scale #55

Status: Closed (gevro closed this issue 3 years ago)

gevro commented 3 years ago

Hi, this command is not working: it gives unexpected, incorrect results. It should sum in bins of 20 bp, then scale by 0.05, then keep only the bins with a value > 0.5, but that is not what happens. Thanks.

wiggletools bin 20 scale 0.05 gt 0.5 file.bw

Perhaps it is not operating on the binned output and is still operating on the wiggle output? If so, it might be nice to have a way to operate on binned output in iterator chains.

dzerbino commented 3 years ago

Dear @gevro ,

If your intention is to run gt after bin and scale, then you should reverse the order of the commands:

wiggletools gt 0.5 scale 0.05 bin 20 file.bw

Regards,

Daniel
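
The reordering matters because, as the corrected command implies, the chain is read like nested function calls: the operator written first is applied last. Below is a minimal Python sketch of that reading. The functions `bin_sum`, `scale` and `gt` are toy stand-ins written for this illustration, not WiggleTools code, and the per-base values are made up; `0.0` stands in for "no region emitted".

```python
# Toy model: a track is a plain list of per-base (or per-window) values.
# These helpers are hypothetical stand-ins, not the WiggleTools implementation.

def bin_sum(width, xs):
    """Sum values in consecutive windows of `width` positions."""
    return [sum(xs[i:i + width]) for i in range(0, len(xs), width)]

def scale(factor, xs):
    """Multiply every value by `factor`."""
    return [x * factor for x in xs]

def gt(threshold, xs):
    """Boolean test: 1.0 where the value is strictly above threshold, else 0.0."""
    return [1.0 if x > threshold else 0.0 for x in xs]

# Three 20 bp windows with made-up per-base scores.
per_base = [1.0] * 12 + [0.0] * 8 + [0.4] * 20 + [1.0] * 8 + [0.0] * 12

# "bin 20 scale 0.05 gt 0.5 file": gt runs first, on the per-base values.
as_written = bin_sum(20, scale(0.05, gt(0.5, per_base)))

# "gt 0.5 scale 0.05 bin 20 file": bin runs first, gt runs last.
reordered = gt(0.5, scale(0.05, bin_sum(20, per_base)))

print([round(x, 6) for x in as_written])
print(reordered)
```

In the first ordering the threshold is applied to individual bases and the result is then averaged, so the output is the fraction of bases above 0.5 in each window. In the second ordering the threshold is applied to the window average, which is what the original question asked for.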

gevro commented 3 years ago

It's not working for some reason. See below. The gt command should simply behave like a filter.

No gt command:

$ wiggletools write_bg - bin 100 scale 0.01 fillIn blah.windows.bed k24.umap.sorted.bw | head  
chr1    0   100 0.137970
chr1    100 200 0.702080
chr1    200 300 0.210000
chr1    300 1800    0.000000
chr1    1800    1900    0.079950
chr1    1900    2300    0.000000
chr1    2300    2400    0.149580
chr1    2400    2500    0.180420
chr1    2500    2600    0.220080
chr1    2600    2800    0.000000

With gt command:

$ wiggletools write_bg - gt 0.5 bin 100 scale 0.01 fillIn blah.windows.bed k24.umap.sorted.bw | head
chr1    100 200 1.000000
chr1    14800   14900   1.000000
chr1    185000  185100  1.000000
chr1    203800  203900  1.000000
chr1    205200  205500  1.000000
chr1    205900  206000  1.000000
chr1    209900  210000  1.000000
chr1    212600  212800  1.000000
chr1    214600  214700  1.000000
chr1    215100  215200  1.000000

Perhaps the gt filter just outputs a value of 1.000000 for every region that passes the filter? If so, it would be helpful to add this to the documentation, and perhaps to offer an option that keeps the original values instead of a generic 1.000000.

dzerbino commented 3 years ago

Hello @gevro ,

Indeed, gt and lt are meant to return booleans, like < and > would do in most programming languages.

I have added more details in the README.

Cheers,

Daniel
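
To make the boolean behaviour concrete, here is a small Python sketch using the first three bins from the output above. It models gt as a mask and then shows one way to recover the original values by intersecting the mask with the unfiltered bins. The helper names `gt_mask` and `keep_values` are hypothetical and written for this illustration; the thread does not say which WiggleTools operator, if any, does the second step.

```python
# First three bins from the "No gt command" output: (chrom, start, end, value).
bins = [
    ("chr1", 0, 100, 0.137970),
    ("chr1", 100, 200, 0.702080),
    ("chr1", 200, 300, 0.210000),
]

def gt_mask(threshold, regions):
    """Toy model of gt: passing regions are emitted with value 1.0, others dropped."""
    return [(c, s, e, 1.0) for c, s, e, v in regions if v > threshold]

def keep_values(mask, regions):
    """Hypothetical helper: keep the original values of regions present in the mask."""
    passing = {(c, s, e) for c, s, e, _ in mask}
    return [r for r in regions if r[:3] in passing]

mask = gt_mask(0.5, bins)
filtered = keep_values(mask, bins)

print(mask)      # the boolean view, as in the "With gt command" output
print(filtered)  # the same region with its original value retained
```

The mask reproduces the first line of the gt output (chr1 100 200 1.000000), while the second step is the "filter" behaviour the question expected.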

gevro commented 3 years ago

Hi, I tried reinstalling with brew, but gte and lte are not there yet. Has it been updated on brew?

dzerbino commented 3 years ago

No, it's not updated yet, sorry.