Open willbradshaw opened 1 week ago
I've done this two different ways; both use intermediate pipeline outputs to compute the stats and add them to the HV hits table.
My code is a Quarto document using Bioconductor and tidyverse (it's private to the NAO because it uses non-public data; @harmonbhasin, please message me for a link). I imagine what we'd want to do is edit the corresponding (Python?) scripts for the HV workflow to use similar logic to add these three fields to the HV table.
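For illustration only, here is a rough Python sketch of the general shape that join could take: look up a per-read statistic computed from an intermediate output and attach it to each row of the hits table. Everything named here is an assumption rather than the pipeline's actual schema: the `seq_id` key, the `fragment_length` column, and both helper functions are hypothetical, and the sketch adds a single field instead of the three mentioned above.

```python
from statistics import median


def add_fragment_lengths(hits, lengths_by_id, field="fragment_length"):
    """Return copies of the hit rows with a fragment-length column added.

    `hits` is a list of dicts keyed by a hypothetical "seq_id" column;
    `lengths_by_id` maps read IDs to fragment lengths taken from some
    intermediate output. Reads with no recorded length get None.
    """
    out = []
    for row in hits:
        new_row = dict(row)  # copy so the input rows are not mutated
        new_row[field] = lengths_by_id.get(row["seq_id"])
        out.append(new_row)
    return out


def summarize_lengths(rows, field="fragment_length"):
    """Count and median of the non-missing fragment lengths."""
    values = [r[field] for r in rows if r[field] is not None]
    return {"n": len(values), "median": median(values) if values else None}


# Toy data, invented for this example.
hits = [{"seq_id": "read1"}, {"seq_id": "read2"}, {"seq_id": "read3"}]
lengths_by_id = {"read1": 152, "read2": 187}

annotated = add_fragment_lengths(hits, lengths_by_id)
summary = summarize_lengths(annotated)
```

Missing lengths are kept as `None` rather than dropped, so the annotated table keeps one row per hit and any filtering decision stays with downstream code.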
One of two new capabilities I'd like to add to our core pipeline is analysis of fragment lengths (the other being duplication analysis). As with duplication levels, there are several ways we could try doing this, and it's possible we'll want to do it multiple ways. We'll probably want input from @mikemc on the best approach.