uschwartz / nucMACC

Pipeline to call hyper and hypo accessible nucleosomes and nucleosomes with non-canonical structure based on differential MNase-seq data
MIT License

read depth question/issue #14

Open davhum opened 1 week ago

davhum commented 1 week ago

I am getting an error related to the get_nucMACC_scores.R script. The default depth filter hard-coded in this script is 30 reads, which I understand was chosen to keep only high-confidence data. Could this filter be exposed as an option so that it can be set to a different value? Some of the data sets I am analyzing are low depth (fewer than 10 million reads per sample), so all nucleosome positions are removed by the 30-read filter, which causes the script to fail. I understand that robust nucleosome positioning is likely to be difficult to obtain at this depth, but I was hoping the pipeline might still provide some insights. I would be interested in your thoughts on whether lowering this filter could still indicate if the samples are OK. That information would be valuable and might encourage the lab to acquire more sequencing data for low-input samples.
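To make the failure mode concrete, here is a toy Python sketch of a fixed minimum-read filter. It assumes the filter simply drops positions whose raw read count is below the threshold; the actual logic in get_nucMACC_scores.R may differ, and the function name, position names, and counts are all made up for illustration.

```python
# Toy illustration only: the positions and counts are invented, and the real
# get_nucMACC_scores.R may apply its filter differently.

def filter_by_depth(counts, min_reads=30):
    """Keep positions whose raw read count is at least min_reads."""
    return {pos: n for pos, n in counts.items() if n >= min_reads}

# Hypothetical raw read counts per nucleosome position in a low-depth sample.
low_depth_counts = {"nuc_1": 12, "nuc_2": 8, "nuc_3": 21, "nuc_4": 5}

kept_default = filter_by_depth(low_depth_counts)                # threshold of 30
kept_relaxed = filter_by_depth(low_depth_counts, min_reads=10)  # relaxed threshold

print(len(kept_default))     # 0
print(sorted(kept_relaxed))  # ['nuc_1', 'nuc_3']
```

With the threshold at 30, nothing survives in this toy sample, so any downstream step that expects at least one position has nothing to work on.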

uschwartz commented 1 week ago

Robust results from MNase-seq data really do require high read depth. We discuss this in more detail in the nucMACC paper (https://doi.org/10.1126/sciadv.adm9740, see Fig. 6). Nevertheless, in our experience, if we look at distinct sites (e.g. known TF binding sites) rather than globally and aggregate the signal (similar to Fig. 8), we do get meaningful results at lower read depth. We are therefore working on a stable release that features a read filter option and additional changes to make the pipeline more efficient.
We hope to release it soon. If you cannot wait, you can change raw.flt<-30 at line 34 of get_nucMACC_scores.R (or line 47 of get_sub-nucMACC_scores.R), which should do the job.
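For anyone who prefers to script this manual edit rather than open the file by hand, here is a minimal Python sketch. It assumes the script contains exactly one assignment of the form raw.flt<-30 (possibly with spaces around the arrow); the helper name set_read_filter is hypothetical and not part of the pipeline. It matches on the assignment itself rather than on the line number, since line numbers can shift between versions.

```python
import re

# Matches the assignment quoted in this thread, allowing optional whitespace.
_FILTER_RE = re.compile(r"^(\s*raw\.flt\s*<-\s*)\d+", flags=re.MULTILINE)

def set_read_filter(script_text, new_value):
    """Return script_text with the raw.flt assignment set to new_value.

    Hypothetical helper: raises ValueError unless exactly one assignment
    is found, so an unexpected script layout is not silently patched.
    """
    if new_value < 0:
        raise ValueError("new_value must be non-negative")
    patched, n_found = _FILTER_RE.subn(
        lambda m: f"{m.group(1)}{new_value}", script_text
    )
    if n_found != 1:
        raise ValueError(f"expected one raw.flt assignment, found {n_found}")
    return patched

# Stand-in for the script contents; only the raw.flt line comes from the thread.
example = "x <- 1\nraw.flt<-30\ny <- 2\n"
print(set_read_filter(example, 10))
```

To apply it to a real file, read the script into a string, pass it through set_read_filter, and write the result to a copy so that the original stays untouched.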