MiraldiLab / maxATAC

Transcription Factor Binding Prediction from ATAC-seq and scATAC-seq with Deep Neural Networks
Apache License 2.0

Best value to use for thresholds #91

Closed: tacazares closed this issue 1 year ago

tacazares commented 2 years ago

Several months ago I ran a pilot study on how the precision of our prediction thresholds affects performance. Rounding the thresholds to different numbers of decimal places changes the maximum recall and the AUPR of our models. The differences are not large, but they are enough to shift the reported recall.

The results show the changes in performance for CTCF based on rounding the threshold.

[Screenshot: Screen Shot 2022-02-16 at 10 44 02 PM]
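To make the effect concrete, here is a minimal sketch of how rounding can move the AUPR. It is not the maxATAC benchmarking code: it assumes the prediction scores are rounded and every distinct rounded value is then used as a threshold, and the `scores`, `labels`, `pr_points`, and `aupr` names and the toy numbers are made up for illustration.

```python
from typing import List, Tuple

# Toy data (hypothetical): predicted scores and gold-standard labels.
scores = [0.912, 0.908, 0.554, 0.551, 0.548, 0.123, 0.118, 0.052]
labels = [1, 0, 1, 0, 1, 0, 1, 0]


def pr_points(scores: List[float], labels: List[int],
              decimals: int) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for each distinct rounded score.

    Assumption: scores are rounded to `decimals` places and each distinct
    rounded value is used as a threshold (score >= threshold is a positive call).
    """
    rounded = [round(s, decimals) for s in scores]
    total_pos = sum(labels)
    points = []
    # Walk thresholds from strictest to most permissive so recall is non-decreasing.
    for t in sorted(set(rounded), reverse=True):
        tp = sum(1 for r, y in zip(rounded, labels) if r >= t and y == 1)
        fp = sum(1 for r, y in zip(rounded, labels) if r >= t and y == 0)
        points.append((t, tp / (tp + fp), tp / total_pos))
    return points


def aupr(points: List[Tuple[float, float, float]]) -> float:
    """Step-wise area under the precision-recall curve (average precision)."""
    area, prev_recall = 0.0, 0.0
    for _, precision, recall in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


if __name__ == "__main__":
    for decimals in (1, 2, 3):
        pts = pr_points(scores, labels, decimals)
        print(decimals, len(pts), aupr(pts))
```

With coarser rounding, nearby scores collapse into the same threshold, so the curve has fewer points and ties between positives and negatives are resolved together. That is where the change in AUPR comes from in this toy setup.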

emiraldi commented 2 years ago

Given the quality of our gold standard, I think 4-5 decimal places is reasonable, maybe 5 to be safe. I'd only trust our precision, recall, and AUPR values to the hundredths place anyway, e.g. only report AUPR = .45...
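A small sketch of that convention, assuming thresholds are kept at 5 decimal places internally and metrics are reported to 2. The helper names `quantize_threshold` and `format_metric` and the input values are hypothetical, not part of maxATAC.

```python
def quantize_threshold(threshold: float, decimals: int = 5) -> float:
    """Round a threshold to a fixed number of decimal places (5 by default)."""
    return round(threshold, decimals)


def format_metric(value: float) -> str:
    """Format precision, recall, or AUPR to the hundredths place for reporting."""
    return f"{value:.2f}"


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(quantize_threshold(0.123456789))
    print("AUPR =", format_metric(0.453127))
```

Keeping the two precisions separate means the thresholds stay fine enough not to distort the curve, while the reported numbers do not imply more certainty than the gold standard supports.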