Another thought, which might be out of scope but would help with interpretation: many of these motifs are very similar (e.g., GATA and SP1), which might lead a researcher to think there is a ‘family’ effect when it is really just ambiguity due to sequence similarity. Possibly add something along the lines of a similarity distance between the motifs? There are a few ways to calculate this, and I know it would take some looking into, but just a thought.
This can be solved with HOMER's compareMotifs.pl script, which uses Pearson correlation as its similarity measure. I tested it and it works well, so this could be implemented pretty easily.
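For illustration only, here is a rough Python sketch of what a Pearson-correlation similarity between two motif matrices looks like. It is not HOMER's code: it assumes both motifs have the same length and are given as per-position [A, C, G, T] probabilities, and it does not try offsets or reverse complements. The function names and the two toy matrices are made up for the example and are not real GATA or SP1 motifs.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A completely flat matrix has no defined correlation; report 0.
        return 0.0
    return cov / sqrt(var_x * var_y)


def motif_similarity(motif_1, motif_2):
    """Correlate two motifs given as lists of [A, C, G, T] rows.

    Assumption: same number of positions, already aligned.
    """
    if len(motif_1) != len(motif_2):
        raise ValueError("this sketch only compares motifs of equal length")
    flat_1 = [p for row in motif_1 for p in row]
    flat_2 = [p for row in motif_2 for p in row]
    return pearson(flat_1, flat_2)


def similarity_table(motifs):
    """Pairwise similarities for a dict of name -> motif matrix."""
    names = sorted(motifs)
    return {
        (a, b): motif_similarity(motifs[a], motifs[b])
        for a in names
        for b in names
    }


# Hypothetical toy motifs, three positions each (columns: A, C, G, T).
toy_a = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1], [0.7, 0.1, 0.1, 0.1]]
toy_b = [[0.1, 0.7, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]

table = similarity_table({"toy_a": toy_a, "toy_b": toy_b})
```

A table like this could be reported next to the enrichment results so that highly correlated motif pairs are flagged as possibly ambiguous rather than read as a family effect. Any cutoff for "too similar" would still need to be chosen and justified.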