jvanheld / IBIS_2024

Participation in the IBIS benchmarking of motif discovery approaches
GNU General Public License v3.0

Peak-motif option to avoid purging #2

Closed jvanheld closed 2 months ago

jvanheld commented 4 months ago

With Znf362, we notice that the positional profiles of nucleotides show a strong valley in the center, for all the nucleotides. The same holds for all dinucleotides.

image

This results from the purge-sequence step, which masks repeated elements with Ns. The motifs discovered in these peaks are all AT-rich, low-complexity motifs.
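To illustrate how masking alone can produce such a valley, here is a minimal Python sketch (not the RSAT implementation). It assumes equal-length, peak-centered sequences. The function name `positional_profile` and the toy sequences are hypothetical, made up for illustration only.

```python
from collections import Counter


def positional_profile(sequences, alphabet="ACGT"):
    """Count residues per column across equal-length, peak-centered sequences.

    Any character outside `alphabet` (e.g. the Ns written by a masking step)
    is tallied under the key "N".
    """
    length = len(sequences[0])
    profile = [Counter() for _ in range(length)]
    for seq in sequences:
        if len(seq) != length:
            raise ValueError("all sequences must have the same length")
        for pos, base in enumerate(seq.upper()):
            profile[pos][base if base in alphabet else "N"] += 1
    return profile


# Hypothetical toy peaks: two of the three have their center masked with Ns.
peaks = [
    "ACGTNNNNACGT",
    "TTGANNNNCCAT",
    "GACTAAAATGCA",
]

profile = positional_profile(peaks)
# Number of unmasked residues contributing to each column.
unmasked = [sum(col[b] for b in "ACGT") for col in profile]
print(unmasked)
```

If the masked stretches cluster around the peak centers, every nucleotide loses counts at the same positions, so all four profiles dip together.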

To test: add an option -no_purge to peak-motifs and check if this changes the positional profiles.

jvanheld commented 4 months ago

I added the option -nopurge to peak-motifs

This indeed completely changes the shape of the nucleotide positional distribution profiles.

image

The effect is also very clear on dinucleotide compositions.

With purged sequences

Strong depletion in the middle for all dinucleotides

image

With the option -nopurge

Peak of AA and TT in the middle, but no peak for AT; this is thus not a simple A+T enrichment.

image
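The AA/TT versus AT comparison can be reproduced in principle with a small positional dinucleotide count. The sketch below is an illustration under stated assumptions, not the peak-motifs code: `dinucleotide_profile` is a hypothetical helper, the toy sequences are invented, and dinucleotides containing N are simply skipped.

```python
from collections import Counter, defaultdict


def dinucleotide_profile(sequences):
    """Count overlapping dinucleotides by start position.

    Returns a mapping {dinucleotide: Counter({position: count})}.
    Dinucleotides containing anything other than A, C, G, T are skipped.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        seq = seq.upper()
        for pos in range(len(seq) - 1):
            dinuc = seq[pos:pos + 2]
            if set(dinuc) <= set("ACGT"):
                counts[dinuc][pos] += 1
    return counts


# Hypothetical unpurged toy peaks with poly-A or poly-T runs in the center.
peaks = ["GCAAAAGC", "CGTTTTCG", "GCAAAACG"]

profile = dinucleotide_profile(peaks)
center = range(2, 5)  # start positions covering the central window
for dinuc in ("AA", "TT", "AT"):
    print(dinuc, sum(profile[dinuc][p] for p in center))
```

In this toy set the central window contains AA and TT but no AT, the same pattern as described above: runs of a single A or T nucleotide raise AA and TT without raising AT.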

jvanheld commented 4 months ago

The motifs discovered with peak-motifs are completely different depending on whether the option '-nopurge' is activated or not.

Motifs discovered on purged peaks

Analysis: IBIS24_leaderboard_ZNF362_THC_0364.Rep-DIANA_0293 (28/06/2024 13:26)

image

Motifs discovered on unpurged peaks

Analysis: IBIS24_leaderboard_ZNF362_THC_0364.Rep-DIANA_0293 (06/07/2024 11:51)

image

jvanheld commented 4 months ago

@brunocontrerasmoreira and @najlaksouri

Interesting to see the differences between motifs discovered with / without purging the peak sequences in this particular peakset. I wonder if we will see such drastic differences with other peaksets as well.