MCOT (Motifs Co-Occurrence Tool) is a software package for recognition of composite elements (CEs) in a single ChIP-seq dataset (Levitsky et al., 2019; Levitsky et al., 2020). CEs detected by MCOT include two potential binding sites of transcription factors (TFs) in all possible mutual orientations. MCOT considers CEs with a full/partial overlap of motifs or with a spacer in a certain range. Each potential CE recognized by MCOT contains the motif of immunoprecipitated TF in respective ChIP-seq experiment (anchor motif) and another motif (partner). Identical/distinct anchor and partner motifs imply the search for CEs of homotypic or eterotypic type (respectively). The web service WebMCOT Levitsky et al., 2022 implements MCOT software package.
MCOT implemented in C++ and it can be conventionally compiled in Linux or Windows operating system. To run MCOT user should compile the corresponding source code file. Files mcot_anchor.cpp and mcot.cpp respect to one-partner and many partners options for Position Weight Matrix (PWM) model of a binding site. File anchor_pro.cpp respects to one-partner option, but it runs with arbitrary models of site, including not-PWM ones, e.g. BaMM (Siebert and Söding, 2016), and SiteGA (Tsukanov et al., 2022)
(Linux) Run in terminal (Packages “build-essential” and “cmake” should be installed on Ubuntu system):
cd <project>
mkdir tmp
cd tmp
cmake ..
make
All executable files for one-partner, many-partners and anchor-pro options will be in src/anchor_vs_one, src/anchor_vs_many and src/anchor_pro
(Windows) Run in terminal (Win -> Visual Studio 2017 -> Visual Studio Tools -> VC -> Native Tools x64. Else, you should install “CMake” module while VS 2017 installing)
cd <project>
mkdir tmp
cd tmp
cmake ..
MSBuild mcot-kernel.sln /p:Configuration=Release /p:Platform=Win32
Programs anchor_vs_many
, anchor_vs_one
and anchor_pro
for one_partner, many_partners and arbitrary_models_one_partner
options should be in src/anchor_vs_one_partner/Release/
, src/anchor_vs_many_partners/Release/
and src/anchor_pro/Release/
The command line for one-partner option:
./anchor_vs_one <1 fasta> <2 anchor.motif> <3 partner.motif> <4 minimal spacer length> <5 maximal spacer length> <6 path to whole-genome promoters> <7 pvalue_thr> <8 -log10[p-value]_thr> <9 asymmetry_fold(-log10(ERR))>
The command line for many-partner option:
./anchor_vs_many <1 fasta> <2 anchor.motif> <3 partners.library> <4 minimal spacer length> <5 maximal spacer length> <6 path to whole-genome promoters> <7 pvalue_thr> <8 -log10[p-value]_thr> <9 asymmetry_ratio(-log10(ERR))>
<1 fasta>
= DNA sequences of ChIP-seq peaks in fasta format, a minimum recommended number of peaks is about 300-500, the maximum number is not restricted, however 5000-10000 or higher number of peaks requires a higher computation time than several thousands of peaks, hence about 1000-2000 peaks are enough. Sequences should have lengths substatially higher than lengths of recognition models for anchor and partner motifs to contain possible composite elememnts with an overlap or spacer.
<1 fasta>
= DNA sequences of ChIP-seq peaks in fasta format, the minimum recommended number of peaks is about 200-300, the maximum number is not restricted, however 10000 or higher number of peaks requires higher computation time than several thousands of peaks. Sequences should have lengths substatially higher than lengths of recognition models for anchor and partner motifs to contain possible composite elememnts with an overlap or spacer.
<2 anchor.motif>
, <3 partner.motif>
= frequency matrices of the first and second motifs motifs in the standard format, e.g.
***
> Motif name
0.001 0.001 0.942 0.056
0.001 0.001 0.997 0.001
0.001 0.001 0.997 0.001
0.548 0.001 0.400 0.051
0.589 0.087 0.245 0.079
0.231 0.001 0.001 0.767
0.001 0.001 0.001 0.997
0.001 0.381 0.001 0.617
0.001 0.964 0.001 0.034
0.001 0.891 0.001 0.107
***
<3 partners.library>
= for this parameter five options are available: “hs_core”, “mm_core”, “hs_full”, “mm_full” and “dapseq”. These values respect to the libraries of derived from ChIP-seq data Hocomoco core (396/353) and full (747/509) collections of motifs for human/murine TFs (Kulakovskiy et al., 2018); and the library of motifs derived from DAP-seq data Plant Cistrome collection of 514 motifs for A.thaliana TFs (O’Malley et al., 2016).
<4 minimal spacer length>
= integer value from 0 to \
<5 maximal spacer length>
= integer value from 0 to 100 (the default value 29)
<6 path to whole-genome promoters>
= a path to the whole-genome dataset of promoters, three folders “hs”, “mm” and “at” that imply application of H.sapiens, M.musculus and A.thaliana promoter datasets for setting of thresholds for input motifs.
<7 pvalue_thr>
= recognition threshold of motifs transformed to the logarithmic -log10(ERR) scale of Expected Recognition Rate (ERR), ERR is computed as a recognition rate for the whole-genome set o promoters of protein-coding genes, default value 0.0005
<8 -log10[p-value]_thr>
= threshold to display the significances of enrichment of CEs in output data (the default value 10)
<9 asymmetry_ratio(-log10(ERR))>
= the ratio ratio_thr restricting ERR values of two motifs in asymmetrical CEs, e.g. for two value ERR1 and ERR2 for certain CE means ratio_thr = Max{-log10(ERR1), -log10(ERR2)} / Min{-log10(ERR1), -log10(ERR2)}, and ratio > ratio_thr and ratio < ratio_thr mean asymmetrical and symmetrical CEs, respectively.
The command line for anchor_pro option:
./anchor_pro <1 file_fasta> <2 motif1.profile> <3 motif2.profile> <4 int motif1.length> <5 int motif1.length> <6 int motif1.table_thr_err> <7 int motif1.table_thr_err> <8 int spacer_min> <9 int spacer_max> <10double pvalue_thr> <11double -log10[p-value]_thr> <12double asymmetry_ratio(-log10(ERR))>
<1 file_fasta>
= DNA sequences of ChIP-seq peaks in fasta format
<2 motif1.profile>
= Profile for the first model, see example profile of model 1
<3 motif2.profile>
= Profile for the second model, see example profile of model 2
<4 int motif1.length>
= integer value, length of the first model
<5 int motif1.length>
= integer value, length of the second model
<6 int motif1.table_thr_err>
= Table Threshold vs. ERR for the first motif, see example of distribution for model 1
<7 int motif2.table_thr_err>
= Table Threshold vs. ERR for the second motif, see example of distribution for model 2
<8 int spacer_min>
= integer value from 0 to \
<9 int spacer_max>
= integer value from 0 to 100 (the default value 29)
<10 pvalue_thr>
= recognition threshold of motifs transformed to the logarithmic scale, -log10(ERR), default value 0.0005
<11 -log10[p-value]_thr>
= threshold to display the significances of enrichment of CEs in output data (the default value 10)
<12 asymmetry_ratio(-log10(ERR))>
= the ratio ratio_thr restricting ERR values of two motifs in asymmetrical CEs, e.g. for two value ERR1 and ERR2 for certain CE means ratio_thr = Max{-log10(ERR1), -log10(ERR2)} / Min{-log10(ERR1), -log10(ERR2)}, and ratio > ratio_thr and ratio < ratio_thr mean asymmetrical and symmetrical CEs, respectively.
MCOT requires (a) DNA sequences of ChIP-seq peaks and (b) anchor and partner motifs. We recommend application of a conventional de novo motif search tool, e.g. Homer (Heinz et al., 2010) and STREME (Bailey et al., 2021) to define an anchor motif.
MCOT have two options for definition of the partner motif:
a matrix of partner motif (one-partner option);
a library of known motifs (many partners option).
<anchor_pro>
requires input files Table Threshold vs. ERR for both models.
For a PWM model the respictive file can be taken as the output files of runs with <anchor_vs_one>
or <anchor_vs_many>
options, respecting to the anchor motif <err*.txt>. For a non-PWM model, the corresponding table should be deduced from the recognition profile of potential hits for the whole genome dataset of promoters of protein-coding genes, e.g. the SiteGA (Tsukanov et al., 2022) tool has a special option to compute the required table
Advanced options include:
MCOT with options one_partner, many_partners applies the recognition model of PWM for mapping motifs in peaks, otherwise for option anchor_pro MCOT takes ready mapping of predicted hits from a file.
For each model, MCOT uses five thresholds {T[1],...T[5]} according to the unified set of ERRs for a whole-genome dataset of promoters, e.g. for the option of command line <pvalue_thr>
= 5E-4, five ERRs are equal to {5.24E-5, 1.02E-04, 1.9E-4, 3.33E-4, 5E-4}. The profile of the most stringent hits contains matrix scores T ≥ T[1], the next profile comprises PWM scores {T} in the range T[2] ≥ T > T[1], etc. Hence, MCOT computes five profiles of hits with certain level of conservation for each input motif. Note that the change of the recognition threshold <pvalue_thr>
defines the fifth threshold (default value 5E-4) and proportionally shifts the rest four thresholds.
MCOT classifies CEs structure according to the following criteria:
For each of 25 combinations of motifs conservation and each computation flow MCOT compiles the 2x2 contingency table (Table 1) and compute the significance of Fisher’s exact test p-value(CE) that compares the content of CEs in ChIP-seq peaks with that for background model.
Table 1. 2x2 contingency table for calculation of CE significance.
Sequences without CEs | Sequences with CEs | Total, sequences with both motifs | |
---|---|---|---|
Observed (peaks) | $Obs_{CE-} = Obs_{Tot} - Obs_{CE+} $ |
$Obs_{CE+} $ |
$Obs_{Tot} $ |
Expected (permuted sequences) | $Exp_{CE-} = Exp_{Tot} - Exp_{CE+} $ |
$Exp_{CE+} $ |
$Exp_{Tot} $ |
The background model implies the preservation of content for each motif in each peak. MCOT generates background profiles of hits of anchor and partner motifs iteratively for each peak with a special permutation procedure.
MCOT computes CE significance separately for five computation flows: Any (Spacer or Overlap), Full, Partial, Overlap (Full and Partial), Spacer. Fisher’s exact test computes the CE enrichment for each of 5x5 combinations of conservation of motifs.
Finally, MCOT estimates the similarity of anchor and partner motifs with the motifs similarity p-value. MCOT used matrix column similarities measures PCC (Pearson Correlation Coefficient, Pietrokovski, 1996) and SSD (Sum of Squared Distances, Sandelin and Wasserman, 2004) to compute two p-values. If maximum among them was less than 0.05, then pair of motifs had the significant similarity. Hence, a respective CE may be a possible false positive prediction.
Additionally, MCOT computed significance of CEs with more conserved anchor motif and more conserved partner motif. These calculations are performed according to scheme represented above in Table 1, but MCOT counts either CEs with more conserved Anchor or Partner motifs, i.e. either -Log10[ERR(Anchor)] > -Log10[ERR(Partner)] or -Log10[ERR(Anchor)] ≤ -Log10[ERR(Partner)].
Finally, to estimate the asymmetry of CE, MCOT partitions all CEs on those with more conserved Anchor or Partner motifs and compute the significance of asymmetry with Fisher’s exact test (Table 2). The asymmetry significance -Log10[P-value] is printed as positive (with sign ‘+’) in the case of enrichment toward the Anchor motif, otherwise, MCOT sign ‘-’ (negative value) denotes the enrichment toward the Partner motif. These calculation yielded for each computation flow one p-value(CE, Asymmetry).
Table 2. 2x2 contingency table for calculation of CE asymmetry.
CEs with more conserved anchor motif | CEs with more conserved partner motif | CEs, total | |
---|---|---|---|
Observed (CEs in peaks) | $Obs_{CE, Anchor} $ |
$Obs_{CE, Partner} $ |
$Obs_{CE, Anchor} + Obs_{CE, Partner} $ |
Expected (CEs in permuted sequences) | $Obs_{CE, Anchor} $ |
$Obs_{CE, Partner} $ |
$Obs_{CE, Anchor} + Obs_{CE, Partner} $ |
Detailed enrichment or depletion of CEs with specific combinations of motifs conservation are represented in the scatterplot text file plot_*, see below.
To take into account multiple comparisons we applied the Bonferroni’s correction and used the following critical values to filter out not significant results:
MCOT gives the following output data:
Example
>Seq 1 Thr 0.864497 Nsites 1
205 3.304404 -
>Seq 2 Thr 0.864497 Nsites 0
>Seq 3 Thr 0.864497 Nsites 2
88 3.607778 -
160 3.338550 -
Here and below ChIP-seq data for mouse FoxA2 and CE FoxA2-HNF1β (Wederell et al., 2008) illustrate MCOT application. The anchor FoxA2 motif we deduced from de novo search Homer (Heinz et al., 2010) and the partner HNF1β motif we took from the mouse Hocomoco core collection (Kulakovskiy et al., 2018).
Example
0.99745670 1.63517e-07
0.99660894 1.88674e-07
0.99612450 4.02504e-07
...
0.81195753 0.0179257
0.81191716 0.0179369
0.81187679 0.0179517
Example
Motif Num | Motif Name | Threshold | % of peaks | Rec. peaks | Total peaks | Rate of hits | Rec. hits | Total positions |
---|---|---|---|---|---|---|---|---|
0 | Anchor | 0.955618 | 32.93 | 2484 | 7543 | 7.23E-04 | 2887 | 3994992 |
0 | Anchor | 0.945548 | 19.86 | 1498 | 7543 | 4.20E-04 | 1677 | 3994992 |
0 | Anchor | 0.934918 | 25.30 | 1908 | 7543 | 5.50E-04 | 2199 | 3994992 |
0 | Anchor | 0.923776 | 29.76 | 2245 | 7543 | 6.69E-04 | 2672 | 3994992 |
0 | Anchor | 0.913846 | 28.85 | 2176 | 7543 | 6.63E-04 | 2647 | 3994992 |
1 | Partner | 0.959810 | 9.35 | 705 | 7543 | 1.90E-04 | 760 | 4002535 |
1 | Partner | 0.947124 | 7.42 | 560 | 7543 | 1.49E-04 | 596 | 4002535 |
1 | Partner | 0.934158 | 11.93 | 900 | 7543 | 2.44E-04 | 978 | 4002535 |
1 | Partner | 0.923150 | 16.56 | 1249 | 7543 | 3.51E-04 | 1403 | 4002535 |
1 | Partner | 0.959810 | 18.60 | 1403 | 7543 | 3.94E-04 | 1575 | 4002535 |
The first block of output data for each pair of motifs represents (a) five P-values of CE enrichment in five computation flows; (b) for each heterotypic pair three P-value for similarity between Anchor and Partner motifs.
Example:
# Motif | Motif Name | Full overlap, -Log10[P-value] | Partial overlap,-Log10[P-value] | Overlap, -Log10[P-value] | Spacer, -Log10[P-value] | Any, -Log10[P-value] | Similarity to Anchor, -Log10[P-value] | Similarity to Anchor, SSD | Similarity to Anchor, PCC | ... |
---|---|---|---|---|---|---|---|---|---|---|
Anchor | FOXA2 | 0.0 | 97.4 | 92.4 | 9.8 | 35.4 | n/a | n/a | n/a | |
Partner | HNF1B | 76.6 | 53.2 | 64.3 | 4.3 | 35.3 | 0 | 0 | 0 |
The second block of data consequently for five computation flows represents significances for CEs with more conserved Anchor and Partner motifs.
Example:
# Motif | Motif Name | ... | Full overlap, Conservative Anchor, -Log10[P-value] | Full overlap, Conservative Partner, -Log10[P-value] | Partial overlap, Conservative Anchor, -Log10[P-value] | Partial overlap, Conservative Partner, -Log10[P-value] | ... |
---|---|---|---|---|---|---|---|
Anchor | FOXA2 | n/a | n/a | n/a | n/a | ||
Partner | HNF1B | 50.8 | 0.6 | 40.1 | 19.2 |
The third block of data consequently for five computation flows represents five significances of asymmetry (p-value) in CEs: ‘Anchor vs. Partner’. The positive/negative values reflect the enrichment toward the Anchor/Partner motifs.
Example:
# Motif | Motif Name | ... | Full overlap, Asymmetry to Anchor+/Partner-, -Log10[P-value] | Partial overlap, Asymmetry to Anchor+/Partner-, -Log10[P-value] | Overlap, Asymmetry to Anchor+/Partner-, -Log10[P-value] | ... |
---|---|---|---|---|---|---|
Anchor | FOXA2 | n/a | n/a | n/a | ||
Partner | HNF1B | -46.9 | -73.1 | -113.1 |
The final fourth block shows recommended Bonferroni’s correction thresholds (-Log10[P-value]) for significances:
# Motif | ... | Bonferroni_CE | Bonferroni_CE(AncPar) | Bonferroni_Asym |
---|---|---|---|---|
Anchor | 11.8 | |||
Partner | 11.71 | 10.61 | 10.31 |
Example:
1F | 0F | 11P | 10P | 9P | 8P | 7P | 6P | 5P | 4P | 3P | 2P | 1P | 0S | 1S | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Everted | 2.54 | 3.88 | 0.16 | 0.94 | 0.07 | 0.18 | 0.76 | 0.09 | |||||||
Inverted | 0.07 | 2.51 | 0.07 | 1.08 | 0.04 | 0.13 | 0.72 | 0.16 | 0.18 | 0.31 | 0.67 | ||||
DirectPA | 0.04 | 1.86 | 0.09 | 0.45 | 0.02 | 0.11 | 0.13 | 0.09 | |||||||
DirectAP | 0.04 | 0.18 | 0.07 | 0.16 | 0.11 | 0.02 | 0.18 | 0.11 | 0.11 | 0.2 | 0.22 | ||||
Any | 6.84 | 6.84 | 0.00 | 4.77 | 10.23 | 3.10 | 0.29 | 0.75 | 3.79 | 2.24 | 0.92 | 1.49 | 3.51 | 2.70 | |
Cumulative | 6.84 | 9.60 | 9.60 | 14.02 | 19.43 | 19.83 | 20.06 | 20.34 | 20.92 | 21.21 | 21.95 | 22.87 | 23.79 | 24.66 |
This example shows distribution for heterotypic CE, since notations DirectAP / DirectPA imply Anchor-Partner / Partner-Anchor cases. A homotypic CE has only one direct orientation of CE, Anchor-Anchor (notation DirectAA). 'Any' row implies frequency of CEs with a certain spacer or an overlap for all four orientations. 'Cumulative' row means the sum of frequencies for all mutual locations which respect to the current and closer positioning of motifs, i.e. the sum of values of 'All' row from the most left column to the current one.
Example below shows FOXA2 (Anchor) and HNF1B (Partner) motifs for Overlap computation flow.
Anchor Thr | Partner Thr | Real CE+ | Real Total | Rand CE+ | Rand Total | Fold | P-value | |
---|---|---|---|---|---|---|---|---|
A 1 | P 1 | 13 | 211 | 20 | 14137 | 43.550 | 4.59308e-16 | |
A 1 | P 2 | 10 | 167 | 12 | 11189 | 55.833 | 2.00465e-13 | |
A 1 | P 3 | 9 | 142 | 18 | 9514 | 33.500 | 9.33201e-11 | |
A 1 | P 4 | 22 | 312 | 27 | 20904 | 54.593 | 7.98321e-28 | |
A 1 | P 5 | 32 | 359 | 38 | 24053 | 56.421 | 2.97654e-40 | |
A 2 | P 1 | 10 | 125 | 18 | 8375 | 37.222 | 3.45082e-12 | |
A 2 | P 2 | 13 | 78 | 8 | 5226 | 108.875 | 9.83765e-20 | |
A 2 | P 3 | 10 | 91 | 5 | 6097 | 134.000 | 8.07117e-16 | |
A 2 | P 4 | 21 | 172 | 23 | 11524 | 61.174 | 1.41932e-27 | |
A 2 | P 5 | 23 | 191 | 25 | 12797 | 61.640 | 4.1353e-30 | |
A 3 | P 1 | 7 | 160 | 18 | 10720 | 26.056 | 5.01905e-08 | |
A 3 | P 2 | 11 | 106 | 17 | 7102 | 43.353 | 7.15405e-14 | |
A 3 | P 3 | 15 | 107 | 18 | 7169 | 55.833 | 9.86635e-20 | |
A 3 | P 4 | 18 | 229 | 37 | 15343 | 32.595 | 4.69822e-20 | |
A 3 | P 5 | 23 | 263 | 32 | 17621 | 48.156 | 3.30391e-28 | |
A 4 | P 1 | 11 | 192 | 35 | 12864 | 21.057 | 4.44823e-11 | |
A 4 | P 2 | 11 | 133 | 13 | 8911 | 56.692 | 9.7153e-15 | |
A 4 | P 3 | 24 | 124 | 29 | 8308 | 55.448 | 5.57973e-31 | |
A 4 | P 4 | 36 | 303 | 58 | 20301 | 41.586 | 7.52749e-42 | |
A 4 | P 5 | 37 | 286 | 39 | 19162 | 63.564 | 5.87211e-48 | |
A 5 | P 1 | 17 | 171 | 13 | 11457 | 87.615 | 3.18051e-24 | |
A 5 | P 2 | 20 | 124 | 15 | 8308 | 89.333 | 1.23212e-28 | |
A 5 | P 3 | 80 | 151 | 59 | 10117 | 90.847 | 1.07118e-118 | |
A 5 | P 4 | 55 | 229 | 37 | 15343 | 99.595 | 7.19329e-79 | |
A 5 | P 5 | 53 | 250 | 33 | 16750 | 107.606 | 9.56812e-77 | |
Anchor | 170 | 1392 | 237 | 93264 | 48.059 | 1.76431e-199 | ||
Partner | 95 | 1057 | 106 | 70819 | 60.047 | 3.61688e-118 | ||
Asymmetry | 399 | 1741 | 598 | 116647 | 44.704 | 1e-300 | ||
Symmetry | 0 | 0 | 0 | 0 | 1.000 | 1 | ||
Anchor_Partner | 280 | 1418 | 335 | 904 | 0.533 | 7.08806e-20 | ||
Asym_Sym | 1418 | 1418 | 904 | 904 | 1.000 | 2 |
#Seq | Anchor start | Anchor end | Partner start | Partner end | Mutual Loc | Loc Type | Strands | Mutual Ori | Anchor hit conservation, -Log10(ERR) | Partner hit conservation, -Log10(ERR) | Anchor seq | Partner seq |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Seq 18 | 271 | 282 | 268 | 282 | 0F | Full | -+ | Inverted | 3.365 | 3.447 | tgtttatctttc | agtgaaagataaaca |
Seq 23 | 565 | 576 | 569 | 583 | 8P | Partial | -+ | Everted | 3.457 | 4.506 | tattgacttacc | agtcaataagttaca |
Seq 33 | 145 | 156 | 179 | 193 | 22S | Spacer | ++ | DirectAP | 3.448 | 4.289 | tgttgacagact | ggttaatgctttcct |
${1000*OBS_{i,j}/N(OBS)}
$ and ${1000*EXP_{i,j}/N(EXP)}
$,
where $N(OBS)
$ and $N(EXP)
$ are total counts of predicted CEs in observed and expected lists. The output heatmap shows the difference between relative frequencies of observed and expected CEs. Example below shows the asymmetry toward the Partner (HNF1B) motif in Overlap computation flow.
<3.5 | 3.5..3.7 | 3.7..3.9 | 3.9..4.1 | 4.1..4.3 | 4.3..4.5 | 4.5..4.7 | 4.7..4.9 | 4.9..5.1 | 5.1..5.3 | 5.3..5.5 | >5.5 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
<3.5 | 14 | 14 | -8 | -14 | -26 | -12 | 1 | |||||
3.5..3.7 | 7 | -15 | -7 | -7 | -2 | |||||||
3.7..3.9 | 37 | -4 | -2 | -6 | 4 | -11 | -3 | -3 | ||||
3.9..4.1 | 17 | -1 | 14 | 1 | -4 | -9 | ||||||
4.1..4.3 | 39 | -2 | 8 | 5 | -8 | -2 | ||||||
4.3..4.5 | 3 | -6 | 4 | 5 | 5 | |||||||
4.5..4.7 | -10 | 22 | -5 | |||||||||
4.7..4.9 | 13 | -4 | -3 | |||||||||
4.9..5.1 | -23 | -3 | 5 | -4 | 1 | |||||||
5.1..5.3 | 1 | -2 | ||||||||||
5.3..5.5 | -4 | -3 | -2 | -3 | ||||||||
>5.5 | 4 | 2 | -1 | -2 | -3 |
The same calculations are performed for Anchor-Anchor CEs, in this case the enrichment of symmetrical vs. asymmetrical CEs are tested.