MIXED_XXXX abstraction head to head win averages are in unexpected order

I ran some tests on two-player no-limit hold'em with 1|2 small blind|big blind, 200|200 stacks, and maxRaises of 3 4 4 4. For all cluster abstraction runs I kept nb-samples at "0,2,500,500", buckets at 169,5,10,500, the error bounds at .01,.01,.01,.01, and nb-hist-samples-per-round at 0,1,200,200. For all tests I held the action-abstraction to polrelative with raises of 0.4,0.8,1.2,2,5,9999. For CFR training I used 12 threads and run times of 8 hours, and in some cases 16 and 24 hours.

I ran the head-to-head matches, specifically NSSS against each of NOOO, NEES, and NEEO. I expected NSSS to perform the worst, meaning lose money (negative average wins), and NEEO to be the best. Instead, NSSS comes out best! The table below shows the results. As you can see, I ran CFR's learning phase for the most sophisticated strategy, NEEO, for longer and longer times (8 hours, then 16, then 24), but that didn't change things. Any ideas what to experiment with to get the results to align with expectations, meaning NEEO, NEES, and NOOO all beating NSSS?

Update: thinking harder, I'm wondering if the clustering abstraction is too coarse, so that I need to make it finer by increasing nb-samples and nb-hist-samples. Any ideas on the combinatorics around this to see what's appropriate?
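For a rough sense of scale on the combinatorics question, here is a minimal sketch that counts the raw hole-card/board combinations per round and divides by the bucket counts listed above (169, 5, 10, 500). It assumes a standard 52-card deck, treats the board as unordered, and ignores suit isomorphisms, so the counts are upper bounds on the number of strategically distinct hands. The names `raw_combos` and `combos_per_bucket` are made up for this illustration and are not part of poker-cfrm.

```python
from math import comb

DECK_SIZE = 52
HOLE_CARDS = 2

# Number of board cards visible in each betting round.
BOARD_CARDS = {"preflop": 0, "flop": 3, "turn": 4, "river": 5}

# Bucket counts per round, taken from the settings quoted in this post.
BUCKETS = {"preflop": 169, "flop": 5, "turn": 10, "river": 500}


def raw_combos(board_cards):
    """Count (hole cards, unordered board) combinations, ignoring suit isomorphism."""
    return comb(DECK_SIZE, HOLE_CARDS) * comb(DECK_SIZE - HOLE_CARDS, board_cards)


def combos_per_bucket():
    """Average number of raw combinations that land in one bucket, per round."""
    return {rnd: raw_combos(n) / BUCKETS[rnd] for rnd, n in BOARD_CARDS.items()}


for rnd, avg in combos_per_bucket().items():
    total = raw_combos(BOARD_CARDS[rnd])
    print(f"{rnd:8s} raw combos = {total:>13,d}  avg per bucket = {avg:,.0f}")
```

With these settings, each flop bucket has to stand in for millions of raw combinations, which gives some idea of how coarse 5 and 10 buckets are compared with the size of the rounds they summarize.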

Abs A	A cfrm runtime (secs / 1000)	Abs B	B cfrm runtime (secs / 1000)	Avg win	Var	num games	seed	median win
NSSS	28	NOOO	28	2.94	7167	500000	7534	0
	28	NOOO	28	2.76	7119	100000	3575	0
	28	NOOO	28	2.95	7163	100000	8379	1.5

	28	NEES	28	3.4	7564	100000	8379	1.5
	28	NEES	28	3.03	3575	100000	3575	1.5

	28	NEEO	28	5.07	7475	100000	8379	1.5
	28	NEEO	57	4.53	7118	100000	7534	1.5
	28	NEEO	86	5.05	7141	100000	7534	1.5
	28	NEEO	86	4.84	7138	100000	8370	1.5
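
One quick check is whether these averages are distinguishable from zero at all. The sketch below computes the standard error of the mean and a z-score from the Avg win, Var, and num games columns. It assumes that Var is the per-game sample variance of the winnings and that games are independent, neither of which is confirmed here. The function names are illustrative only.

```python
import math


def standard_error(variance, num_games):
    """Standard error of the mean win, assuming independent games."""
    return math.sqrt(variance / num_games)


def z_score(avg_win, variance, num_games):
    """How many standard errors the average win lies away from zero."""
    return avg_win / standard_error(variance, num_games)


# (opponent, avg win, variance, num games): one row per opponent from the table.
rows = [
    ("NOOO", 2.94, 7167, 500000),
    ("NEES", 3.40, 7564, 100000),
    ("NEEO", 5.07, 7475, 100000),
]

for opponent, avg, var, n in rows:
    se = standard_error(var, n)
    z = z_score(avg, var, n)
    print(f"NSSS vs {opponent}: avg = {avg:.2f}, SE = {se:.3f}, z = {z:.1f}")
```

Under those assumptions the standard errors come out at roughly 0.12 to 0.28, far smaller than the average wins, so sampling noise in the matches alone is unlikely to explain the ordering.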

pandaant / poker-cfrm, issue #4