jesg / combinatoricslib

Automatically exported from code.google.com/p/combinatoricslib

Need method to randomly sample SimpleCombination #6

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I would email but don't know your email address.

I am the main developer for MDR (Multifactor Dimensionality Reduction), which 
is a fairly simple classifier used in bioinformatics. See 
http://sourceforge.net/projects/mdr/

Normally, users do an exhaustive search over attribute combinations, but we also 
have a random timed mode. Currently I make no effort to prevent the random 
mode from testing the same combination more than once. I would like a tool that 
would allow me to get combinations without replacement. The issue is that it 
must be extremely efficient and fast -- it is not worthwhile to just maintain a 
list of tested combinations. It would be okay, and perhaps even preferable, if 
the progression were 'pseudo-random' in such a way that all attributes were 
tested fairly equally.
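
One possible way to get this behaviour without storing tested combinations is to walk the ranks 0 .. C(n,k)-1 with a step that is coprime to C(n,k), and convert each rank to a combination via the combinatorial number system. The sketch below is only an illustration under assumptions: the class `ShuffledCombinations` and its methods are hypothetical and not part of combinatoricslib, attributes are represented by indices 0 .. n-1, and C(n,k) is assumed to fit comfortably in a `long` (below 2^62). The order is an arithmetic progression modulo C(n,k), so it is only weakly random; a stronger permutation of the ranks could be substituted.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Random;

/** Hypothetical helper, not part of combinatoricslib. */
class ShuffledCombinations implements Iterator<int[]> {
    private final int n;
    private final int k;
    private final long total;   // C(n, k), assumed to be below 2^62
    private final long stride;  // coprime to total, so every rank is hit exactly once
    private long current;       // rank of the next combination to emit
    private long emitted = 0;

    ShuffledCombinations(int n, int k, Random rnd) {
        if (k < 0 || k > n) {
            throw new IllegalArgumentException("need 0 <= k <= n");
        }
        this.n = n;
        this.k = k;
        this.total = binomial(n, k);
        this.current = Math.floorMod(rnd.nextLong(), total);
        long s;
        do {
            s = 1 + Math.floorMod(rnd.nextLong(), total); // candidate in [1, total]
        } while (gcd(s, total) != 1);
        this.stride = s;
    }

    @Override
    public boolean hasNext() {
        return emitted < total;
    }

    @Override
    public int[] next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        int[] combo = unrank(current, n, k);
        current = (current + stride) % total; // no repeats until all ranks are used
        emitted++;
        return combo;
    }

    /** Converts a rank in [0, C(n,k)) to ascending indices (combinatorial number system). */
    static int[] unrank(long rank, int n, int k) {
        int[] combo = new int[k];
        int c = n - 1;
        for (int i = k; i >= 1; i--) {
            while (binomial(c, i) > rank) {
                c--; // find the largest c with C(c, i) <= remaining rank
            }
            combo[i - 1] = c;
            rank -= binomial(c, i);
            c--;
        }
        return combo;
    }

    /** Exact binomial coefficient; returns 0 when k is outside [0, n]. */
    static long binomial(int n, int k) {
        if (k < 0 || k > n) {
            return 0;
        }
        k = Math.min(k, n - k);
        long r = 1;
        for (int i = 1; i <= k; i++) {
            r = r * (n - k + i) / i; // division is exact at every step
        }
        return r;
    }

    private static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }
}
```

A timed run could call `next()` until the time budget expires; memory use stays constant regardless of how many combinations have been tested. Each call costs a handful of binomial evaluations, which may or may not be fast enough for a given workload.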

Another related, but harder, problem is using evolutionary algorithms to test 
combinations. I have found that this tends to test the same combinations many, 
many times. Since I already use elitism so I don't lose track of winners, this 
is a big waste of processing time. If I had a good way to know what has been 
sampled previously I could prevent waste -- this would also act as a form of 
'novelty' seeking, which has been shown in some evolutionary algorithm contexts 
to be helpful.
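
For the evolutionary case, one compact way to remember what has been sampled is to store a single `long` rank per combination instead of the combination itself. The following is a minimal sketch under assumptions: `SeenCombinations` is a hypothetical class (not part of combinatoricslib or MDR), combinations are ascending arrays of attribute indices, all combinations offered to one tracker have the same size, and the ranks fit in a `long`.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper, not part of combinatoricslib or MDR. */
class SeenCombinations {
    // One entry per distinct combination; use one tracker per combination size,
    // because ranks of combinations with different sizes can collide.
    private final Set<Long> seenRanks = new HashSet<>();

    /** Returns true the first time a combination is offered, false on repeats. */
    boolean markIfNew(int[] sortedCombo) {
        return seenRanks.add(rank(sortedCombo));
    }

    /** Combinatorial-number-system rank: sum of C(c_i, i + 1) over ascending indices c_i. */
    static long rank(int[] sortedCombo) {
        long r = 0;
        for (int i = 0; i < sortedCombo.length; i++) {
            r += binomial(sortedCombo[i], i + 1);
        }
        return r;
    }

    /** Exact binomial coefficient; returns 0 when k is outside [0, n]. */
    static long binomial(int n, int k) {
        if (k < 0 || k > n) {
            return 0;
        }
        k = Math.min(k, n - k);
        long r = 1;
        for (int i = 1; i <= k; i++) {
            r = r * (n - k + i) / i; // division is exact at every step
        }
        return r;
    }
}
```

An evolutionary loop could call `markIfNew` before evaluating a candidate and skip or mutate the candidate when it returns false. Memory still grows with the number of distinct combinations tested, so this only reduces the cost per entry rather than removing it.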

Thanks,

Peter Andrews
Norwich, Vermont USA

Original issue reported on code.google.com by PeterVermont on 21 Dec 2012 at 4:00

GoogleCodeExporter commented 9 years ago

Original comment by d.pau...@gmail.com on 31 Jan 2013 at 1:49

GoogleCodeExporter commented 9 years ago
You can randomly select which results from the Generator to return.  The 
Generator is an Iterable, so you could wrap it and return a RandomIterator 
which randomly returns results from the underlying Iterator.  Get as fancy as 
you want in the RandomIterator.

The weakness here is that you are still dependent upon the underlying sequence 
of the Generator, which is not necessarily random between runs.  The early 
results are more probable than the later results.
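
The wrapper idea above might look roughly like the following. This is a sketch under assumptions, not code from combinatoricslib: only the name `RandomIterator` comes from the comment, while the constructor, the `keepProbability` parameter, and the skip-with-fixed-probability strategy are hypothetical choices. It wraps any `java.util.Iterator`, so it could be fed the iterator obtained from a Generator.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Random;

/** Hypothetical wrapper that passes each underlying result through with a fixed probability. */
class RandomIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final double keepProbability; // assumed to be in [0, 1]
    private final Random rnd;
    private T pending;
    private boolean hasPending;

    RandomIterator(Iterator<T> source, double keepProbability, Random rnd) {
        this.source = source;
        this.keepProbability = keepProbability;
        this.rnd = rnd;
    }

    /** Skips underlying results until one is kept or the source is exhausted. */
    private void advance() {
        while (!hasPending && source.hasNext()) {
            T candidate = source.next();
            if (rnd.nextDouble() < keepProbability) {
                pending = candidate;
                hasPending = true;
            }
        }
    }

    @Override
    public boolean hasNext() {
        advance();
        return hasPending;
    }

    @Override
    public T next() {
        advance();
        if (!hasPending) {
            throw new NoSuchElementException();
        }
        T result = pending;
        pending = null;
        hasPending = false;
        return result;
    }
}
```

Results still come out in the underlying order and are never repeated, but a run that stops early only ever sees the front of the sequence, which is the weakness described above.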

Original comment by ryan.gus...@gmail.com on 14 Mar 2014 at 4:39