fozziethebeat / S-Space

The S-Space repository, from the AIrhead-Research group
GNU General Public License v2.0
206 stars 106 forks

Fixing Random Seeds for RandomIndexing #54

Open igorbrigadir opened 10 years ago

igorbrigadir commented 10 years ago

I've been having some trouble fixing the Random seeds used in Random Indexing.

I'd like to have predictable output across runs, so I can run through a bunch of fixed seeds and see how much of an impact the random initialization & other parameters can have on retrieval.

The example:

RandomIndexing ri = new RandomIndexing(new Properties());
ri.RANDOM.setSeed(SEED);

This doesn't give me predictable output, as the RandomIndexVectorGenerator class has a random number source I can't fix for testing.

One way to do this would be to make the random seed an optional property - same as vectorLength.
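To illustrate the idea, here is a minimal standalone sketch of a generator that reads an optional seed from a Properties object. It is not the S-Space code: the class SeededIndexVectors, the property names "example.randomSeed" and "example.vectorLength", and the count of 8 non-zero entries are all made up for this example.

```java
import java.util.Arrays;
import java.util.Properties;
import java.util.Random;

/** Hypothetical stand-in for an index-vector generator; NOT the S-Space class. */
public class SeededIndexVectors {

    // Hypothetical property names, chosen for this sketch only.
    public static final String SEED_PROPERTY = "example.randomSeed";
    public static final String LENGTH_PROPERTY = "example.vectorLength";

    private final Random random;
    private final int vectorLength;
    private final int nonZero;

    public SeededIndexVectors(Properties props) {
        vectorLength =
            Integer.parseInt(props.getProperty(LENGTH_PROPERTY, "64"));
        // Assumed sparsity for the sketch: at most 8 non-zero entries.
        nonZero = Math.min(8, vectorLength);
        // The seed is optional: without it the generator stays unseeded.
        String seed = props.getProperty(SEED_PROPERTY);
        random = (seed == null)
            ? new Random()
            : new Random(Long.parseLong(seed));
    }

    /** Returns a sparse ternary vector with nonZero entries set to +1 or -1. */
    public int[] generate() {
        int[] v = new int[vectorLength];
        int placed = 0;
        while (placed < nonZero) {
            int i = random.nextInt(vectorLength);
            if (v[i] == 0) {
                v[i] = random.nextBoolean() ? 1 : -1;
                placed++;
            }
        }
        return v;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(SEED_PROPERTY, "42");
        // Two generators built from the same seed produce the same vector.
        int[] a = new SeededIndexVectors(props).generate();
        int[] b = new SeededIndexVectors(props).generate();
        System.out.println(Arrays.equals(a, b)); // prints "true"
    }
}
```

With something like this, each generator owns a Random built from the configured seed, so a test can loop over a list of fixed seeds and get repeatable vectors for each one.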

(IncrementalSemanticAnalysis also uses some of the Random Indexing classes, so it might need the same change. I'll make a pull request with this when I'm done.)

davidjurgens commented 10 years ago

Hi,

I just pushed a new patch to RandomIndexing that should allow you to properly set the seed value. Hopefully this will fix your problem. RandomIndexing now has a proper constructor that shows all the properties that can be set, rather than having to specify them with a Properties object.

Thanks, David


igorbrigadir commented 10 years ago

That's great, I'll try it out now, Thanks!