numenta / nupic.core-legacy

Implementation of core NuPIC algorithms in C++ (under construction)
http://numenta.org
GNU Affero General Public License v3.0
272 stars 278 forks source link

Handle RNG seeds consistently #649

Open scottpurdy opened 9 years ago

scottpurdy commented 9 years ago

Currently some algorithms default to seed=-1, which will use a random seed each time, while others default to a fixed seed. We should handle this consistently.

This issue can track everyone's opinion on it. @subutai @chetan51 @breznak

Some current fixed seed defaults:

Some current random seed defaults:

breznak commented 9 years ago

Thanks, on a quick thought..

FIXED rng seed:

In conclusion, I'd go with a fixed seed (for testing, 99% users), and in special case one can always set to random seed -1

subutai commented 9 years ago

@breznak Thanks for the great list. I agree, for the same reasons you stated. You can always use -1 if you need non-determinism for some reason.

scottpurdy commented 9 years ago

To play Devil's advocate - we almost always specify the seeds explicitly when instantiating the algorithms so we would still have deterministic tests, ability to spot platform differences, etc.

One advantage of having the default be the special value "-1" is that we can introduce a function to change the default across the board. By default, "-1" would choose a random seed but you could call Random.setDefaultSeed(42) and any time an algorithm instantiates Random(-1) after that it uses the seed "42" rather than a random seed.

Again, just playing devil's advocate.

scottpurdy commented 9 years ago

@numenta/nupic-reviewers - please chime in if you have an opinion

oxtopus commented 9 years ago

:+1: for fixed default seeds

chetan51 commented 9 years ago

I agree with @breznak. Only thing I want to add is that it would be good to put the fixed default seed in one place, to keep the code DRY.

vitaly-krugl commented 9 years ago

+1 on DRY

breznak commented 9 years ago

Only thing I want to add is that it would be good to put the fixed default seed in one place, to keep the code DRY.

Can we then introduce a (header) file with nupic CONSTANTS? (not sure if it makes sense at a global level though..)

breznak commented 9 years ago

Can I take this as an agreement on fixed default rng seed?

Also, on the

Only thing I want to add is that it would be good to put the fixed default seed in one place, to keep the code DRY.

this goes back to my "interface" proposals. We could have a class "Block" (can't come up with a better name, just something with columns); the Block would have param seed, TP,SP,TM inherit from Block and pass kwargs; like that? But let's keep that for other PR, here I just want same params for SP implementations, this 'rng seed' is just a blocker, so I'd manually change that on the 3 places..

scottpurdy commented 9 years ago

Based on the comments here I propose that we set the seed keyword-argument default value to 0 everywhere in both C++ and Python. Please follow up soon if you disagree with this conclusion.

There was some idea of defining this in a variable somewhere but using that variable consistently is not any more DRY than using 0 consistently. Not to mention that the seed has no meaningful value across algorithm implementations. So to avoid a bunch of unnecessary imports (and thus code dependencies) I am excluding that from the proposal. I like the idea of using interfaces where they benefit but objects that have seeds don't actually share any functions. A lot of the time they take the seed as a constructor parameter but that isn't something you can enforce with a parent class.

breznak commented 8 years ago

@scottpurdy why was this issue closed? The agreement seemed to be on fixed rng seed, but no PR has resolved the matter as I've checked now.

rhyolight commented 8 years ago

@scottpurdy Please respond to Mark above, because there is a chain of issues and PRs that are blocked by this.

scottpurdy commented 8 years ago

I must have been thinking of this issue as a proposal as opposed to implementation issue. Fine to track the implementation here though since it has all the context.

breznak commented 8 years ago

I'll try to resolve this soon. So in https://github.com/numenta/nupic.core/issues/649#issuecomment-152034312 we agreed on default fixed seed (with value 0), -1 would mean random. I'll make a PR to C++, python would follow.

scottpurdy commented 8 years ago

@breznak pointed out an issue with the proposed solution to random seeds here. Here are a few options:

  1. Change the Random class to treat 0 as a fixed seed and have a different value (accessed via final static variable, probably set to the max UInt value) that results in a random seed. This way we can stick with current plan without changing the signature of Random.
  2. Change the Random class to take an signed int for the seed (it currently takes an unsigned int) and change the default, "random" value from 0 to -1. This way we can stick with plan but have to change the signature of Random.

There are other options as well but these seem to be the two that deviate the least. I think 1 makes a little more sense but would like to hear from @numenta/nupic-reviewers (in particular @subutai )

vitaly-krugl commented 8 years ago

Just to throw in my 2 pennies' worth, if the Random seeding API looks and sounds like the standard C++ or C counterpart, don't change the semantics to differ from the standard behavior. People don't expect that and will tend to use the API in the same way as the standard seeding API that they are familiar with. This always result in subtle, difficult to debug bugs.

scottpurdy commented 8 years ago

@vitaly-krugl does that mean you are in support of 1? The builtin implementation has a separate srand function for seeding that takes an unsigned int.

vitaly-krugl commented 8 years ago

@scottpurdy, I'm not sure yet. Would it be wrong to overload the method signature with an additional argument just for the purpose of signaling the default. This way, everyone who uses it in the standard way will get the expected behavior (no matter the value), while the users of the additional arg will make their intention explicit, too.

rhyolight commented 8 years ago

@scottpurdy @vitaly-krugl Seems like this discussion needs to continue?

vitaly-krugl commented 8 years ago

@rhyolight: need to revisit when @scottpurdy returns.