argiopetech closed this issue 11 years ago
Good thought. Would there still be a way to get the same result twice if one wanted to? This could be useful for testing purposes.
On 7/23/13 2:00 AM, Elliot Robinson wrote:
Per "Good Practice in (Pseudo) Random Number Generation for Bioinformatics Applications" (http://www0.cs.ucl.ac.uk/staff/d.jones/GoodPracticeRNG.pdf), the Mersenne Twister's randomness properties can suffer if it is seeded with a simple seed (from the binary viewpoint, the more zeroes there are starting from the MSB and moving toward the LSB, the simpler the value).
Proposed remedy
Instead of using user-provided seeds directly, hash values prior to seeding. Additionally, generate and discard between several hundred and several thousand values to "warm up" the PRNG.
— Reply to this email directly or view it on GitHub https://github.com/argiopetech/base/issues/38.
Ted von Hippel
Department of Physical Sciences Embry-Riddle Aeronautical University 600 S. Clyde Morris Boulevard Daytona Beach, FL 32114-3900 386-226-7751
The proposed method should still give deterministic results as long as we don't change hash functions or change the number of warmup iterations. I intend these both to be hard-coded.
OK. And is there a way to start a run off differently in case one wants to do that?
The current --seed CLI flag and the seed: YAML field will remain as they are; they'll just be mapped to a (hopefully) more complex number internally.
Elliot Robinson Email: elliot.robinson@argiopetech.com Phone: (321) 252-9660
Ah, gotcha.
Per "Good Practice in (Pseudo) Random Number Generation for Bioinformatics Applications", the Mersenne Twister's randomness properties can suffer if it is seeded with a simple seed (from the binary viewpoint, the more zeroes there are starting from the MSB and moving toward the LSB, the simpler the value).
Proposed remedy
Instead of using user-provided seeds directly, hash values prior to seeding. Additionally, generate and discard between several hundred and several thousand values to "warm up" the PRNG.
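A minimal sketch of what this remedy could look like with the C++ standard library's std::mt19937. Everything named here is an assumption for illustration, not the project's code: mixSeed uses the SplitMix64 finalizer as a stand-in for "a hash", kWarmupDraws is an arbitrary value inside the "several hundred to several thousand" range, and makeGenerator is a hypothetical helper name.

```cpp
#include <cstdint>
#include <random>

// Hypothetical hard-coded warm-up length; the thread only gives a range.
constexpr unsigned long long kWarmupDraws = 1000;

// Stand-in hash (assumption): the SplitMix64 finalizer. It maps simple
// inputs such as 1 or 2 to values with well-mixed bits, and it is a
// bijection, so distinct user seeds stay distinct after mixing.
inline std::uint64_t mixSeed(std::uint64_t x) {
    x += 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    return x ^ (x >> 31);
}

// Build a generator from a user-facing seed: hash the seed, expand both
// 32-bit halves of the hash through std::seed_seq, then discard a fixed
// number of outputs to warm the generator up.
inline std::mt19937 makeGenerator(std::uint32_t userSeed) {
    const std::uint64_t mixed = mixSeed(userSeed);
    std::seed_seq seq{static_cast<std::uint32_t>(mixed),
                      static_cast<std::uint32_t>(mixed >> 32)};
    std::mt19937 gen(seq);
    gen.discard(kWarmupDraws);
    return gen;
}
```

Because the mixing function and the warm-up count are both fixed at compile time, calling makeGenerator with the same user seed should reproduce the same stream, which is the property asked about for testing. Supplying a different user seed starts a different run.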