brettc / partitionfinder

PartitionFinder discovers optimal partitioning schemes for DNA sequences.

Cache site rates? #35

Closed pbfrandsen closed 8 years ago

pbfrandsen commented 9 years ago

Right now, the site rates aren't cached or saved. We should probably cache them in case the run is interrupted and the user has to restart. Note that this might not always allow the user to 'skip' to the exact place that the algorithm left off since the k-means algorithm isn't guaranteed to converge on the same solution, even given identical rates.
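A minimal sketch of what such a cache could look like, assuming the site rates are a plain list of floats and can be stored as JSON. The function name `load_or_compute_site_rates`, the `compute_rates` callback and the cache path are hypothetical and not part of PartitionFinder:

```python
import json
import os
import tempfile


def load_or_compute_site_rates(cache_path, compute_rates):
    """Return cached site rates if the cache file exists, else compute and save them.

    `compute_rates` is a hypothetical zero-argument callable that returns
    an iterable of per-site rates.
    """
    if os.path.exists(cache_path):
        with open(cache_path) as handle:
            return json.load(handle)

    rates = list(compute_rates())

    # Write to a temporary file in the same directory, then rename it,
    # so an interrupted run does not leave a half-written cache behind.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(rates, handle)
    os.replace(tmp_path, cache_path)
    return rates
```

This only saves the cost of recomputing the rates. It does not by itself make the k-means step land on the same clusters after a restart.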

cmayer commented 9 years ago

Hi all,

As Rob noted, a seed value for the random number generator would make a run reproducible.

The site-rates partitioning could always continue from the last partitioning scheme that was accepted. Suggestion: if the user asks to continue a run, the old seed, which would have to be stored in the cache, should be reused.

Best, Christoph

Christoph Mayer Forschungsmuseum Alexander Koenig Bonn Email c.mayer.zfmk@uni-bonn.de Tel.: 0228 9122403
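The suggestion could be sketched roughly as follows, assuming the seed is kept in a small JSON file next to the other cached data. The names `get_run_seed`, `seed_path` and `continue_run` are made up for illustration:

```python
import json
import os
import random


def get_run_seed(seed_path, continue_run):
    """Reuse the stored seed when continuing a run, else create and store a new one."""
    if continue_run and os.path.exists(seed_path):
        with open(seed_path) as handle:
            return json.load(handle)["seed"]

    # Fresh run: draw a new seed from the OS entropy source and remember it.
    seed = random.SystemRandom().randrange(2**32)
    with open(seed_path, "w") as handle:
        json.dump({"seed": seed}, handle)
    return seed
```

The returned value would then be passed to whatever generator the clustering uses, for example `random.Random(seed)`. Note that reusing a seed replays the random stream from its beginning, so a continued run matches the original only if it repeats the same draws or if the generator state is saved as well.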


brettc commented 9 years ago

... so we need to add a test that ensures restarting delivers the same results as simply running the whole process through.
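A rough sketch of the shape such a test could take, using a toy stand-in for the real search. `run_steps` and its checkpoint dictionary are hypothetical; the real test would drive PartitionFinder itself and compare the resulting schemes:

```python
import random


def run_steps(seed, n_steps, checkpoint=None, stop_after=None):
    """Toy stand-in for a search: each step appends one random draw.

    Returns (results, checkpoint). The checkpoint holds the results so far
    and the generator state, which is what a restart needs to resume.
    """
    rng = random.Random(seed)
    results = []
    if checkpoint is not None:
        results = list(checkpoint["results"])
        rng.setstate(checkpoint["rng_state"])

    for step in range(len(results), n_steps):
        if stop_after is not None and step >= stop_after:
            break  # simulate an interrupted run
        results.append(rng.random())

    return results, {"results": list(results), "rng_state": rng.getstate()}


def test_restart_matches_full_run():
    full, _ = run_steps(seed=42, n_steps=10)
    partial, saved = run_steps(seed=42, n_steps=10, stop_after=4)
    resumed, _ = run_steps(seed=42, n_steps=10, checkpoint=saved)
    assert len(partial) == 4
    assert resumed == full
```

The point of the comparison is that the interrupted-then-resumed run and the uninterrupted run must end in exactly the same state, not merely a similar one.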


roblanf commented 9 years ago

and perhaps make this test single-threaded for now (see --seed issue, where this is discussed)


Rob Lanfear School of Biological Sciences, Macquarie University, Sydney

phone: +61 (0)2 9850 8204

www.robertlanfear.com

cmayer commented 9 years ago

Hi all,

Do you know this?

Basically it would cure all problems we have with non-deterministic behaviour. :)

Best, Christoph





roblanf commented 8 years ago

Closed, because we no longer use site rates. And entropy is so fast to calculate that there's practically nothing to be gained from caching.
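To illustrate why caching buys little here, below is a minimal sketch that assumes "entropy" means the per-site Shannon entropy of each alignment column. That reading, and the function names, are assumptions rather than PartitionFinder's actual code:

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the characters in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def site_entropies(sequences):
    """Entropy of every site, given equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*sequences)]
```

Under this assumption each site needs a single pass over its column, so the cost is linear in the size of the alignment.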