ag-csw / LDStreamHMMLearn


Heatmap Averaging #8

Closed alexlafleur closed 7 years ago

alexlafleur commented 7 years ago

Run the overall heatmap creation routine 8 times and average the errors and performances. Then, create the plots with the averaged values.
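
For reference, a minimal sketch of how this averaging could look, assuming each run yields a 3x3 error matrix and a 3x3 performance matrix (the `run_heatmap_evaluation` name is a placeholder, not the actual routine):

```python
import numpy as np

NUM_RUNS = 8

def run_heatmap_evaluation():
    """Hypothetical stand-in for one full heatmap run; the real routine
    should return one 3x3 error matrix and one 3x3 performance matrix."""
    return np.random.rand(3, 3), np.random.rand(3, 3)

errors, performances = [], []
for _ in range(NUM_RUNS):
    err, perf = run_heatmap_evaluation()
    errors.append(err)
    performances.append(perf)

# Element-wise mean over the 8 repetitions; the averaged matrices are
# what the plotting code would then receive.
avg_error = np.mean(errors, axis=0)
avg_performance = np.mean(performances, axis=0)
```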

alexlafleur commented 7 years ago

Averaged plots (attached images): performance, error

greenTara commented 7 years ago

I see two surprising things:

  1. There appears to be no significant difference in performance between the Naive and Bayesian algorithms. Could you please show the numerical values behind the plots, so we can see whether this is actually the case?
  2. For fixed taumeta, the error appears to be smallest for the same parameter choice that minimizes the performance time. That seems too good to be true; normally these two quantities vary in opposite directions and one has to make a tradeoff.
alexlafleur commented 7 years ago

Eta performance naive:

```
[[ -5.67276586  -6.65562545  -7.68345096]
 [ -6.6825258   -7.69764958  -8.90033415]
 [ -7.71699421  -8.90548707 -11.56691901]]
```

Eta performance bayes:

```
[[ -5.76415246  -6.81918831  -7.96662801]
 [ -6.8397379   -7.97088279  -9.37045016]
 [ -7.98268117  -9.37863581 -12.09885512]]
```

Eta error naive:

```
[[-3.19964899 -4.00276063 -4.95918649]
 [-3.68297639 -4.68238457 -5.4682157 ]
 [-4.19152958 -5.08508432 -5.94330871]]
```

Eta error bayes:

```
[[-3.66855202 -4.46536667 -5.3170029 ]
 [-4.16625961 -5.06696069 -5.63294027]
 [-4.57616182 -5.39716066 -5.99084621]]
```

Scale_window performance naive:

```
[[ -6.65727742  -7.63530568  -8.65848799]
 [ -6.66343597  -7.70674388  -8.90521822]
 [ -6.74791397  -7.9743538  -11.00771412]]
```

Scale_window performance bayes:

```
[[ -6.74502218  -7.775935    -8.90656262]
 [ -6.82295535  -7.97742595  -9.38829412]
 [ -7.03949023  -8.47686643 -11.67118212]]
```

Scale_window error naive:

```
[[-3.18580651 -4.11089865 -5.02618779]
 [-3.7177787  -4.54509814 -5.61976812]
 [-4.21837245 -5.252896   -6.40116568]]
```

Scale_window error bayes:

```
[[-3.66800127 -4.62003199 -5.42340113]
 [-4.1173274  -4.88579641 -6.01140393]
 [-4.63576684 -5.59314791 -6.39406917]]
```

Num_trajectories performance naive:

```
[[-4.70122135 -5.62238885 -6.58985166]
 [-5.61857081 -6.55881553 -7.49356709]
 [-6.53263813 -7.46476206 -8.381951  ]]
```

Num_trajectories performance bayes:

```
[[-4.73996472 -5.71876718 -6.73731855]
 [-5.70681172 -6.69181765 -7.74223677]
 [-6.66076056 -7.69624795 -8.79946102]]
```

Num_trajectories error naive:

```
[[-2.59853438 -3.50332731 -4.4241026 ]
 [-3.15195022 -4.054486   -5.00327286]
 [-3.73242843 -4.58349257 -5.61644714]]
```

Num_trajectories error bayes:

```
[[-3.13727273 -4.02582357 -4.9991022 ]
 [-3.64323813 -4.50004864 -5.47845725]
 [-4.19277102 -5.01440484 -6.06743011]]
```
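
To make the Naive/Bayes gap easier to judge than eyeballing the raw matrices, one quick option (a sketch only, shown here for the Eta performance matrices above):

```python
import numpy as np

# Eta performance matrices copied from above (Naive vs. Bayes).
naive = np.array([[-5.67276586, -6.65562545, -7.68345096],
                  [-6.6825258,  -7.69764958, -8.90033415],
                  [-7.71699421, -8.90548707, -11.56691901]])
bayes = np.array([[-5.76415246, -6.81918831, -7.96662801],
                  [-6.8397379,  -7.97088279, -9.37045016],
                  [-7.98268117, -9.37863581, -12.09885512]])

# Element-wise gap between the two algorithms for each parameter combination.
print(bayes - naive)
```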

alexlafleur commented 7 years ago

Yes... I remember that it was like that in the beginning. Some plots in issue #3 also show that difference in performance and error. Strange; we must have changed something so that the values are returned in the wrong order. Last week we changed the matrix indexing (the order of the "one" and "two" variables). I'll look into that before our talk later.

alexlafleur commented 7 years ago

Is it possible that the problem comes from the adjustment we made to the time calculation? We realized that the time was accumulated across runs and we only took the last value from the array of times (i.e. the total time of all runs together)... For the error we take the average error.

I think we should instead plot either the slope of the accumulated times (which roughly form a straight line) or the average time of each individual run.

What do you think?
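
For concreteness, a minimal sketch of the two options (made-up cumulative timings, not the actual plotting code):

```python
import numpy as np

# Example cumulative timings: times[i] = total time accumulated up to and including run i.
times = np.array([0.12, 0.25, 0.36, 0.49, 0.61])
runs = np.arange(1, len(times) + 1)

# Option 1: slope of the cumulative times from a linear fit,
# i.e. the average time added per run.
slope, _ = np.polyfit(runs, times, 1)

# Option 2: average time of each individual run, recovered by
# differencing the cumulative array (the first entry is run 1 itself).
per_run = np.diff(times, prepend=0.0)
avg_time = per_run.mean()

print(slope, avg_time)  # both should agree when accumulation is roughly linear
```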

alexlafleur commented 7 years ago

As an update: here are the performance and error plots computed using the time slope (attached images): performance, error

greenTara commented 7 years ago

I want to compare the cumulative time for the calculation over the complete dataset. For the naive algorithm, this should go roughly like numsteps * window_size, since there are numsteps estimations and each one is linear in the window size. The Bayes calculation should go roughly like window_size + numsteps * slide (or whatever we renamed it to), where in general slide is much smaller than the window size. So when numsteps is more than 1 and slide is less than window_size, there should be a reduction in the performance time for Bayes relative to Naive.
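
As a rough back-of-the-envelope check of that scaling (a sketch with made-up parameter values, not the actual implementation):

```python
# Cost model from the comment above (arbitrary units):
#   naive: one estimation per step, each linear in the window size
#   bayes: one full-window estimation, then one update per step linear in the slide
window_size = 1000
numsteps = 50
slide = 10

naive_cost = numsteps * window_size            # ~50,000
bayes_cost = window_size + numsteps * slide    # ~1,500

print(naive_cost / bayes_cost)  # expected speed-up factor for Bayes over Naive
```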


greenTara commented 7 years ago

Superseded by #22.