piantado / OneParameterIsAlwaysEnough

code from Piantadosi (2018)
11 stars 4 forks source link

Result not as expected #1

Open rwesson opened 5 years ago

rwesson commented 5 years ago

Having read the really interesting paper describing this work, I wanted to try out the code. I cloned the repository and ran the code without alteration, thus fitting the Miro signature. It took 27 hours to run, and the result did not match the paper's Figure 1 (see attachment). The parameter calculated was 0.0096826... where the paper gives 0.00242654...

So, I am just wondering if the run time is what you would expect, and if you have any idea what might be causing this failure. oneparamfailure

piantado commented 5 years ago

Sorry for the delay, that looks like an issue with the precision in mpmath, can you confirm what version of mpmath you're using? And which python?

rwesson commented 5 years ago

I ran it with python 2.7.12, and mpmath 1.0.0. I have also run it on a resized version of the image (300x177 pixels). It took three hours to run on that, but all y points come out with just the value of the coefficient.

gitkol commented 3 years ago

Hi, @piantado , I hope you are still monitoring this thread. I recently read your fascinating article and experienced the exact same problems that @rwesson mentioned. I looked at the code and found multiple algorithmic and performance issues that caused the problems. I took the liberty to fix the program and attached it here. I added comments to explain it, but briefly here are the main points.

  1. The program ran excessively slowly (my first attempt on miro took also 27 hours) mainly because of an indenting error. Choosing the y value from the on list should be outside the for py loop. The program was processing almost every single pixel even though the vast majority of the pixels were empty.
  2. The phiinv() function should not use mpmath! First, it is exclusively applied to float values so there is really no need for mpmath and normal math is much faster, but more importantly float2binary() does not work properly with mpf values. This is why Robert's results (and mine) were completely wrong. A simple cast to float in the return statement fixes this, but it is best to simply use math. Also note that even if float2binary() was altered to handle mpf values, why process 100s or 1000s of digits when we only need r=8 or so? The simple switch to returning float in phiinv() speeds up the calculation by order(s) of magnitude depending on how many points we want to fit.
  3. The logic in the original code can only handle a single y value per x pixel. IMHO, this is why elephant.png and miro.png both draw their patterns with extremely thick lines, so the algorithm can randomly pick a single y value from multiple values that are on in a single x column (yy = sample(on,1)[0]). This approach does not work with patterns plotted with thin lines. Moreover, this algorithm cannot handle empty columns and must add a fake point (it does it at 0.01). The key to fix this is to introduce a counter as an additional element in the points tuple and for computing 2^(rx) in the logistic map use this counter instead of px, which is still used for plotting and, then, the theta function can handle multiple y values in a single column, plus there is no need to fudge empty columns.

I added a command line input with some options.

$ python plot.py -h
usage: plot.py [-h] imageFile [onCutoff] [Nx] [r] [sampleSize]

positional arguments:
  imageFile   image file
  onCutoff    cutoff to consider pixel lit, 0 (black) < cutoff < 1 (white), default = 0.1
  Nx          effective resolution in horizontal dimension, default = physical resolution
  r           2^{-r} is our precision in binary digits, default = 8
  sampleSize  size of random y samples in the vertical dimension for a given x, >= 0, default=0 means full sample

optional arguments:
  -h, --help  show this help message and exit

Here is the command lines for miro.png and I also attached the resulting plots. Miro took 4 seconds with sample=1, which corresponds to the original run, 27 seconds with sample=3, and 117 seconds with sample=3. Increasing the sample size gives much better quality fitted images but, of course, execution time grows very quickly, it scales O(r*n^2) where n is the total number of "on" y points and r is the number of binary digits. python plot.py miro.png 0.1 750 8 [1,3,5]

Hope you find this useful, I found it very exciting to see your article in action.

Best,

Istvan Kolossvary

plot.py.gz miro-fitted-sample=1 miro-fitted-sample=3 miro-fitted-sample=5

piantado commented 2 years ago

Ah sorry for such a long delay -- I just was responding to someone asking about this code and then saw that I never responded here! Thank you Istvan for the wonderful update. It's much appreciated!

gitkol commented 2 years ago

Hi Steve,

Thanks for your note, glad you and others can use it. :)

Best,

Istvan

On Wed, Oct 6, 2021 at 6:07 PM steve piantadosi @.***> wrote:

Ah sorry for such a long delay -- I just was responding to someone asking about this code and then saw that I never responded here! Thank you Istvan for the wonderful update. It's much appreciated!

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/piantado/OneParameterIsAlwaysEnough/issues/1#issuecomment-937234701, or unsubscribe https://github.com/notifications/unsubscribe-auth/AKDNJ2WQRDOTWTG5G47UDE3UFTCA7ANCNFSM4GASOFJQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.