sharkdp / pastel

A command-line tool to generate, analyze, convert and manipulate colors
Apache License 2.0

Improve and optimize "pastel distinct" #67

Open sharkdp opened 5 years ago

sharkdp commented 5 years ago

optimize:

sharkdp commented 5 years ago

@d-dorazio has implemented the major part of this here: #83

sharkdp commented 5 years ago

I have added a hidden --print-minimal-distance option for debugging/inspection.

To gain some more insight, I wrote a small Python script which runs pastel distinct N (for N = 2…13) many times to collect statistics about the minimal distance.
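The script itself wasn't posted, but it could look roughly like the sketch below. This is a hypothetical reconstruction: it assumes that --print-minimal-distance makes pastel print a single floating-point number, and the run count of 100 is an arbitrary choice — the actual script and output format may differ.

```python
# Hypothetical sketch of the statistics script (assumes that
# `pastel distinct N --print-minimal-distance` prints a single float;
# the real output format of the hidden flag may differ).
import statistics
import subprocess


def minimal_distance(n: int) -> float:
    """Run `pastel distinct n` once and return the reported minimal distance."""
    out = subprocess.run(
        ["pastel", "distinct", str(n), "--print-minimal-distance"],
        capture_output=True, text=True, check=True,
    ).stdout
    return float(out.strip())


def summarize(samples: list[float]) -> str:
    """Format max/min/mean of the collected distances like the tables here."""
    best, worst, mean = max(samples), min(samples), statistics.mean(samples)
    return (f"max: {best:7.3f}, min: {worst:7.3f} ({100 * worst / best:5.1f}%), "
            f"mean: {mean:7.3f} ({100 * mean / best:5.1f}%)")


if __name__ == "__main__":
    for n in range(2, 14):
        samples = [minimal_distance(n) for _ in range(100)]
        print(f"[{n:03d}] {summarize(samples)}")
```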

Here is the current version:

[002] max: 258.693, min: 258.675 (100.0%), mean: 258.693 (100.0%)
[003] max: 171.247, min: 171.199 (100.0%), mean: 171.244 (100.0%)
[004] max: 134.193, min: 134.139 (100.0%), mean: 134.180 (100.0%)
[005] max: 111.769, min: 107.472 ( 96.2%), mean: 109.117 ( 97.6%)
[006] max: 103.226, min: 103.180 (100.0%), mean: 103.218 (100.0%)
[007] max:  93.686, min:  90.554 ( 96.7%), mean:  92.303 ( 98.5%)
[008] max:  87.813, min:  85.984 ( 97.9%), mean:  87.488 ( 99.6%)
[009] max:  83.054, min:  80.084 ( 96.4%), mean:  82.133 ( 98.9%)
[010] max:  79.765, min:  79.651 ( 99.9%), mean:  79.721 ( 99.9%)
[011] max:  74.069, min:  71.182 ( 96.1%), mean:  73.915 ( 99.8%)
[012] max:  70.588, min:  67.531 ( 95.7%), mean:  69.774 ( 98.8%)
[013] max:  66.987, min:  64.273 ( 95.9%), mean:  66.091 ( 98.7%)

max shows the best achieved score (i.e. the maximum of all maximized minimal distances). min shows the worst achieved score. mean is the average across all runs. Interestingly, there are some low-N cases like N=5 or N=7 which seem particularly hard (low mean and min values). As far as I can tell, the algorithm seems to get stuck in a local minimum (it looks like this in the --verbose output).

Example (N=5):

This solution is close to the optimum (rearranged):
#ff0800, #0400ff, #230025, #ffff00, #00ffff (D_min = 111.77)

But this solution comes up often (rearranged):
#ff0000, #0000ff, #000030, #ffffe2, #00ff00 (D_min = 107.53)
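For anyone who wants to reproduce numbers like D_min by hand: the sketch below computes the minimal pairwise distance of a palette using plain Euclidean distance in CIE Lab (D65). This is only an illustration of the quantity being maximized — pastel may use a more sophisticated ΔE formula, so the values need not match the ones above exactly.

```python
# Illustrative only: minimal pairwise color distance D_min of a palette,
# using plain Euclidean distance in CIE Lab (D65 white point). pastel may
# use a different Delta-E formula, so numbers need not match the tables.
import itertools
import math


def srgb_to_lab(hexcolor: str) -> tuple[float, float, float]:
    """Convert an sRGB hex string like '#ff0800' to CIE Lab (D65)."""
    rgb = [int(hexcolor.lstrip("#")[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # sRGB gamma expansion to linear light
    lin = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
           for c in rgb]
    r, g, b = lin
    # linear sRGB -> XYZ (standard sRGB/D65 matrix)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    # XYZ -> Lab
    def f(t: float) -> float:
        return t ** (1 / 3) if t > (6 / 29) ** 3 \
            else t / (3 * (6 / 29) ** 2) + 4 / 29

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def d_min(palette: list[str]) -> float:
    """Minimal pairwise Lab distance: the quantity `pastel distinct` maximizes."""
    return min(math.dist(srgb_to_lab(a), srgb_to_lab(b))
               for a, b in itertools.combinations(palette, 2))
```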

Unfortunately, when I increase the number of iterations (and decrease the cooling rate), the "mean" values improve (which is expected), but similar "min" values still appear:

[002] max: 258.693, min: 258.675 (100.0%), mean: 258.693 (100.0%)
[003] max: 171.247, min: 171.247 (100.0%), mean: 171.247 (100.0%)
[004] max: 134.193, min: 134.153 (100.0%), mean: 134.181 (100.0%)
[005] max: 111.769, min: 107.473 ( 96.2%), mean: 110.262 ( 98.7%)
[006] max: 103.226, min: 103.119 ( 99.9%), mean: 103.215 (100.0%)
[007] max:  93.686, min:  90.576 ( 96.7%), mean:  92.873 ( 99.1%)
[008] max:  87.813, min:  85.743 ( 97.6%), mean:  87.421 ( 99.6%)
[009] max:  83.054, min:  80.857 ( 97.4%), mean:  82.322 ( 99.1%)
[010] max:  79.765, min:  77.684 ( 97.4%), mean:  79.696 ( 99.9%)
[011] max:  74.069, min:  71.165 ( 96.1%), mean:  73.931 ( 99.8%)
[012] max:  70.588, min:  67.757 ( 96.0%), mean:  70.145 ( 99.4%)
[013] max:  67.004, min:  63.047 ( 94.1%), mean:  66.523 ( 99.3%)

This means that users can still get "unlucky" runs.

The best idea I can come up with right now is to perform multiple runs with random initial conditions (starting colors) and then simply pick the best result afterwards. This is also something that could easily be parallelized.
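The restart strategy could be sketched like this. This is a toy model, not pastel's actual optimizer: one_run stands in for a full simulated-annealing run (here it just scores random points by their minimal gap), and threads are used only for brevity.

```python
# Toy sketch of the proposed restart strategy (not pastel's actual code):
# run the optimizer several times from random initial conditions, possibly
# in parallel, and keep the run with the best (largest) minimal distance.
import random
from concurrent.futures import ThreadPoolExecutor


def one_run(seed: int) -> tuple[float, list[float]]:
    """Stand-in for a single simulated-annealing run: returns (score, solution).

    The 'solution' here is just random points in [0, 1] and the score is
    their minimal pairwise gap -- a crude analogue of maximizing the
    minimal color distance. In pastel this would be one full annealing run.
    """
    rng = random.Random(seed)
    xs = sorted(rng.random() for _ in range(5))
    score = min(b - a for a, b in zip(xs, xs[1:]))
    return score, xs


def best_of(n_restarts: int) -> tuple[float, list[float]]:
    """Run independent restarts and keep the best.

    The restarts are embarrassingly parallel; threads are used here for
    brevity, while a CPU-bound optimizer would use a process pool (or
    rayon on the Rust side) instead.
    """
    with ThreadPoolExecutor() as pool:
        return max(pool.map(one_run, range(n_restarts)))
```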

@d-dorazio FYI

danieledapo commented 5 years ago

I think that running the simulated annealing multiple times is fine and probably the best way to avoid getting stuck in a local minimum. Also, this is embarrassingly parallel, which is quite nice.

I'm not too sure whether it makes sense to run the process on a completely different set of starting colors rather than re-run it on the same ones. That's because I think it's less likely to get stuck in a local minimum on the same set of inputs twice vs. different ones, but I'm not too sure; it's just a gut feeling.

sharkdp commented 5 years ago

That's because I think it's less likely to get stuck in a local minimum on the same set of inputs twice vs. different ones

I would think that the exact opposite would be the case. Certainly, if the initial temperature is high enough, it shouldn't really matter.
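The "high enough initial temperature" intuition can be seen directly in the Metropolis acceptance rule that simulated annealing is built on (a generic sketch, not pastel's exact code): at high T the acceptance probability exp(-Δ/T) approaches 1 for any worsening move, so the walk quickly forgets where it started.

```python
# Generic Metropolis acceptance criterion used by simulated annealing
# (a sketch of the standard rule, not pastel's exact implementation).
import math
import random


def accept(delta: float, temperature: float, rng: random.Random) -> bool:
    """Always accept improvements (delta <= 0); accept a worsening move of
    size `delta` with probability exp(-delta / temperature)."""
    return delta <= 0 or rng.random() < math.exp(-delta / temperature)

# At high temperature, exp(-delta/T) -> 1: almost every move is accepted,
# so the initial state is quickly forgotten. As T cools, worsening moves
# are rejected more and more often and the search settles into a basin.
```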

danieledapo commented 4 years ago
- [ ] support something like pastel distinct 8 red blue yellow (fix three colors, generate five new)

I guess this is done given that https://github.com/sharkdp/pastel/pull/88 was merged?