colesbury / sudopy-python3

Peter Norvig's Sudoku solver (in Python 3)
1 stars 2 forks source link

speeding-up for one sudoku (the hard1) #1

Closed stonebig closed 6 months ago

stonebig commented 1 year ago

Hi,

Parallelising N sudoku is an obvious option for nogil.

But would it be possible to make one sudoku faster with an also simple change ?

I tried this, but my threading skill/muscle seems not to get it the right: I blew up my swap space in no time, needing to reboot.

def search(values):
    "Using depth-first search and propagation, try all possible values."
    if values is False:
        return False ## Failed earlier
    if all(len(values[s]) == 1 for s in squares):
        return values ## Solved!
    ## Chose the unfilled square s with the fewest possibilities

    n,s = min((len(values[s]), s) for s in squares if len(values[s]) > 1)
    ##before:
    #return some(search(assign(values.copy(), s, d))
    #            for d in values[s])

    ##after
    with ThreadPoolExecutor(max_workers=4) as e:
       l = e.map(search,(assign(values.copy(), s, d)
                for d in values[s]))
       return some(l)

trying with

solve_all([hard1], "easy", 0)
stonebig commented 1 year ago

on windows10, feeding the hardest.txt with 500 lines of "8..........36......7..9.2...5...7.......457.....1...3...1....68..85...1..9....4..":

stonebig commented 1 year ago

made a no-install version for the believers, auto-testing on 1, 2, 4, 8, 16 threads

hard2  = '8..........36......7..9.2...5...7.......457.....1...3...1....68..85...1..9....4..'    

if __name__ == '__main__':
    test()
    nbsudoku = 20
    thread_list = ( 1, 2, 4, 8, 16)
    for nbthreads in thread_list:
        startall = time.time()
        solve_all([hard2]*nbsudoku, "hard2", None, nbthreads=nbthreads) 
        print(f'solved {nbsudoku} tests with {nbthreads} threads in {time.time()-startall:.2f} seconds' + '\n')

On a personal test on a old intel cpu 2cpu/4threads Windows 10, with nogil python of 2022-12-22:

So on Windows, with nogil python of 2022-12-22:

stonebig commented 1 year ago

I see Peter Norvig did a Jupyterlab version, with more modern Python coding and Typing

I'll look if it changes things https://colab.research.google.com/github/norvig/pytudes/blob/main/ipynb/Sudoku.ipynb#scrollTo=W8iYmI_560L4

stonebig commented 1 year ago

On a windows 11 4/4 cpu, nogil looks 1.8x faster than standard

image

image

stonebig commented 1 year ago

did modified to get more easy to read results:

nogil

All tests pass.
On a 8 cpu 3.9.10 (heads/nogil:258cab1, Dec 22 2022, 04:07:29)
[nogil, [MSC v.1929 64 bit (AMD64)]]:
Solved 330 of 330 hardest puzzles (avg 0.01 secs (174 Hz), max 0.02 secs).
solved 330 tests with 1 threads in 1.97 seconds, 1.00 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.01 secs (146 Hz), max 0.02 secs).
solved 330 tests with 2 threads in 1.16 seconds, 1.70 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.01 secs (113 Hz), max 0.03 secs).
solved 330 tests with 4 threads in 0.75 seconds, 2.62 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.02 secs (65 Hz), max 0.05 secs).
solved 330 tests with 8 threads in 0.66 seconds, 2.99 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.03 secs (34 Hz), max 0.09 secs).
solved 330 tests with 16 threads in 0.66 seconds, 2.99 speed-up 

normal python-3.9

All tests pass.
On a 8 cpu 3.9.8 (tags/v3.9.8:bb3fdcf, Nov  5 2021, 20:48:33) [MSC v.1929 64 bit (AMD64)]:
Solved 330 of 330 hardest puzzles (avg 0.00 secs (314 Hz), max 0.02 secs).
solved 330 tests with 1 threads in 1.16 seconds, 1.00 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.01 secs (147 Hz), max 0.09 secs).
solved 330 tests with 2 threads in 1.16 seconds, 1.00 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.01 secs (77 Hz), max 0.27 secs).
solved 330 tests with 4 threads in 1.16 seconds, 1.00 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.02 secs (47 Hz), max 0.55 secs).
solved 330 tests with 8 threads in 1.19 seconds, 0.97 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.02 secs (54 Hz), max 0.57 secs).
solved 330 tests with 16 threads in 1.18 seconds, 0.99 speed-up 

python-3.12a03 (notice the small speed-up vs 3.9 on this apparently worst-case example)

All tests pass.
On a 8 cpu 3.12.0a3 (tags/v3.12.0a3:b6bd7ff, Dec  6 2022, 20:19:00) [MSC v.1934 64 bit (AMD64)]:
Solved 330 of 330 hardest puzzles (avg 0.00 secs (305 Hz), max 0.02 secs).
solved 330 tests with 1 threads in 1.13 seconds, 1.00 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.01 secs (156 Hz), max 0.08 secs).
solved 330 tests with 2 threads in 1.11 seconds, 1.01 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.01 secs (84 Hz), max 0.24 secs).
solved 330 tests with 4 threads in 1.13 seconds, 1.00 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.02 secs (49 Hz), max 0.36 secs).
solved 330 tests with 8 threads in 1.11 seconds, 1.01 speed-up 

Solved 330 of 330 hardest puzzles (avg 0.02 secs (62 Hz), max 0.36 secs).
solved 330 tests with 16 threads in 1.12 seconds, 1.01 speed-up 
stonebig commented 6 months ago

testing on Python-3.13.0b1 'free-threaded' vs 'normal' Python. (replacing 20 per 80 to have a better timing)

testing python-3.13.0b1 'free-threaded' sudoku on a Windows 4C/8T cpu:

C:\WinP\bdDocs\docs.nogil>python C:\WinP\bdDocs\docs.nogil\sudoku_thread_perf_comparison_typed.py
Solved 80 of 80 hard2 puzzles (avg 0.02 secs (47 Hz), max 0.03 secs).
solved 80 tests with 1 threads in 1.68 seconds

Solved 80 of 80 hard2 puzzles (avg 0.04 secs (24 Hz), max 0.10 secs).
solved 80 tests with 2 threads in 1.67 seconds

Solved 80 of 80 hard2 puzzles (avg 0.08 secs (12 Hz), max 0.24 secs).
solved 80 tests with 4 threads in 1.67 seconds

Solved 80 of 80 hard2 puzzles (avg 0.14 secs (7 Hz), max 0.52 secs).
solved 80 tests with 8 threads in 1.70 seconds

Solved 80 of 80 hard2 puzzles (avg 0.06 secs (15 Hz), max 0.28 secs).
solved 80 tests with 16 threads in 1.68 seconds

C:\WinP\bdDocs\docs.nogil>python3.13t.exe C:\WinP\bdDocs\docs.nogil\sudoku_thread_perf_comparison_typed.py
Solved 80 of 80 hard2 puzzles (avg 0.03 secs (29 Hz), max 0.04 secs).
solved 80 tests with 1 threads in 2.71 seconds

Solved 80 of 80 hard2 puzzles (avg 0.04 secs (25 Hz), max 0.05 secs).
solved 80 tests with 2 threads in 1.59 seconds

Solved 80 of 80 hard2 puzzles (avg 0.05 secs (18 Hz), max 0.07 secs).
solved 80 tests with 4 threads in 1.09 seconds

Solved 80 of 80 hard2 puzzles (avg 0.09 secs (10 Hz), max 0.11 secs).
solved 80 tests with 8 threads in 0.94 seconds

Solved 80 of 80 hard2 puzzles (avg 0.18 secs (5 Hz), max 0.25 secs).
solved 80 tests with 16 threads in 0.95 seconds