Lizzie 0.6 doesn't analyze with 2x RTX 2080 Ti GPUs = Bug

I'm using Leela Zero 0.17 + AutoGTP v18. With one GPU it works fine.

But with two GPUs it doesn't work. Lizzie 0.6 opens, but it stays stuck on "Leela Zero is loading...". I can still use the X button and place moves on the board.

{
  "leelaz": {
    "max-analyze-time-minutes": 60,
    "analyze-update-interval-centisec": 10,
    "network-file": "network.gz",
    "max-game-thinking-time-seconds": 2,
    "engine-start-location": ".",
    "engine-command": "./leela-zero/leelaz --gtp --lagbuffer 0 --weights %network-file --gpu 0 --gpu 1",
    "print-comms": false
  },
  "ui": {
    "comment-font-size": 0,
    "board-color": [217, 152, 77],
    "shadow-size": 100,
    "show-winrate": true,
    "autosave-interval-seconds": -1,
    "append-winrate-to-comment": true,
    "fancy-board": true,
    "show-captured": true,
    "weighted-blunder-bar-height": false,
    "--gpu 0 --gpu 1 --gpu 2 --gpu 3": true,
    "win-rate-always-black": false,
    "show-move-number": true,
    "winrate-stroke-width": 3,
    "show-next-moves": true,
    "show-comment": true,
    "show-leelaz-variation": true,
    "theme": "default",
    "min-playout-ratio-for-stats": 0,
    "fancy-stones": true,
    "resume-previous-game": false,
    "window-size": [3840, 2160],
    "new-move-number-in-branch": true,
    "shadows-enabled": true,
    "show-variation-graph": true,
    "show-dynamic-komi": true,
    "minimum-blunder-bar-width": 3,
    "large-winrate": false,
    "show-blunder-bar": true,
    "only-last-move-number": 1,
    "confirm-exit": false,
    "show-status": true,
    "handicap-instead-of-winrate": false,
    "large-subboard": false,
    "dynamic-winrate-graph-width": true,
    "show-subboard": true,
    "window-maximized": true,
    "show-best-moves": true,
    "board-size": 19
  }
}

What happens when you run that Leelaz command from a command line?

I don't know if this is correct: Z:\LG0\Lizzie\leela-zero\leelaz.exe

A small black window opened for half a second.

This one:

Z:\LG0\Lizzie\leela-zero\leelaz.exe --gtp --lagbuffer 0 --weights %network-file --gpu 0 --gpu 1

Z:\LG0\Lizzie\leela-zero\leelaz.exe --gtp --lagbuffer 0 --weights %network-file --gpu 0 --gpu 1

A small black window opened for half a second.

Oh, oops, you need to specify an actual weights file there (%network-file is only filled in when Lizzie launches the engine). I can get back to you with a solution tomorrow.
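When the command is typed by hand, --weights receives the literal text %network-file. The following is a minimal sketch, assuming Lizzie fills that placeholder from the "network-file" config value by plain text substitution, of how to get a command that can be pasted into a terminal. The function name expand_engine_command is hypothetical, and the config is trimmed to the two keys that matter here.

```python
import json


def expand_engine_command(config_text):
    """Replace the %network-file placeholder with the configured weights file.

    Assumption: a plain string substitution is enough; this is an
    illustration, not Lizzie's actual code.
    """
    leelaz = json.loads(config_text)["leelaz"]
    return leelaz["engine-command"].replace("%network-file", leelaz["network-file"])


# Trimmed copy of the config posted in this thread.
CONFIG = """
{
  "leelaz": {
    "network-file": "network.gz",
    "engine-command": "./leela-zero/leelaz --gtp --lagbuffer 0 --weights %network-file --gpu 0 --gpu 1"
  }
}
"""

print(expand_engine_command(CONFIG))
```

The printed command still has to be run from the folder that contains network.gz, or the weights path has to be adjusted accordingly.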

I think this is what you asked for, right?

Using OpenCL batch size of 5
Using 20 thread(s).
RNG seed: 9659931005586854004
Using per-move time margin of 0.00s.
BLAS Core: Sandybridge
Detecting residual layers...v1...256 channels...40 blocks.
Initializing OpenCL (autodetecting precision).
Detected 2 OpenCL platforms.
Platform version: OpenCL 2.0 AMD-APP (2079.4)
Platform profile: FULL_PROFILE
Platform name: AMD Accelerated Parallel Processing
Platform vendor: Advanced Micro Devices, Inc.
Device ID: 0
Device name: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
Device type: CPU
Device vendor: GenuineIntel
Device driver: 2079.4 (sse2,avx)
Device speed: 3200 MHz
Device cores: 6 CU
Device score: 520
Platform version: OpenCL 1.2 CUDA 10.0.150
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 1
Device name: GeForce RTX 2080 Ti
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 411.70
Device speed: 1545 MHz
Device cores: 68 CU
Device score: 1112
Device ID: 2
Device name: GeForce RTX 2080 Ti
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 411.70
Device speed: 1545 MHz
Device cores: 68 CU
Device score: 1112
Selected platform: AMD Accelerated Parallel Processing
Selected device: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
with OpenCL 2.0 capability.
Half precision compute support: No.
Tensor Core support: No.
Selected platform: NVIDIA CUDA
Selected device: GeForce RTX 2080 Ti
with OpenCL 1.2 capability.
Half precision compute support: No.
Tensor Core support: Yes.
Detected 2 OpenCL platforms.
Platform version: OpenCL 2.0 AMD-APP (2079.4)
Platform profile: FULL_PROFILE
Platform name: AMD Accelerated Parallel Processing
Platform vendor: Advanced Micro Devices, Inc.
Device ID: 0
Device name: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
Device type: CPU
Device vendor: GenuineIntel
Device driver: 2079.4 (sse2,avx)
Device speed: 3200 MHz
Device cores: 6 CU
Device score: 520
Platform version: OpenCL 1.2 CUDA 10.0.150
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 1
Device name: GeForce RTX 2080 Ti
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 411.70
Device speed: 1545 MHz
Device cores: 68 CU
Device score: 1112
Device ID: 2
Device name: GeForce RTX 2080 Ti
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 411.70
Device speed: 1545 MHz
Device cores: 68 CU
Device score: 1112
Selected platform: AMD Accelerated Parallel Processing
Selected device: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
with OpenCL 2.0 capability.
Half precision compute support: No.
Tensor Core support: No.
Selected platform: NVIDIA CUDA
Selected device: GeForce RTX 2080 Ti
with OpenCL 1.2 capability.
Half precision compute support: No.
Tensor Core support: Yes.

Started OpenCL SGEMM tuner.
Will try 290 valid configurations.
(1/290) KWG=32 KWI=2 MDIMA=32 MDIMC=32 MWG=64 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=2 VWN=4 80.8373 ms (7.3 GFLOPS)
(3/290) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=2 VWN=2 56.7485 ms (10.4 GFLOPS)
(15/290) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=16 NDIMC=16 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=2 46.0721 ms (12.8 GFLOPS)
(37/290) KWG=16 KWI=8 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=2 VWN=4 45.3389 ms (13.0 GFLOPS)
(97/290) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=2 VWN=2 44.9333 ms (13.1 GFLOPS)
(168/290) KWG=32 KWI=8 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=2 VWN=2 43.7751 ms (13.5 GFLOPS)
Wavefront/Warp size: 1
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 1024

Started OpenCL SGEMM tuner.
Will try 290 valid configurations.
(1/290) KWG=32 KWI=2 MDIMA=32 MDIMC=32 MWG=64 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=2 VWN=4 0.1471 ms (4008.9 GFLOPS)
(6/290) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=2 VWN=4 0.1422 ms (4146.7 GFLOPS)
(8/290) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=32 NDIMB=16 NDIMC=16 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=2 VWN=4 0.1258 ms (4689.5 GFLOPS)
(10/290) KWG=16 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=16 NDIMC=16 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=2 0.1218 ms (4841.3 GFLOPS)
(18/290) KWG=32 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=32 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=2 0.1106 ms (5332.2 GFLOPS)
(22/290) KWG=32 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=16 NDIMC=16 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=4 0.1092 ms (5402.9 GFLOPS)
(26/290) KWG=32 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=2 0.1069 ms (5518.1 GFLOPS)
(29/290) KWG=16 KWI=8 MDIMA=16 MDIMC=16 MWG=32 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=2 VWN=4 0.1063 ms (5551.0 GFLOPS)
(31/290) KWG=16 KWI=8 MDIMA=8 MDIMC=8 MWG=32 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=2 0.0981 ms (6013.2 GFLOPS)
(34/290) KWG=16 KWI=8 MDIMA=8 MDIMC=8 MWG=32 NDIMB=16 NDIMC=16 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=4 0.0947 ms (6230.2 GFLOPS)
(39/290) KWG=16 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=16 NDIMC=16 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=4 0.0879 ms (6708.6 GFLOPS)
(46/290) KWG=16 KWI=2 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=2 0.0799 ms (7379.4 GFLOPS)
(108/290) KWG=16 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=2 0.0765 ms (7708.9 GFLOPS)
(178/290) KWG=32 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=4 0.0760 ms (7756.8 GFLOPS)
(227/290) KWG=32 KWI=2 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=4 0.0752 ms (7843.4 GFLOPS)
(240/290) KWG=16 KWI=2 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=4 0.0725 ms (8130.6 GFLOPS)
(283/290) KWG=16 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=4 0.0722 ms (8172.0 GFLOPS)
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64

Started OpenCL SGEMM tuner.
Will try 290 valid configurations.
Failed to compile: 290 kernels.
Failed to find a working configuration.
Check your OpenCL drivers.
Minimum error: 100.000000. Error bound: 0.100000
Using OpenCL single precision (half precision failed to run).
Detected 2 OpenCL platforms.
Platform version: OpenCL 2.0 AMD-APP (2079.4)
Platform profile: FULL_PROFILE
Platform name: AMD Accelerated Parallel Processing
Platform vendor: Advanced Micro Devices, Inc.
Device ID: 0
Device name: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
Device type: CPU
Device vendor: GenuineIntel
Device driver: 2079.4 (sse2,avx)
Device speed: 3200 MHz
Device cores: 6 CU
Device score: 520
Platform version: OpenCL 1.2 CUDA 10.0.150
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 1
Device name: GeForce RTX 2080 Ti
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 411.70
Device speed: 1545 MHz
Device cores: 68 CU
Device score: 1112
Device ID: 2
Device name: GeForce RTX 2080 Ti
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 411.70
Device speed: 1545 MHz
Device cores: 68 CU
Device score: 1112
Selected platform: AMD Accelerated Parallel Processing
Selected device: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
with OpenCL 2.0 capability.
Half precision compute support: No.
Tensor Core support: No.
Selected platform: NVIDIA CUDA
Selected device: GeForce RTX 2080 Ti
with OpenCL 1.2 capability.
Half precision compute support: No.
Tensor Core support: Yes.
Loaded existing SGEMM tuning.
Wavefront/Warp size: 1
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 1024
Loaded existing SGEMM tuning.
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
Setting max tree size to 3660 MiB and cache size to 406 MiB.

Yes that's right. What was the exact command you used?

{
  "leelaz": {
    "max-analyze-time-minutes": 60,
    "analyze-update-interval-centisec": 10,
    "network-file": "network.gz",
    "max-game-thinking-time-seconds": 2,
    "engine-start-location": ".",
    "engine-command": "./leela-zero/leelaz --gtp --lagbuffer 0 --weights %network-file --gpu 1 --gpu 2",
    "print-comms": false
  },
  "ui": {
    "comment-font-size": 0,
    "board-color": [217, 152, 77],
    "shadow-size": 100,
    "show-winrate": true,
    "autosave-interval-seconds": -1,
    "append-winrate-to-comment": true,
    "fancy-board": true,
    "show-captured": true,
    "weighted-blunder-bar-height": false,
    "--gpu 0 --gpu 1 --gpu 2 --gpu 3": true,
    "win-rate-always-black": false,
    "show-move-number": true,
    "winrate-stroke-width": 3,
    "show-next-moves": true,
    "show-comment": true,
    "show-leelaz-variation": true,
    "theme": "default",
    "min-playout-ratio-for-stats": 0,
    "fancy-stones": true,
    "resume-previous-game": false,
    "window-size": [3840, 2160],
    "new-move-number-in-branch": true,
    "shadows-enabled": true,
    "show-variation-graph": true,
    "show-dynamic-komi": true,
    "minimum-blunder-bar-width": 3,
    "large-winrate": false,
    "show-blunder-bar": true,
    "only-last-move-number": 1,
    "confirm-exit": false,
    "show-status": true,
    "handicap-instead-of-winrate": false,
    "large-subboard": false,
    "dynamic-winrate-graph-width": true,
    "show-subboard": true,
    "window-maximized": true,
    "show-best-moves": true,
    "board-size": 19
  }
}

Problem fixed :)

It needs to be --gpu 1 and --gpu 2, not --gpu 0 and --gpu 1.

In the leelaz log, device 1 and device 2 are the two RTX 2080 Ti cards.

Device 0 is the CPU (the i7-3930K, listed under the AMD OpenCL platform), so --gpu 0 selects the CPU!

Using --gpu 0 and --gpu 1 therefore means running on the CPU plus one GPU, and this does not work.
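For anyone hitting the same thing, here is a small sketch of how the right --gpu values could be read off the leelaz startup log instead of guessed. It assumes the log contains "Device ID:" and "Device type:" fields as in the output posted in this thread; the function names gpu_device_ids and gpu_flags are made up for this illustration and are not part of Leela Zero or Lizzie.

```python
import re


def gpu_device_ids(log_text):
    """Return the OpenCL device IDs reported with 'Device type: GPU'.

    The device list is printed several times in the log, so IDs are
    de-duplicated while keeping their first-seen order.
    """
    # Pair each "Device ID" with the first "Device type" that follows it.
    pattern = re.compile(r"Device ID:\s*(\d+).*?Device type:\s*(\w+)", re.DOTALL)
    ids = []
    for dev_id, dev_type in pattern.findall(log_text):
        dev_id = int(dev_id)
        if dev_type == "GPU" and dev_id not in ids:
            ids.append(dev_id)
    return ids


def gpu_flags(log_text):
    """Build the --gpu arguments for every GPU device found in the log."""
    return " ".join(f"--gpu {i}" for i in gpu_device_ids(log_text))


# Shortened excerpt of the device list from this thread.
SAMPLE_LOG = """\
Device ID: 0
Device name: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
Device type: CPU
Device ID: 1
Device name: GeForce RTX 2080 Ti
Device type: GPU
Device ID: 2
Device name: GeForce RTX 2080 Ti
Device type: GPU
"""

print(gpu_flags(SAMPLE_LOG))  # prints: --gpu 1 --gpu 2
```

The numbering depends on which OpenCL platforms are installed, so on a machine without a CPU OpenCL driver the GPUs may well start at 0.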

featurecat / lizzie

Lizzie 0.6 doesn't analyze with 2x RTX 2080 Ti GPUs = Bug #530