CuriosAI / sai

SAI: a fork of Leela Zero with variable komi.
GNU General Public License v3.0

run error on i5-4200u notebook #75

Open l1t1 opened 4 years ago

l1t1 commented 4 years ago

Graphics card: AMD Radeon HD 8670M (similar to a GT 730M according to http://www.mydrivers.com/zhuanti/tianti/gpum/index.html). When I run autogtp, it gets the job and then exits with the following message:

        "resignation_percent": "5",
        "visits": "2400"
    },
    "options_hash": "1fc612",
    "random_seed": "8877568699418147789",
    "required_client_version": "16",
    "selfplay_id": "5df1cf7b56f7037a989ac42d"
}

Got new job: selfplay
net: 095eda73769ee81664e54b28a0285b36dad71c9c9ed0a1e56d159803781d9db7.
*ERROR*: Could not talk to engine after launching.

When I run sai -w directly:

E:\sai>sai -w networks/095eda73769ee81664e54b28a0285b36dad71c9c9ed0a1e56d159803781d9db7.gz
Using OpenCL batch size of 5
Using 10 thread(s).
RNG seed: 7539055481323640198
SAI 0.17 release 4 (19x19) is a fork of Leela Zero.
Leela Zero Copyright (C) 2017-2019  Gian-Carlo Pascutto and contributors.
SAI Copyright (C) 2018-2019 SAI Team.
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; see the COPYING file for details.

BLAS Core: built-in Eigen 3.3.7 library.
Version 209 weights file (advanced board features + chain liberties + chain size).
Detecting residual layers... v209
13 input planes, 1 input moves
192 channels... 9 blocks
2 policy outputs. Double value head. Type Y.
Common convolution: 5 outputs.
Alpha head: 384 channels. Beta head: 256 channels.
Initializing OpenCL (autodetecting precision).
Detected 2 OpenCL platforms.
Platform version: OpenCL 1.2 AMD-APP (1124.2)
Platform profile: FULL_PROFILE
Platform name:    AMD Accelerated Parallel Processing
Platform vendor:  Advanced Micro Devices, Inc.
Device ID:     0
Device name:   Hainan
Device type:   GPU
Device vendor: Advanced Micro Devices, Inc.
Device driver: 1124.2 (VM)
Device speed:  975 MHz
Device cores:  5 CU
Device score:  1112
Device ID:     1
Device name:   Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz
Device type:   CPU
Device vendor: GenuineIntel
Device driver: 1124.2 (sse2,avx)
Device speed:  2494 MHz
Device cores:  2 CU
Device score:  512
Platform version: OpenCL 1.2
Platform profile: FULL_PROFILE
Platform name:    Intel(R) OpenCL
Platform vendor:  Intel(R) Corporation
Device ID:     2
Device name:   Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz
Device type:   CPU
Device vendor: Intel(R) Corporation
Device driver: 1.2
Device speed:  2500 MHz
Device cores:  2 CU
Device score:  512
Selected platform: AMD Accelerated Parallel Processing
Selected device: Hainan
with OpenCL 1.2 capability.
Half precision compute support: No.
Tensor Core support: error: couldn't allocate input reg for constraint 'l'
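The device listing above can be summarised with a short script. This is only a sketch: `parse_devices` is a hypothetical helper written for this thread, not part of SAI, and the log excerpt is trimmed to the fields it reads. It suggests that device 0 (Hainan) is the only GPU, while devices 1 and 2 are the same CPU exposed through the AMD and Intel OpenCL platforms.

```python
import re

# Excerpt of the device listing printed above (trimmed to the fields used here).
LOG = """\
Device ID:     0
Device name:   Hainan
Device type:   GPU
Device score:  1112
Device ID:     1
Device name:   Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz
Device type:   CPU
Device score:  512
Device ID:     2
Device name:   Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz
Device type:   CPU
Device score:  512
"""


def parse_devices(log_text):
    """Return one dict per 'Device ID' block found in the log text."""
    devices = []
    for line in log_text.splitlines():
        match = re.match(r"Device (\w+):\s+(.*)", line)
        if not match:
            continue  # skip platform lines and anything else
        key, value = match.group(1).lower(), match.group(2).strip()
        if key == "id":
            devices.append({"id": int(value)})  # a new device block starts
        elif devices:
            devices[-1][key] = value  # attach the field to the current device
    return devices


devices = parse_devices(LOG)
for dev in devices:
    print(dev["id"], dev["type"], dev["name"])
```

Read this way, `--gpu=1` and `--gpu=2` select OpenCL CPU devices rather than a second GPU.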

When I run sai with --gpu=1, it works, but freezes after a while:

Half precision compute support: No.
Tensor Core support: No.

Started OpenCL SGEMM tuner.
Will try 290 valid configurations.
(1/290) KWG=32 KWI=2 MDIMA=16 MDIMC=16 MWG=64 NDIMB=16 NDIMC=16 NWG=32 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=2 340.4120 ms (1.0 GFLOPS)
(2/290) KWG=16 KWI=2 MDIMA=8 MDIMC=8 MWG=64 NDIMB=16 NDIMC=16 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=2 268.0297 ms (1.2 GFLOPS)
(7/290) KWG=32 KWI=8 MDIMA=16 MDIMC=16 MWG=64 NDIMB=8 NDIMC=8 NWG=64 SA=1 SB=1 STRM=0 STRN=0 TCE=0 VWM=4 VWN=4 235.4651 ms (1.4 GFLOPS)
l1t1 commented 4 years ago

After a long time, it finally works:

Loaded existing SGEMM tuning.
Wavefront/Warp size: 1
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 1024
Setting max tree size to 4301 MiB and cache size to 477 MiB.

Passes: 0            Black (X) Prisoners: 0
Black (X) to move    White (O) Prisoners: 0
                     Komi: 7.5

SAI: genmove b
Debug...
Thinking at most 36.0 seconds...
NN eval=0.478326. Agent eval=0.483747
cpus=10
Playouts: 1, Win: 27.31%, PV: C3
Playouts: 12, Win: 45.13%, PV: C3 D17
Playouts: 21, Win: 46.15%, PV: C17 R3
Playouts: 31, Win: 44.63%, PV: C17 R16 Q3
Playouts: 45, Win: 43.57%, PV: R3 Q17 C16 D3
Playouts: 57, Win: 43.29%, PV: R16 C17 D3 R4

  R3 ->       6 (V: 49.40%) (LCB: 15.10%) (N:  4.73%) (A: -0.4) PV: R3 Q17 C16 C4 R15
 R16 ->       7 (V: 51.45%) (LCB: 10.04%) (N:  4.06%) (A:  0.3) PV: R16 C17 D3 R4 P3
  R4 ->       5 (V: 55.40%) (LCB:  3.97%) (N:  3.87%) (A:  2.0) PV: R4 C4 D17 Q17
  D4 ->       6 (V: 45.11%) (LCB:  1.90%) (N:  4.24%) (A: -2.2) PV: D4 C17 R16 Q3 R5 C3
 C17 ->       7 (V: 45.67%) (LCB:  0.00%) (N:  5.06%) (A: -2.0) PV: C17 R16 Q3 C4 E3 D3
  C3 ->       7 (V: 47.78%) (LCB:  0.00%) (N:  5.10%) (A: -0.9) PV: C3 D17 R16 Q3 D16
 R17 ->       5 (V: 38.64%) (LCB:  0.00%) (N:  4.40%) (A: -4.8) PV: R17 C3 R4 D17
 D16 ->       5 (V: 43.72%) (LCB:  0.00%) (N:  4.54%) (A: -2.8) PV: D16 C3 R4 R16
  D3 ->       6 (V: 43.75%) (LCB:  0.00%) (N:  4.55%) (A: -4.1) PV: D3 R17 Q3 D17 Q16 R16
  C4 ->       6 (V: 48.03%) (LCB:  0.00%) (N:  4.05%) (A:  0.3) PV: C4 D17 Q3
 C16 ->       4 (V: 34.16%) (LCB:  0.00%) (N:  4.81%) (A: -7.3) PV: C16 R3 R16
 D17 ->       2 (V: 33.15%) (LCB:  0.00%) (N:  4.74%) (A: -7.3) PV: D17 R3
3.9 average depth, 7 max depth
42 non leaf nodes, 1.57 average children
67 visits, 23995 nodes, 66 playouts, 1 n/s

= R3

NN eval=0.453788. Agent eval=0.465327
l1t1 commented 4 years ago

Using autogtp -u 1, it freezes again:

E:\sai>autogtp.exe --url http://sai.unich.it/ --username "a" --password b  -u 1
AutoGTP v18
Using 1 game thread(s) per device.
Starting tuning process, please wait...
net: 095eda73769ee81664e54b28a0285b36dad71c9c9ed0a1e56d159803781d9db7.
./sai --batchsize=5 --tune-only -w networks/095eda73769ee81664e54b28a0285b36dad71c9c9ed0a1e56d159803781d9db7.gz --gpu=1
SAI 0.17 release 4 (19x19) is a fork of Leela Zero.
Leela Zero Copyright (C) 2017-2019  Gian-Carlo Pascutto and contributors.
SAI Copyright (C) 2018-2019 SAI Team.
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; see the COPYING file for details.

Using OpenCL batch size of 5
Using 10 thread(s).

Found SAI Version : 0.17
Tuning process finished
Starting thread 1 on device 0
{
    "cmd": "selfplay",
    "hash": "095eda73769ee81664e54b28a0285b36dad71c9c9ed0a1e56d159803781d9db7",
    "hash_gzip_hash": "6a8f64c84ceb0cbc2fc8634a65ace83a1847ae18640ee3acff9d628379458bf8",
    "minimum_autogtp_version": "16",
    "minimum_leelaz_version": "0.15",
    "options": {
        "dumbpass": "false",
        "komi": "9",
        "lambda": "0.1",
        "noise": "true",
        "noise_value": "0.03",
        "other_options": "--nrsymm --restrict_tt --adv_features --chainlibs_feat --chainsize_feat --recordvisits --puct 1.0 --blunderthr 0.05 --randomtemp 1.0 --policy_temp 1.20",
        "playouts": "0",
        "randomcnt": "30",
        "resignation_percent": "5",
        "visits": "2400"
    },
    "options_hash": "942c47",
    "random_seed": "696779293399844104",
    "required_client_version": "16",
    "selfplay_id": "5df1cf7b56f7037a989ac41f"
}

Got new job: selfplay
net: 095eda73769ee81664e54b28a0285b36dad71c9c9ed0a1e56d159803781d9db7.
Engine has started.
time_settings 0 1 0
Starting GTP commands sent.
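
One way to tell a freeze like the one above from a slow start is to run the engine under a time limit. The following is a minimal sketch, not how autogtp itself works: `run_with_timeout` is a hypothetical helper, and the two commands are Python stand-ins so the example runs anywhere. Substituting a real sai command line and a suitable time limit is left as an assumption.

```python
import subprocess
import sys


def run_with_timeout(cmd, stdin_text, timeout_s):
    """Run cmd, feed it stdin_text, and report whether it exited in time."""
    try:
        done = subprocess.run(
            cmd,
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return True, done.stdout
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return False, ""


# Stand-in commands (assumptions for illustration, not real engine invocations).
responsive = [sys.executable, "-c", "print('= ok')"]
stuck = [sys.executable, "-c", "import time; time.sleep(60)"]

print(run_with_timeout(responsive, "", 10))
print(run_with_timeout(stuck, "", 1))
```

A process that never returns within the limit is reported as not finished, which separates a real hang from a long tuning run only if the limit is generous enough for the tuner.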
amato-gianluca commented 4 years ago

Are you using the GPU version under Windows? In my tests, the combination of Intel GPU + Windows + SAI GPU hangs very frequently, and I have no idea why (LeelaZ also hangs for me on the same configuration). By contrast, an Intel GPU on Linux works very well.

On Thu, Dec 12, 2019 at 10:18 AM l1t1 notifications@github.com wrote:

use autogtp -u 1, freeze again

l1t1 commented 4 years ago

leelaz 0.17 has the same problem:

E:\sai\leelaz>leelaz -w lz91.gz
Selected platform: AMD Accelerated Parallel Processing
Selected device: Hainan
with OpenCL 1.2 capability.
Half precision compute support: No.
Tensor Core support: error: couldn't allocate input reg for constraint 'l'

If I set the --cpu-only option, it works:

E:\sai\leelaz>leelaz -w lz91.gz --cpu-only
Using 2 thread(s).
RNG seed: 17880345583537492801
Leela Zero 0.17  Copyright (C) 2017-2019  Gian-Carlo Pascutto and contributors
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; see the COPYING file for details.

BLAS Core: Haswell
Detecting residual layers...v1...128 channels...6 blocks.
Initializing CPU-only evaluation.
Setting max tree size to 4360 MiB and cache size to 484 MiB.

Leela: netbench
 1600 evaluations in 39.41 seconds -> 40 n/s
=
l1t1 commented 4 years ago
E:\sai>sai -w networks/095eda73769ee81664e54b28a0285b36dad71c9c9ed0a1e56d159803781d9db7.gz --cpu-only
SAI: netbench
 1600 evaluations in 153.86 seconds -> 10 n/s
=

GPU 1 (device name: Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz, device driver: 1124.2 (sse2,avx)) is much slower:

E:\sai>sai -w networks/095eda73769ee81664e54b28a0285b36dad71c9c9ed0a1e56d159803781d9db7.gz --gpu=1

SAI: netbench
 1600 evaluations in 1112.89 seconds -> 1 n/s
=

GPU 2 (device name: Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz, device driver: 1.2):

E:\sai>sai -w networks/095eda73769ee81664e54b28a0285b36dad71c9c9ed0a1e56d159803781d9db7.gz --gpu=2

SAI: netbench
 1600 evaluations in 209.20 seconds -> 7 n/s
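
The n/s figures in the netbench lines are simply evaluations divided by seconds, and the printed value appears to be truncated to an integer. A small sketch of that arithmetic, using only the numbers quoted in this thread (the labels are mine):

```python
# (label, evaluations, seconds) copied from the netbench lines in this thread.
RUNS = [
    ("leelaz --cpu-only", 1600, 39.41),
    ("sai --cpu-only", 1600, 153.86),
    ("sai --gpu=1", 1600, 1112.89),
    ("sai --gpu=2", 1600, 209.20),
]


def rate(evaluations, seconds):
    """Network evaluations per second."""
    return evaluations / seconds


rates = {label: rate(n, s) for label, n, s in RUNS}
for label, value in rates.items():
    print(f"{label:18s} {value:6.2f} n/s")

# How many times faster sai --cpu-only is than sai --gpu=1.
slowdown = rates["sai --cpu-only"] / rates["sai --gpu=1"]
print(f"sai --cpu-only is about {slowdown:.1f}x faster than --gpu=1")
```

The leelaz run uses a different network (128 channels, 6 blocks) from the SAI runs (192 channels, 9 blocks), so only the three SAI figures are directly comparable with each other.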
l1t1 commented 4 years ago

Running autogtp -u 2:

E:\sai>autogtp.exe --url http://sai.unich.it/ --username "a" --password b  -u 2
AutoGTP v18

Got new job: selfplay
net: c8f5f28b5cad60276858e679ff2546a65ff17a0fc92b1c5ad07ec3fac5271d3b.
Engine has started.
time_settings 0 1 0
Starting GTP commands sent.
1 (B P17)