Closed nav-28 closed 1 year ago
Many thanks for this @nav-28 !
I'll try to run them. I have a question first: what is the use of the huge CSV files (for insufficient material, it seems)?
I'd leave the tests in `position_test.dart` because they run very fast. But yes, we should try to run the other perft tests on CI somehow.
Running the tests on CI sounds good, but we have to figure out how to run them, given the issue I mentioned above. Maybe delete or rename `dart_test.yaml` before running on CI.
Just curious, why do we have to exclude this from `dart test`?
> just curious, why do we have to exclude this from `dart test`?
If it's not excluded, it will run with all the other tests and take a long time. So I had to add the tag to exclude it from running. But currently there is no way to run excluded tests, due to a bug in `dart test`.
One thing I noticed is that the test case positions are EPD, not FEN, and I am using `parseFen` to set up the board. I looked at the EPD spec and didn't find much difference from FEN. I hope that isn't going to cause any problems.
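For reference, the main difference is that an EPD record carries only the first four FEN fields (piece placement, side to move, castling rights, en passant square), optionally followed by operations, and has no halfmove clock or fullmove number. If the missing counters ever turn out to matter, a small conversion could look like the sketch below. `epdToFen` is a hypothetical helper, not something in this PR, and whether `parseFen` already tolerates four-field input is not confirmed here.

```dart
/// Hypothetical helper: builds a six-field FEN from an EPD record by keeping
/// the four position fields and appending default counters. Any EPD
/// operations after the fourth field are dropped in this sketch.
String epdToFen(String epd, {int halfmoves = 0, int fullmoves = 1}) {
  final fields = epd.trim().split(RegExp(r'\s+'));
  if (fields.length < 4) {
    throw FormatException('EPD needs at least 4 fields', epd);
  }
  return '${fields.take(4).join(' ')} $halfmoves $fullmoves';
}

void main() {
  // Position taken from the align-ep case in this thread.
  print(epdToFen('8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3'));
  // prints: 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 0 1
}
```

Since perft only counts legal move sequences to a fixed depth, the default counters should not change the node counts.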
> If it's not excluded, it will run with all the other tests taking up much time. So had to add the tag to exclude it from running. But currently, there is no way to run excluded tests due to a bug in `dart test`
If we limit the node count to something small enough, I think it's okay. When we work locally, we only need to run the tests related to our work, so we don't have to run the perft tests all the time.
Okay, that makes sense. I can look into that
I ran the cases with fewer than 10 million nodes. These are the errors I got.
Expected: <197326>
Actual: <197322>
id: atomic-start
fen: rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -
depth: 4
nodes: 197326
package:test_api expect
test/perft_test.dart 50:11 main.<fn>.<fn>
To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Atomic Atomic atomic-start rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 4'
00:05 +35 -2: Atomic Atomic programfox-1 rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq - 4 [E]
Expected: <1434825>
Actual: <1434736>
id: programfox-1
fen: rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq -
depth: 4
nodes: 1434825
package:test_api expect
test/perft_test.dart 50:11 main.<fn>.<fn>
To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Atomic Atomic programfox-1 rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq - 4'
00:06 +38 -3: Atomic Atomic programfox-2 rn1qkb1r/p5pp/2p5/3p4/N3P3/5P2/PPP4P/R1BQK3 w Qkq - 4 [E]
Expected: <714499>
Actual: <714474>
id: programfox-2
fen: rn1qkb1r/p5pp/2p5/3p4/N3P3/5P2/PPP4P/R1BQK3 w Qkq -
depth: 4
nodes: 714499
package:test_api expect
test/perft_test.dart 50:11 main.<fn>.<fn>
To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Atomic Atomic programfox-2 rn1qkb1r/p5pp/2p5/3p4/N3P3/5P2/PPP4P/R1BQK3 w Qkq - 4'
00:19 +56 -4: Atomic Atomic shakmaty-bench rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq - 4 [E]
Expected: <1434825>
Actual: <1434736>
id: shakmaty-bench
fen: rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq -
depth: 4
nodes: 1434825
package:test_api expect
test/perft_test.dart 50:11 main.<fn>.<fn>
To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Atomic Atomic shakmaty-bench rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq - 4'
00:23 +65 -5: Crazyhouse zh-middlegame r1bqk2r/pppp1ppp/2n1p3/4P3/1b1Pn3/2NB1N2/PPP2PPP/R1BQK2R[] b KQkq - 4 [E]
Expected: <2083382>
Actual: <2081591>
id: zh-middlegame
fen: r1bqk2r/pppp1ppp/2n1p3/4P3/1b1Pn3/2NB1N2/PPP2PPP/R1BQK2R[] b KQkq -
depth: 4
nodes: 2083382
package:test_api expect
test/perft_test.dart 66:11 main.<fn>.<fn>
To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Crazyhouse zh-middlegame r1bqk2r/pppp1ppp/2n1p3/4P3/1b1Pn3/2NB1N2/PPP2PPP/R1BQK2R[] b KQkq - 4'
00:25 +151 -6: Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 1 [E]
Expected: <6>
Actual: <5>
id: align-ep
fen: 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3
depth: 1
nodes: 6
package:test_api expect
test/perft_test.dart 116:11 main.<fn>.<fn>
To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 1'
00:25 +151 -7: Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 2 [E]
Expected: <121>
Actual: <105>
id: align-ep
fen: 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3
depth: 2
nodes: 121
package:test_api expect
test/perft_test.dart 116:11 main.<fn>.<fn>
To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 2'
00:25 +151 -8: Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 3 [E]
Expected: <711>
Actual: <592>
id: align-ep
fen: 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3
depth: 3
nodes: 711
package:test_api expect
test/perft_test.dart 116:11 main.<fn>.<fn>
To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 3'
13:16 +4417 -8: Some tests failed.
Consider enabling the flag chain-stack-traces to receive more detailed exceptions.
For example, 'dart test --chain-stack-traces'.
So there's one for standard chess that fails at depth 1? I'll look into it.
Yeah, just one that fails for standard chess. Just pointing out that I had `ignoreImpossibleCheck` set to `true` for that test, as I was getting an `impossibleCheck` error for the position.
@nav-28 instead of having a `dart_test.yaml`, you can exclude tags on the command line:
`dart test -x perft`
That way we can separate tests easily, even on CI.
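For the command-line exclusion to work, the perft tests need to carry the tag in the test file itself. A minimal sketch of how that could look with `package:test` follows; the group name and the placeholder test body are illustrative only, not the actual contents of `perft_test.dart`.

```dart
// Sketch of a tagged test file; requires package:test as a dev dependency.
import 'package:test/test.dart';

void main() {
  // Every test in this group carries the `perft` tag, so
  // `dart test -x perft` skips it and `dart test -t perft` selects it.
  group('perft', () {
    test('placeholder case', () {
      // A real test would compare a computed perft count
      // with the expected node count for the position.
      expect(1 + 1, 2);
    });
  }, tags: 'perft');
}
```

With this layout the default local run can use `dart test -x perft`, and a separate CI step can run `dart test -t perft`.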
That's great. I will remove the YAML file. How do you want to separate the tests? Do you want all the perft tests to run on CI or just a subset of them? I tried keeping only the cases with fewer than 10 million nodes, and it took around 13 minutes to run on my MacBook.
Yes I think running a subset with a node number threshold is fine. How do you actually make that threshold?
Ideally we should have 2 perft test suites, one under 10M that we run on CI and the full one that we can try to launch at least once, on demand, to verify. (I have no idea how long it would run though...).
Node threshold logic could be added to the perft tests I guess?
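One way the threshold could work, as a rough sketch: keep each case's expected node count next to its depth, and filter out anything at or above a limit before registering the tests. `PerftCase`, `underThreshold`, and the `startpos-d6` id are illustrative names, not taken from this PR.

```dart
/// Hypothetical record for one perft expectation (names are illustrative).
class PerftCase {
  const PerftCase(this.id, this.fen, this.depth, this.nodes);
  final String id;
  final String fen;
  final int depth;
  final int nodes;
}

/// Keeps only the cases whose expected node count is below [maxNodes].
List<PerftCase> underThreshold(Iterable<PerftCase> cases, int maxNodes) =>
    cases.where((c) => c.nodes < maxNodes).toList();

void main() {
  const ciLimit = 10000000; // the 10M limit discussed in this thread
  const cases = [
    PerftCase('align-ep', '8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3', 1, 6),
    PerftCase('atomic-start',
        'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -', 4, 197326),
    // Standard chess start position at depth 6, well above the CI limit.
    PerftCase('startpos-d6',
        'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -', 6, 119060324),
  ];
  for (final c in underThreshold(cases, ciLimit)) {
    print('${c.id} depth ${c.depth}: ${c.nodes} nodes');
  }
}
```

The CI suite would then loop over `underThreshold(cases, ciLimit)`, while the full suite would loop over `cases` unfiltered.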
I added 2 files, `perft_test` and `full_perft_test`. The first one only runs the cases with fewer than 10 million nodes. I also updated the CI. Now the tests finish in under 2 minutes. I haven't run the full perft tests yet.
Great! Now we need to fix the lib to make tests pass ;)
So this is all good. I created 3 new issues related to perft tests that now fail. So we can comment them out in this PR to make the build pass and we can open new PRs per variant to fix the issues.
Also, I think the new perft suite covers everything that was in `position_test.dart`, so we can remove the perft tests there now. Only the initial-position perft for standard chess is missing from the new files.
Once we have that I can merge this PR.
Resolves #11
These tests won't run when `dart test` is run. Currently, to run the tests, we have to remove the exclude tag line from `dart_test.yaml`. This is due to an unresolved issue: https://github.com/dart-lang/test/issues/1764
After removing the tag, run the tests with `dart test -t perft test/perft_test.dart`.
I haven't run all the tests, but there were some errors in Atomic and Crazyhouse.
If someone can run all the tests and paste the results, that would be great.
Tests are taken from scalachess