lichess-org / dartchess

Dart chess library for native platforms
https://pub.dev/packages/dartchess
GNU General Public License v3.0
33 stars, 17 forks

Add perft tests #13

Closed nav-28 closed 1 year ago

nav-28 commented 1 year ago

Resolves #11

These tests won't run when dart test is run. Currently, to run them, we have to remove the exclude tag line from dart_test.yaml. This is due to an unresolved issue: https://github.com/dart-lang/test/issues/1764. After removing the tag, run the tests with dart test -t perft test/perft_test.dart

I haven't run all the tests, but there were some errors in Atomic and Crazyhouse.

If someone can run all the tests and paste the results, that would be great.

Tests are taken from scalachess
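
For context, a perft test counts the leaf positions reachable in exactly N plies and compares that count with a known value. Below is a minimal, generic sketch of that recursion. It is not dartchess's implementation: the `successors` callback is a hypothetical stand-in for "all positions reachable by one legal move".

```dart
/// Counts the leaf nodes of the move tree rooted at [position], exactly
/// [depth] plies deep.
///
/// [successors] is a hypothetical callback that returns every position
/// reachable from the given one by a single legal move.
int perft<P>(P position, int depth, Iterable<P> Function(P) successors) {
  // At depth 0 the position itself is the only leaf.
  if (depth == 0) return 1;
  var nodes = 0;
  for (final next in successors(position)) {
    nodes += perft(next, depth - 1, successors);
  }
  return nodes;
}
```

With a real move generator plugged in as `successors`, a wrong count at some depth means that a legal move is missing or an illegal one is generated somewhere in the tree.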

veloce commented 1 year ago

Many thanks for this @nav-28 !

I'll try to run them. I have a question first: what is the use of the huge csv files (for insufficient material, it seems)?

veloce commented 1 year ago

I'd leave the tests in position_test.dart because they run very fast. But yes we should try to run the other perft tests on CI somehow.

nav-28 commented 1 year ago

Running the tests on CI sounds good, but we have to figure out how to run them given the issue I mentioned above. Maybe delete or rename dart_test.yaml before running on CI.

lenguyenthanh commented 1 year ago

just curious, why do we have to exclude this from dart test?

nav-28 commented 1 year ago

> just curious, why do we have to exclude this from dart test?

If it's not excluded, it will run with all the other tests and take a long time. So I had to add the tag to exclude it from running. But currently there is no way to run excluded tests, due to a bug in dart test.

nav-28 commented 1 year ago

One thing I noticed is that the test case positions are EPD, not FEN, and I am using parseFen to set up the board. I looked at the EPD spec and didn't find much difference from FEN. I hope that's not going to cause any problems.
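
To illustrate the difference: an EPD record shares its first four fields with FEN (piece placement, side to move, castling rights, en passant square) but drops the halfmove clock and fullmove number, and may carry extra operations afterwards. The sketch below pads an EPD into a six-field FEN. `epdToFen` is a hypothetical helper, not part of dartchess, and it does not say whether parseFen actually needs the counters.

```dart
/// Converts an EPD record to a six-field FEN string by keeping the four
/// shared position fields and appending placeholder move counters.
///
/// Assumption: any EPD operations after the fourth field are ignored
/// rather than translated into counters.
String epdToFen(String epd, {int halfmoves = 0, int fullmoves = 1}) {
  final fields = epd.trim().split(RegExp(r'\s+'));
  if (fields.length < 4) {
    throw FormatException('EPD needs at least 4 fields', epd);
  }
  // Piece placement, side to move, castling rights, en passant square.
  final position = fields.take(4).join(' ');
  return '$position $halfmoves $fullmoves';
}
```

Variant-specific notation such as the Crazyhouse pocket (`[]`) lives inside the first field, so it passes through unchanged.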

lenguyenthanh commented 1 year ago

> If it's not excluded, it will run with all the other tests taking up much time. So had to add the tag to exclude it from running. But currently, there is no way to run excluded tests due to a bug in dart test

If we keep the node counts small enough, I think it's okay. When we work locally, we only need to run the tests related to our work, so we don't have to run the perft tests all the time.

nav-28 commented 1 year ago

Okay, that makes sense. I can look into that

nav-28 commented 1 year ago

I ran the tests with fewer than 10 million nodes. These are the errors I got:

```
  Expected: <197326>
    Actual: <197322>
  id: atomic-start
  fen: rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -
  depth: 4
  nodes: 197326

  package:test_api            expect
  test/perft_test.dart 50:11  main.<fn>.<fn>

To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Atomic Atomic atomic-start rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 4'
00:05 +35 -2: Atomic Atomic programfox-1 rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq - 4 [E]
  Expected: <1434825>
    Actual: <1434736>
  id: programfox-1
  fen: rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq -
  depth: 4
  nodes: 1434825

  package:test_api            expect
  test/perft_test.dart 50:11  main.<fn>.<fn>

To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Atomic Atomic programfox-1 rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq - 4'
00:06 +38 -3: Atomic Atomic programfox-2 rn1qkb1r/p5pp/2p5/3p4/N3P3/5P2/PPP4P/R1BQK3 w Qkq - 4 [E]
  Expected: <714499>
    Actual: <714474>
  id: programfox-2
  fen: rn1qkb1r/p5pp/2p5/3p4/N3P3/5P2/PPP4P/R1BQK3 w Qkq -
  depth: 4
  nodes: 714499

  package:test_api            expect
  test/perft_test.dart 50:11  main.<fn>.<fn>

To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Atomic Atomic programfox-2 rn1qkb1r/p5pp/2p5/3p4/N3P3/5P2/PPP4P/R1BQK3 w Qkq - 4'
00:19 +56 -4: Atomic Atomic shakmaty-bench rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq - 4 [E]
  Expected: <1434825>
    Actual: <1434736>
  id: shakmaty-bench
  fen: rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq -
  depth: 4
  nodes: 1434825

  package:test_api            expect
  test/perft_test.dart 50:11  main.<fn>.<fn>

To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Atomic Atomic shakmaty-bench rn2kb1r/1pp1p2p/p2q1pp1/3P4/2P3b1/4PN2/PP3PPP/R2QKB1R b KQkq - 4'
00:23 +65 -5: Crazyhouse zh-middlegame r1bqk2r/pppp1ppp/2n1p3/4P3/1b1Pn3/2NB1N2/PPP2PPP/R1BQK2R[] b KQkq - 4 [E]
  Expected: <2083382>
    Actual: <2081591>
  id: zh-middlegame
  fen: r1bqk2r/pppp1ppp/2n1p3/4P3/1b1Pn3/2NB1N2/PPP2PPP/R1BQK2R[] b KQkq -
  depth: 4
  nodes: 2083382

  package:test_api            expect
  test/perft_test.dart 66:11  main.<fn>.<fn>

To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Crazyhouse zh-middlegame r1bqk2r/pppp1ppp/2n1p3/4P3/1b1Pn3/2NB1N2/PPP2PPP/R1BQK2R[] b KQkq - 4'
00:25 +151 -6: Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 1 [E]
  Expected: <6>
    Actual: <5>
  id: align-ep
  fen: 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3
  depth: 1
  nodes: 6

  package:test_api             expect
  test/perft_test.dart 116:11  main.<fn>.<fn>

To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 1'
00:25 +151 -7: Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 2 [E]
  Expected: <121>
    Actual: <105>
  id: align-ep
  fen: 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3
  depth: 2
  nodes: 121

  package:test_api             expect
  test/perft_test.dart 116:11  main.<fn>.<fn>

To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 2'
00:25 +151 -8: Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 3 [E]
  Expected: <711>
    Actual: <592>
  id: align-ep
  fen: 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3
  depth: 3
  nodes: 711

  package:test_api             expect
  test/perft_test.dart 116:11  main.<fn>.<fn>

To run this test again: /Users/nav/development/flutter/bin/cache/dart-sdk/bin/dart test test/perft_test.dart -p vm --plain-name 'Chess Tricky align-ep 8/8/8/1k6/3Pp3/8/8/4KQ2 b - d3 3'
13:16 +4417 -8: Some tests failed.

Consider enabling the flag chain-stack-traces to receive more detailed exceptions.
For example, 'dart test --chain-stack-traces'.
```

veloce commented 1 year ago

So there's one for standard chess that fails at depth 1? I'll look into it.

nav-28 commented 1 year ago

Yeah, just one that fails for standard chess. Just pointing out that I had ignoreImpossibleCheck set to true for that test, as I was getting an impossibleCheck error for the position.

veloce commented 1 year ago

@nav-28 instead of having a dart_test.yaml you can exclude tags with command line:

dart test -x perft

that way we can separate tests easily, even on CI.

nav-28 commented 1 year ago

That's great. I will remove the YAML file. How do you want to separate the tests? Do you want all the perft tests to run on CI or just a subset of them? I tried limiting them to cases with fewer than 10 million nodes, and it took around 13 minutes to run on my MacBook.

veloce commented 1 year ago

Yes I think running a subset with a node number threshold is fine. How do you actually make that threshold?

Ideally we should have 2 perft test suites, one under 10M that we run on CI and the full one that we can try to launch at least once, on demand, to verify. (I have no idea how long it would run though...).

Node threshold logic could be added to the perft tests I guess?

nav-28 commented 1 year ago

I added 2 files, perft_test and full_perft_test. The first one only runs cases under 10 million nodes. I also updated the CI. Now the tests finish in under 2 minutes. I haven't run the full perft tests yet.
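
A node threshold like the one discussed here could be sketched as a simple filter over the test cases. This is an illustration only, not the code from the PR: `PerftCase` and `underThreshold` are hypothetical names, and the fields mirror the id, fen, depth and nodes values printed in the failure log above.

```dart
/// One perft expectation: a position, a search depth, and the node count
/// that a correct move generator should report.
class PerftCase {
  const PerftCase(this.id, this.fen, this.depth, this.nodes);
  final String id;
  final String fen;
  final int depth;
  final int nodes;
}

/// Keeps only the cases whose expected node count is below [maxNodes],
/// so a CI run can skip the expensive ones.
List<PerftCase> underThreshold(
  List<PerftCase> cases, {
  int maxNodes = 10000000,
}) =>
    cases.where((c) => c.nodes < maxNodes).toList();
```

The CI suite would then iterate over `underThreshold(allCases)`, while the on-demand suite iterates over the full list.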

veloce commented 1 year ago

Great! Now we need to fix the lib to make tests pass ;)

veloce commented 1 year ago

So this is all good. I created 3 new issues related to perft tests that now fail. So we can comment them out in this PR to make the build pass and we can open new PRs per variant to fix the issues.

Also I think the new perft suite covers everything that was in position_test.dart so we can remove perft tests there now. There is just the initial position perft for standard chess that is missing in the new files. Once we have that I can merge this PR.