One way to tune the hyperparameters, such as piece values, piece tables, and possibly how much weight being in check should carry, is Reinforcement Learning: the AI plays against an engine that can search to a much greater depth (for example, one written in Rust or C). This way the AI could potentially learn tricks that help when playing against AIs that should completely destroy it.
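To make the idea a bit more concrete, here is a minimal sketch of how the tunable parameters could be packed into a flat vector that a learning loop can nudge and then unpack again. Everything in it is an assumption for illustration: the name `EvalParams`, the starting values, and the choice to leave the piece tables out for brevity.

```python
from dataclasses import dataclass, field

# Fixed ordering so the vector layout is stable (hypothetical choice).
PIECES = ("pawn", "knight", "bishop", "rook", "queen")


def _default_piece_values():
    # Placeholder starting values in centipawns, not tuned results.
    return {"pawn": 100.0, "knight": 320.0, "bishop": 330.0,
            "rook": 500.0, "queen": 900.0}


@dataclass
class EvalParams:
    """Hypothetical container for the values the training would adjust."""
    piece_values: dict = field(default_factory=_default_piece_values)
    check_weight: float = 50.0  # assumed bonus for giving check

    def to_vector(self):
        # Flatten into a list: piece values first, then the check weight.
        return [self.piece_values[p] for p in PIECES] + [self.check_weight]

    @classmethod
    def from_vector(cls, vec):
        # Inverse of to_vector: rebuild the parameters from a flat list.
        n = len(PIECES)
        return cls(piece_values=dict(zip(PIECES, vec[:n])), check_weight=vec[n])
```

A training step would then change the vector slightly, rebuild the parameters with `EvalParams.from_vector`, play some games, and keep the change if the reward improved.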
Though I don't think this will work too well, I do think it could at least help a bit. It also seems like a fun little experiment, and it gives me an excuse to write some more Rust (or potentially C).
One big issue here is how to reward the AI. It will obviously lose most games, if not all of them, so a reward based purely on winning and losing is pretty meaningless. Instead, I think it would be smart to reward the AI for "surviving" longer, i.e. playing more moves, and to punish it when it loses very quickly. It should also be rewarded more when it loses by a small difference in evaluation than when it loses by a huge one.
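The reward idea above could be sketched roughly like this. The function name, the weights, the caps (`max_moves`, `eval_scale`), and the exact way the evaluation gap is measured are all assumptions I made up for illustration, not settled design decisions.

```python
def shaped_reward(result, moves_played, eval_gap,
                  max_moves=200, eval_scale=2000.0):
    """Return a reward in [-1, 1] for one finished game.

    result:       "win", "draw" or "loss" from the learner's point of view
    moves_played: number of moves the game lasted
    eval_gap:     how far behind the learner was in evaluation (centipawns);
                  how exactly this is measured is left open here
    max_moves, eval_scale: hypothetical caps used for normalisation
    """
    if result == "win":
        return 1.0
    if result == "draw":
        return 0.0

    # Survival term: 0.0 for an instant loss, 1.0 at max_moves or beyond.
    survival = min(moves_played, max_moves) / max_moves

    # Closeness term: 1.0 for no gap, 0.0 for a gap of eval_scale or more.
    clipped_gap = min(max(eval_gap, 0.0), eval_scale)
    closeness = 1.0 - clipped_gap / eval_scale

    # A loss always scores below a draw: roughly between -1.0 and -0.1.
    return -1.0 + 0.45 * survival + 0.45 * closeness
```

With these made-up weights, a loss after 150 moves scores higher than a loss after 20 moves with the same evaluation gap, and a narrow loss scores higher than a crushing one of the same length, while every loss still scores below a draw.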
I don't think this will be too great of an improvement, but I think it could be very fun and educational.