Hi, love the project! I just saw your 102 million video on youtube and thought about what a "no-limit AI" really means in the context of NES tetris. Well, I dunno if you've taken a look at https://meatfighter.com/nintendotetrisai/#Picking_Tetriminos but it turns out that once you've figured out what your current RNG seed is (which can be done in a fairly small number of piece drops with a simple lookup table for all 32,767 possible starting seeds), you can simulate the entire future game outside of the game.
And once you can simulate it, you can also find branching paths towards optimal play. I assume pause-based RNG manip is completely off the table for you since you want this AI to play the game under the "rules" as if it's an impossibly talented human being, but minor rng manipulation is still possible based on how high up you place your pieces (ie, how many frames it takes for them to fall) and when you choose to complete your tetrises (since the frames it takes for line clears will advance the generator several--completely predictable--steps) and would be completely allowed for a human if they had a brain capable of understanding it.
Implementing this would be a major change to how the AI makes decisions, but would allow you to avoid more burns both by getting more line pieces (through minor rng manipulation, as determined by the AI's own decision trees) and by playing "riskier" (from the perspective of an outside observer), being able to stack higher when necessary under the knowledge of how much longer you have to wait for a future line piece.
Without pause-manipulation, it's theoretically impossible to completely eliminate all burns in all scenarios, but I think it wouldn't be impossible to make it to level 236+ with a single-digit number of burns in the entire game.
I wouldn't expect the AI to simulate every possible non-hole-creating position for every single piece, as that would obviously get exponentially large too quickly, but it could maintain a few trees that currently seem "most promising" based on the current placement algorithm, culling first and foremost whatever tree takes a burn in a time span in which any other tree found a tetris (and then obviously culling whatever trees become obsolete based on live piece placement, since you'll have to pick exactly one branch you're currently going with by the time you place a piece--though you'd be searching trees so far ahead that having a piece actually arrive at a branch that hasn't been culled to 1 option is something I don't yet know is realistic. Maybe if both branches allow you to continue playing perfectly, you'd have kept them both around? But at some point before that you should be picking and choosing arbitrary equal-value branches to cull simply to free up more memory for future branches).
I am not aware of potential desynchronization issues on NES hardware (ie, a frame failing to call the randomizer for some erratic reason, such as perhaps the slowdowns that happen in much later levels), but even if they do, you could fall back to the old behavior if a piece ends up being "wrong", and then start rebuilding trees after finding the "new" seed (which should only be 1 frame off in one direction or the other). Hell, NES hardware should be predictable enough that even if e.g. scores start clobbering the rng seed at higher values (which probably isn't the case), the AI could account for that deterministically as well.
Hi, love the project! I just saw your 102 million video on youtube and thought about what a "no-limit AI" really means in the context of NES tetris. Well, I dunno if you've taken a look at https://meatfighter.com/nintendotetrisai/#Picking_Tetriminos but it turns out that once you've figured out what your current RNG seed is (which can be done in a fairly small number of piece drops with a simple lookup table for all 32,767 possible starting seeds), you can simulate the entire future game outside of the game.
And once you can simulate it, you can also find branching paths towards optimal play. I assume pause-based RNG manip is completely off the table for you since you want this AI to play the game under the "rules" as if it's an impossibly talented human being, but minor rng manipulation is still possible based on how high up you place your pieces (ie, how many frames it takes for them to fall) and when you choose to complete your tetrises (since the frames it takes for line clears will advance the generator several--completely predictable--steps) and would be completely allowed for a human if they had a brain capable of understanding it.
Implementing this would be a major change to how the AI makes decisions, but would allow you to avoid more burns both by getting more line pieces (through minor rng manipulation, as determined by the AI's own decision trees) and by playing "riskier" (from the perspective of an outside observer), being able to stack higher when necessary under the knowledge of how much longer you have to wait for a future line piece.
Without pause-manipulation, it's theoretically impossible to completely eliminate all burns in all scenarios, but I think it wouldn't be impossible to make it to level 236+ with a single-digit number of burns in the entire game.
I wouldn't expect the AI to simulate every possible non-hole-creating position for every single piece, as that would obviously get exponentially large too quickly, but it could maintain a few trees that currently seem "most promising" based on the current placement algorithm, culling first and foremost whatever tree takes a burn in a time span in which any other tree found a tetris (and then obviously culling whatever trees become obsolete based on live piece placement, since you'll have to pick exactly one branch you're currently going with by the time you place a piece--though you'd be searching trees so far ahead that having a piece actually arrive at a branch that hasn't been culled to 1 option is something I don't yet know is realistic. Maybe if both branches allow you to continue playing perfectly, you'd have kept them both around? But at some point before that you should be picking and choosing arbitrary equal-value branches to cull simply to free up more memory for future branches).
I am not aware of potential desynchronization issues on NES hardware (ie, a frame failing to call the randomizer for some erratic reason, such as perhaps the slowdowns that happen in much later levels), but even if they do, you could fall back to the old behavior if a piece ends up being "wrong", and then start rebuilding trees after finding the "new" seed (which should only be 1 frame off in one direction or the other). Hell, NES hardware should be predictable enough that even if e.g. scores start clobbering the rng seed at higher values (which probably isn't the case), the AI could account for that deterministically as well.