caymansimpson opened this issue 2 months ago
Also, I need to add back teampreview as a parameter (and test teampreview embedding), with documentation covering what I don't embed
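A minimal sketch of what that parameter could look like. The names (`Embedder`, `embed_teampreview`, `embed`) are assumptions, not the project's real API:

```python
class Embedder:
    """Hypothetical Embedder with a teampreview toggle (names are assumed)."""

    def __init__(self, embed_teampreview: bool = True):
        self.embed_teampreview = embed_teampreview

    def embed(self, battle_features, teampreview_features):
        # Concatenate teampreview features only when the flag is on;
        # anything deliberately not embedded should be listed in the docs.
        feats = list(battle_features)
        if self.embed_teampreview:
            feats.extend(teampreview_features)
        return feats
```

This keeps the teampreview path testable in isolation (run the same battle through both flag values and diff the outputs).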
Also clip large feature values and normalize them in the Embedder
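One way to do both in a single pass; the clip threshold here is an assumed hyperparameter, not a value from the project:

```python
def clip_and_normalize(features, clip=10.0):
    """Clip each feature to [-clip, clip], then scale into [-1, 1].

    A minimal sketch; clip=10.0 is an assumption.
    """
    return [max(-clip, min(clip, f)) / clip for f in features]
```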
Notes from reading:
- Need to include a representation of the opponent's public information.
- At decision time, threshold and discretize action probabilities to reduce mistakes.
- DeepNash learns teampreview and gameplay together; its authors critique CFR because it grows exponentially with the number of infosets.
- DREAM uses outcome sampling to scale to large games, but relies on importance sampling to remain unbiased (need to look up and learn more about both).
- Need a low regularization parameter to deal with stochasticity, since we may learn the wrong things due to chance.
- Need to understand how to encode last moves.
- Learn from their infra setup.
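The "threshold at decision time" note above could be sketched like this; `eps` is an assumed cutoff and the fallback behavior is my own choice, not something from the paper:

```python
def threshold_policy(probs, eps=0.05):
    """Zero out action probabilities below eps and renormalize.

    A sketch of decision-time thresholding; eps=0.05 is an assumption.
    """
    kept = [p if p >= eps else 0.0 for p in probs]
    total = sum(kept)
    if total == 0.0:
        # Everything fell below eps: fall back to the argmax action.
        best = max(range(len(probs)), key=lambda i: probs[i])
        kept = [0.0] * len(probs)
        kept[best] = 1.0
        return kept
    return [p / total for p in kept]
```

This prevents low-probability "blunder" actions from ever being sampled during evaluation.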
Also consider adding a large-vs-small model option for the Embedder as a parameter (tiering each feature) to simplify things and help with testing
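Feature tiering could be as simple as a tag per feature; the tier assignments and feature names below are invented for illustration:

```python
# Hypothetical tier map: the small model only sees "small"-tier features.
FEATURE_TIERS = {
    "hp_fraction": "small",
    "move_base_powers": "small",
    "full_boost_matrix": "large",
}

def select_features(features: dict, model_size: str = "small"):
    """Keep 'small'-tier features always; include 'large' tier only for the big model."""
    allowed = {"small"} if model_size == "small" else {"small", "large"}
    return {k: v for k, v in features.items() if FEATURE_TIERS.get(k) in allowed}
```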
Also add a battle representation of the last 10 turns
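A sliding window over per-turn embeddings, zero-padded before turn 10, is one way to do this; the class name and embedding dimension are assumptions:

```python
from collections import deque

class TurnHistory:
    """Keep embeddings of the last n turns, zero-padded for early turns (a sketch)."""

    def __init__(self, n=10, dim=4):
        self.n, self.dim = n, dim
        self.turns = deque(maxlen=n)  # oldest turns drop off automatically

    def push(self, turn_embedding):
        self.turns.append(list(turn_embedding))

    def as_input(self):
        # Always returns an n x dim matrix, padding the front with zeros.
        pad = [[0.0] * self.dim] * (self.n - len(self.turns))
        return pad + list(self.turns)
```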
Also do this with a learner/actor/replay-buffer setup. Look into poke-env to better understand whether we can plug and play
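The replay-buffer piece of that setup, sketched as a plain uniform ring buffer (capacity is an assumed default; a real learner/actor split would put this behind a queue or shared store):

```python
import random

class ReplayBuffer:
    """Minimal uniform-sampling ring buffer for a learner/actor setup (a sketch)."""

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self.storage = []
        self.pos = 0  # next slot to overwrite once full

    def add(self, transition):
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.pos] = transition
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        return random.sample(self.storage, min(batch_size, len(self.storage)))
```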
Set up a DQN example
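As a starting point for that example, here is the core DQN update with a linear Q-function in numpy. All dimensions and hyperparameters are toy assumptions, and the target-network sync schedule is omitted:

```python
import numpy as np

obs_dim, n_actions, gamma, lr = 4, 3, 0.99, 0.1

W = np.zeros((n_actions, obs_dim))  # online Q-function weights
W_target = W.copy()                 # target-network weights (sync omitted)

def q_values(weights, obs):
    return weights @ obs

def dqn_update(s, a, r, s_next, done):
    """One semi-gradient TD step on the row for the taken action (a sketch)."""
    global W
    target = r + (0.0 if done else gamma * q_values(W_target, s_next).max())
    td_error = target - q_values(W, s)[a]
    W[a] += lr * td_error * s
    return td_error
```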
Takes data from the Data Processor, creates inputs (e.g. embeds battles), computes values, and trains models with parameters. Needs to scale to ESCHER. Blocked on the Embedder and Data Processor
Also add tests to check whether there are any abnormally large features
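A test helper for that could be a simple magnitude scan; the function name and threshold are assumptions:

```python
def find_abnormal_features(feature_dict, limit=100.0):
    """Return names of features whose absolute value exceeds limit (a sketch)."""
    return sorted(k for k, v in feature_dict.items() if abs(v) > limit)
```

A unit test would then assert this returns an empty list for every embedded battle in a fixture set.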