Open lkcao opened 2 years ago
I didn't see the chapters posted yet, but I did have a question. Could you discuss the costs of doing reinforcement learning versus modeling something through an experiment with people? Obviously, there are physical costs, the need to get people to participate, etc. But I was wondering about things someone should think about beyond that. Do you think that in the future there will be scenarios only modeled with reinforcement learning?
One question for reinforcement learning: What's the CPU/GPU cost for training multiple RL agents? Btw, can we get the link to these two chapters when they're made available, say, next quarter? Thanks :)
What are some of the main challenges in applying reinforcement learning?
General question about reinforcement learning and AI: although large language models and multimodal networks are blowing everyone's minds (for good reasons), it feels like Artificial General Intelligence (AGI), if ever achieved, will be achieved through some version of reinforcement learning because of its fixation on reward maximization. Would you bet on the idea that AGI can be achieved through RL/DL and gradient descent? What do you think makes RL different from task-specific, super powerful models in its potential for leading us to AGI, besides that fixation on rewards? Thanks!
A quick question related to reinforcement learning: it is relatively easy to imagine how to train a reinforcement learning model with only one agent, since there is only one decision-making process. I am wondering what a general way to configure a multi-agent reinforcement learning model would be. Would the agents choose their strategies simultaneously or in some order? Or does it depend on the problem at hand? If so, could you introduce some typical cases?
Since reinforcement learning is a process in which the model becomes stronger and stronger with the feedback it receives, I wonder whether it is just like the human brain: we practice and practice until we become familiar with something. If so, what is the difference between an animal's brain, a human brain, and a "computer brain"? Is it the learning speed? Is reinforcement learning a concept drawn from the evolution of our brains, or does it give us a way to think about the future of brain development (which may take thousands of years biologically)? I am really surprised to see its great potential not just to mimic the human brain but to advance human intelligence.
I am just curious whether reinforcement learning could perform well on object detection with a small sample of images.
I am interested in hearing more about using reinforcement learning in competitive video games.
In response to Baotong: OpenAI Five, the OpenAI team's Dota 2 system, is well known and worth looking at. Tencent researchers working on Honor of Kings also used reinforcement learning methods to retrain a macro-strategy agent.
Adding to what everyone has already asked, I guess I'd like to know what the financial cost of doing something like reinforcement learning is compared to some of the other methods.
When solving a dynamic programming problem, we usually set up a value function and iterate on it. An important issue is how to choose the state variables, or actions, of the value function; they are different from the choice variables of the dynamic problem, and they capture the knowledge that matters when making a decision. Choosing appropriate state variables in dynamic programming problems is an art. Therefore, I was wondering whether there are any rules of thumb for choosing them in social science questions.
I've seen complex agent-based models that account for complex behaviors in many social science domains, and I'm curious about how reinforcement learning can model human behaviors beyond bandit-like decision-making tasks (which are common in our field) at the 'psychological', individual level, in a way that accounts for subtle behavior patterns in real time and cares about individual differences (rather than group-level inferences).
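To make the question concrete, here is a minimal sketch of value function iteration on a toy "cake-eating" problem. Everything in it is a made-up illustration rather than anything from the readings: the single state variable (units of cake left), the action (units eaten now), the square-root utility, and the discount factor of 0.9 are all assumptions, and `value_iteration` is a hypothetical helper name.

```python
import math


def value_iteration(max_stock=5, beta=0.9, tol=1e-8, max_iter=10_000):
    """Value iteration for a toy deterministic cake-eating problem.

    State s  : units of cake left (0..max_stock)   [assumed state variable]
    Action c : units eaten this period (0..s)      [assumed choice variable]
    Reward   : sqrt(c); next state is s - c.
    """
    states = range(max_stock + 1)
    V = [0.0] * (max_stock + 1)  # initial guess for the value function

    for _ in range(max_iter):
        # Bellman update: best immediate reward plus discounted continuation value.
        new_V = [
            max(math.sqrt(c) + beta * V[s - c] for c in range(s + 1))
            for s in states
        ]
        diff = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if diff < tol:
            break

    # Greedy policy with respect to the converged value function.
    policy = [
        max(range(s + 1), key=lambda c: math.sqrt(c) + beta * V[s - c])
        for s in states
    ]
    return V, policy


if __name__ == "__main__":
    V, policy = value_iteration()
    for s, (v, c) in enumerate(zip(V, policy)):
        print(f"stock={s}  value={v:.3f}  eat={c}")
```

Here the stock of cake is enough to describe the problem, so one state variable suffices. My question is how to make that kind of choice when the setting is a social science problem where it is much less obvious what the decision-maker needs to keep track of.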
Post your question here about the orienting readings: “Reinforcement Learning” and “Deep Reinforcement Learning”, Thinking with Deep Learning, Chapters 15 & 16.