UFRN-URNAI / urnai-tools

A modular Deep Reinforcement Learning library that supports multiple environments, made with Python 3.6.
Apache License 2.0

Create base class for models #75

Closed alvarofpp closed 3 months ago

alvarofpp commented 9 months ago

We need to have a base class for the DRL models that will be used by the agents.

Suggestions:

CinquilCinquil commented 6 months ago

Hello, I was thinking of having the class constructor receive an Algorithm and a Data Structure for the model, and of defining the following methods: learn, choose_action, get_action_values and get_state_values.

Am I on the right track?
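
For illustration only, the interface proposed in this comment could be sketched as an abstract base class. The class name ModelBase, the constructor parameter names, and every method signature below are assumptions made for the sketch; the thread only names the four methods.

```python
from abc import ABC, abstractmethod
from typing import Any


class ModelBase(ABC):
    """Hypothetical base class; names and signatures are illustrative only."""

    def __init__(self, algorithm: Any, data_structure: Any) -> None:
        # Both arguments come from the proposal; their types are not specified.
        self.algorithm = algorithm
        self.data_structure = data_structure

    @abstractmethod
    def learn(self, state, action, reward, next_state, done: bool) -> None:
        """Update the stored learning data from one transition."""

    @abstractmethod
    def choose_action(self, state, actions):
        """Pick one of the available actions for the given state."""

    @abstractmethod
    def get_action_values(self, state):
        """Return the estimated value of each action in the given state."""

    @abstractmethod
    def get_state_values(self, state):
        """Return the estimated value of the given state."""
```

Because every method is abstract, ModelBase itself cannot be instantiated; a concrete model would have to implement all four.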

alvarofpp commented 6 months ago

The Model is already the algorithm, so I don't think it needs to receive one in its constructor. It should have a data structure to store the learning data, usually a dictionary.

If I remember correctly, the Agents will contain the states and actions, so there is no need to define them in the Model class. Consequently, I believe the choose_action function will live in the Agent and not in the Model.

The learn function makes sense to me.
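
A minimal sketch of the split described in this comment, assuming tabular Q-learning as the algorithm. That choice, along with the names TabularQModel, Agent, q_table, lr and gamma, is hypothetical and not taken from the library. The Model is the algorithm and keeps its learning data in a dictionary; the Agent owns choose_action.

```python
from collections import defaultdict


class TabularQModel:
    """Hypothetical model: it *is* the algorithm (tabular Q-learning here)."""

    def __init__(self, n_actions, lr=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.lr = lr        # learning rate
        self.gamma = gamma  # discount factor
        # Learning data kept in a dictionary: state -> list of action values.
        self.q_table = defaultdict(lambda: [0.0] * n_actions)

    def learn(self, state, action, reward, next_state, done):
        # Standard one-step Q-learning update.
        target = reward
        if not done:
            target += self.gamma * max(self.q_table[next_state])
        current = self.q_table[state][action]
        self.q_table[state][action] = current + self.lr * (target - current)


class Agent:
    """Hypothetical agent: holds the model and does the action selection."""

    def __init__(self, model):
        self.model = model

    def choose_action(self, state):
        # Greedy choice over the values the model has learned so far.
        values = self.model.q_table[state]
        return values.index(max(values))
```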

@UFRN-URNAI/urnai Does anyone else have anything to add here?

RickFqt commented 6 months ago

I also believe choose_action will be in the Agent, which will pass the current state and actions to the action choice strategy class in issue #74, so that it can make a decision. Is that right?

This also raises a question for me: how is the Model class connected to the Agent and to the Action Choice Strategy class? To just one of them, or to both?

alvarofpp commented 6 months ago

I believe that Model and Strategy have the same role here. We'll take this to the next meeting and see what everyone thinks, but I think Strategy will be Model.

RickFqt commented 6 months ago

We discussed in our meeting that the choose_action method should be implemented inside the Strategy class, and that Model should be a separate component, more closely related to the "memory" of the agent. The Strategy class may or may not use the memory stored in Model to choose an action. It can, for example, choose an action completely at random.
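
As a rough sketch of that separation, assuming hypothetical names (Model, ActionChoiceStrategy, RandomStrategy, GreedyStrategy, get_action_value) that are not confirmed by the library: the Model only stores values, one strategy ignores it entirely, and another reads from it.

```python
import random
from abc import ABC, abstractmethod


class Model:
    """Hypothetical 'memory' of the agent: stores learned action values."""

    def __init__(self):
        self.values = {}  # (state, action) -> estimated value

    def get_action_value(self, state, action):
        # Unseen (state, action) pairs default to 0.0.
        return self.values.get((state, action), 0.0)


class ActionChoiceStrategy(ABC):
    """Hypothetical strategy interface; the model argument is optional."""

    @abstractmethod
    def choose_action(self, state, actions, model=None):
        """Return one element of `actions`."""


class RandomStrategy(ActionChoiceStrategy):
    def choose_action(self, state, actions, model=None):
        # Does not consult the model at all.
        return random.choice(actions)


class GreedyStrategy(ActionChoiceStrategy):
    def choose_action(self, state, actions, model=None):
        # Uses the memory stored in the model to rank the actions.
        return max(actions, key=lambda a: model.get_action_value(state, a))
```

Under this layout an Agent would hold both objects and delegate, for example strategy.choose_action(state, actions, model).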

The load and save methods, currently present in Model, were also discussed. There should be a separate class specifically for them, as referenced in issue #81.
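
One possible shape for such a class, sketched with the standard pickle module. The name PicklePersistence and its interface are assumptions for illustration; what issue #81 actually specifies is not shown in this thread.

```python
import pickle
from pathlib import Path


class PicklePersistence:
    """Hypothetical helper: saves and loads any picklable object."""

    def __init__(self, path):
        self.path = Path(path)

    def save(self, obj):
        # Serialize the object (e.g. a model's learning data) to disk.
        with self.path.open("wb") as f:
            pickle.dump(obj, f)

    def load(self):
        # Only load files from trusted sources: unpickling can run code.
        with self.path.open("rb") as f:
            return pickle.load(f)
```

With this split, Model would no longer need its own load and save; an Agent or a training script would pass the model's data to the persistence object instead.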