hzwer / ICCV2019-LearningToPaint

ICCV2019 - Learning to Paint With Model-based Deep Reinforcement Learning
MIT License

Divide parameter and k=5 #58

Closed. psalomonr closed this issue 2 years ago

psalomonr commented 2 years ago

Hello :)

I have a few questions.

I have seen that the algorithm defines a "divide" parameter, which splits the canvas into mini canvases to improve the agent's accuracy. I would like to understand when this happens during training and what the steps are. When the actor is about to make a stroke, is the canvas divided and then reconstructed?

I have also seen that for each state the actor performs 5 actions (brush strokes). I understand that the discriminator gives the reward to the actor, but what about the critic? Is Q updated for each of the five actions?

Thank you very much in advance

hzwer commented 2 years ago

Hello. With an action bundle, for example k=5, we treat five strokes as one action. The critic also regards the five strokes as a single action when predicting Q.
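To make the action-bundle answer concrete, here is a minimal sketch. It is not the repository's code. It assumes each stroke is described by 13 parameters (an illustrative assumption), and `split_bundle` and `toy_critic` are hypothetical helpers. The real critic is a neural network; the placeholder below only shows that the critic receives the whole bundle and returns one Q value, not five.

```python
K = 5            # strokes per action bundle (k=5 in the answer above)
STROKE_DIM = 13  # assumed number of parameters per stroke, for illustration only


def split_bundle(action, k=K, stroke_dim=STROKE_DIM):
    """Split one flat bundled action into k per-stroke parameter lists."""
    if len(action) != k * stroke_dim:
        raise ValueError("action length must equal k * stroke_dim")
    return [action[i * stroke_dim:(i + 1) * stroke_dim] for i in range(k)]


def toy_critic(state, action):
    """Hypothetical stand-in for the critic: one scalar for the whole bundle."""
    # Placeholder arithmetic; a real critic would be a learned network.
    return float(sum(state) + sum(action))


# The actor emits a single flat vector that encodes all five strokes.
action = [0.1] * (K * STROKE_DIM)

# The renderer draws the strokes one by one ...
strokes = split_bundle(action)

# ... but the critic scores the bundle once, so there is one Q per state.
q = toy_critic([0.0, 0.0, 0.0, 0.0], action)
```

Under this reading, a replay buffer would store one transition per bundle (state, bundled action, reward, next state), not one per stroke.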

psalomonr commented 2 years ago

Thanks so much! And what about the divide parameter? :)

hzwer commented 2 years ago

@psalomonr We did not mention the divide parameter in the paper. It is only used to make the demo look better. Dividing the canvas is a manual strategy applied at the inference stage.
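As a rough illustration of dividing a canvas at inference time, the sketch below splits a 2D grid into `divide` x `divide` patches and stitches them back together. It is not the repository's implementation. `split_canvas` and `merge_patches` are hypothetical names, and the canvas is a plain nested list, not an image tensor. In such a scheme each patch could be painted independently by the trained agent before merging.

```python
def split_canvas(canvas, divide):
    """Split an H x W grid into divide * divide patches, in row-major order."""
    h, w = len(canvas), len(canvas[0])
    if h % divide or w % divide:
        raise ValueError("canvas size must be divisible by divide")
    ph, pw = h // divide, w // divide
    patches = []
    for r in range(divide):
        for c in range(divide):
            rows = canvas[r * ph:(r + 1) * ph]
            patches.append([row[c * pw:(c + 1) * pw] for row in rows])
    return patches


def merge_patches(patches, divide):
    """Inverse of split_canvas: stitch row-major patches into one grid."""
    ph = len(patches[0])
    merged = []
    for r in range(divide):
        for y in range(ph):
            row = []
            for c in range(divide):
                row.extend(patches[r * divide + c][y])
            merged.append(row)
    return merged


# A 4 x 4 toy canvas whose cell values encode their position.
canvas = [[y * 4 + x for x in range(4)] for y in range(4)]
patches = split_canvas(canvas, 2)     # four 2 x 2 patches
restored = merge_patches(patches, 2)  # identical to the original canvas
```

Because the split and merge happen around the agent and not inside it, this kind of step needs no change to training.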

psalomonr commented 2 years ago

Thanks so much for your answers, @hzwer!