nnstreamer / nntrainer

NNtrainer is Software Framework for Training Neural Network Models on Devices.

Apache License 2.0

DRL algorithm with api #2528

Open eightreal opened 8 months ago

eightreal commented 8 months ago

Hello, dear contributors. I noticed that the DQN application doesn't use the API .h file, and that only predefined loss functions exist. I would like to develop a DQN method, so could you please confirm the following?

Is there an interface or method to customize the loss function?
Can I copy the header file you used in Application / DRL, and if so, which release package should I use? nntrainer-devel?

Or do you have better advice?

taos-ci commented 8 months ago

:octocat: cibot: Thank you for posting issue #2528. The person in charge will reply soon.

myungjoo commented 8 months ago

Example: https://github.com/nnstreamer/nntrainer/blob/main/Applications/Custom/mae_loss.cpp
Yes, you can. A devel package is always recommended, too, if you want to set up a CI/CD system.
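
As its file name suggests, the linked example is a mean absolute error (MAE) loss. The math behind such a loss can be sketched in plain C++ as below. This is only an illustration under stated assumptions: `mae_forward` and `mae_backward` are hypothetical names, they operate on flat `std::vector<float>` buffers, and they are not nntrainer's layer interface. See the linked mae_loss.cpp for how a custom loss is actually plugged in.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not an nntrainer API): checks that both buffers match.
static void check_shapes(const std::vector<float> &pred,
                         const std::vector<float> &target) {
  if (pred.empty() || pred.size() != target.size())
    throw std::invalid_argument("pred and target must be non-empty and equal in size");
}

// Forward pass: mean of |pred - target| over all elements.
float mae_forward(const std::vector<float> &pred,
                  const std::vector<float> &target) {
  check_shapes(pred, target);
  float sum = 0.0f;
  for (std::size_t i = 0; i < pred.size(); ++i)
    sum += std::fabs(pred[i] - target[i]);
  return sum / static_cast<float>(pred.size());
}

// Backward pass: d(loss)/d(pred[i]) = sign(pred[i] - target[i]) / N.
// At pred[i] == target[i] the derivative is undefined; 0 is used here.
std::vector<float> mae_backward(const std::vector<float> &pred,
                                const std::vector<float> &target) {
  check_shapes(pred, target);
  const float scale = 1.0f / static_cast<float>(pred.size());
  std::vector<float> grad(pred.size(), 0.0f);
  for (std::size_t i = 0; i < pred.size(); ++i) {
    const float diff = pred[i] - target[i];
    if (diff > 0.0f)
      grad[i] = scale;
    else if (diff < 0.0f)
      grad[i] = -scale;
  }
  return grad;
}
```

For a DQN, the same forward/backward pattern would apply to whichever loss you choose, such as a squared or Huber loss on the temporal-difference error.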

eightreal commented 7 months ago

Another question. When I call the run interface and save the model, is the current training state (such as gradient information) saved as well? Is it possible to continue training after the model is loaded later?

EunjuYang commented 7 months ago

Hello! Thank you for your question. Here are my answers: First, you can save the model after training; however, saving the gradient information is not supported. Second, yes, it is possible to continue training after the model is loaded.

myungjoo commented 7 months ago

You can checkpoint and continue the training process, but that is not based on saved gradients. You can do epoch-based checkpointing (that's what most of nntrainer's mobile applications do), but I'm not sure about finer-grained checkpointing.
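
Epoch-based checkpointing can be sketched roughly as follows. This is a generic illustration, not code from nntrainer: `train_with_checkpoints`, the two callbacks, and the `checkpoint_epoch_N.bin` naming are all assumptions made for the example. In a real application the callbacks would wrap whatever training and model-saving calls you use.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical driver loop: trains epochs [start_epoch, total_epochs) and
// calls `save` each time the number of completed epochs is a multiple of
// `every`. Returns the checkpoint paths passed to `save`, in order.
std::vector<std::string> train_with_checkpoints(
    int start_epoch, int total_epochs, int every,
    const std::function<void(int)> &train_one_epoch,
    const std::function<void(const std::string &)> &save) {
  if (every <= 0)
    throw std::invalid_argument("every must be positive");

  std::vector<std::string> written;
  for (int epoch = start_epoch; epoch < total_epochs; ++epoch) {
    train_one_epoch(epoch);           // one full pass over the data
    const int completed = epoch + 1;  // epochs finished so far
    if (completed % every == 0) {
      const std::string path =
          "checkpoint_epoch_" + std::to_string(completed) + ".bin";
      save(path);                     // persist the model weights
      written.push_back(path);
    }
  }
  return written;
}
```

To resume, load the latest checkpoint and call the loop again with `start_epoch` set to the number of epochs already completed. As noted above, this restores the saved model but not gradient information.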

eightreal commented 7 months ago

OK, thanks for your reply. Another question: is there any method for copying a model and for a Polyak update?

myungjoo commented 7 months ago

For model copy, if there is no copy-constructor for model class and the default behavior does not do what you want, you may try "original.save()" and "cloned.load()".
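
The save-then-load approach to copying can be illustrated with a toy stand-in. `ToyModel` below is hypothetical and only mimics the idea of `original.save()` followed by `cloned.load()`; its file format and method signatures are assumptions for this sketch and do not reflect nntrainer's model class.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a model: just a flat weight buffer.
struct ToyModel {
  std::vector<float> weights;

  // Writes the element count followed by the raw float data.
  void save(const std::string &path) const {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out)
      throw std::runtime_error("cannot open " + path + " for writing");
    const std::uint64_t n = weights.size();
    out.write(reinterpret_cast<const char *>(&n), sizeof(n));
    out.write(reinterpret_cast<const char *>(weights.data()),
              static_cast<std::streamsize>(n * sizeof(float)));
    if (!out)
      throw std::runtime_error("failed to write " + path);
  }

  // Reads the file back; weights are replaced only if the read succeeds.
  void load(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in)
      throw std::runtime_error("cannot open " + path + " for reading");
    std::uint64_t n = 0;
    in.read(reinterpret_cast<char *>(&n), sizeof(n));
    if (!in)
      throw std::runtime_error("failed to read header of " + path);
    std::vector<float> loaded(static_cast<std::size_t>(n));
    in.read(reinterpret_cast<char *>(loaded.data()),
            static_cast<std::streamsize>(n * sizeof(float)));
    if (!in)
      throw std::runtime_error("failed to read weights from " + path);
    weights = std::move(loaded);
  }
};
```

After `original.save(path)` and `cloned.load(path)`, the two objects hold independent copies of the same weights, so further training of one does not affect the other.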

For Polyak update, it appears that the DQN application (or simple "reinforcement learning" app) has its own "custom" op. But I'm not too sure about this. I guess @jijoongmoon may answer this when he returns from his trip.

eightreal commented 6 months ago

Hello, I checked the reinforcement learning app. It updates the network by saving to a file and loading it back, not by a Polyak update. Could you help check this? And if there is an implementation of Polyak update, could you point me to its path and line of code?