Watson52 closed this issue 2 years ago
Hi, does the stage2 agent use the DAgger method, as in LBC? Or is it just trained offline on the same dataset as the privileged agent?
It's just trained on the same dataset for this repo!
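For anyone comparing the two setups, here is a rough toy sketch of the difference. It is not this repo's code: the 1-D environment, `expert_action`, `fit_linear_policy`, and `rollout` are all hypothetical stand-ins. Offline training fits once on a fixed dataset, while DAgger keeps rolling out the student, relabels the states it visits with the expert, and refits on the aggregated data.

```python
import random


def expert_action(state):
    # Hypothetical privileged expert: push the state back toward 0.
    return -0.5 * state


def fit_linear_policy(dataset):
    # Least-squares fit of action = w * state (no bias term).
    num = sum(s * a for s, a in dataset)
    den = sum(s * s for s, _ in dataset)
    return num / den if den > 0 else 0.0


def rollout(policy_weight, start_state, steps, rng):
    # Run a linear policy in a toy noisy 1-D environment; return visited states.
    states = []
    state = start_state
    for _ in range(steps):
        states.append(state)
        state = state + policy_weight * state + rng.uniform(-0.1, 0.1)
    return states


def train_offline(dataset):
    # Offline setting: fit once on a fixed, pre-collected dataset.
    return fit_linear_policy(dataset)


def train_dagger(initial_dataset, iterations, rng):
    # DAgger-style loop: roll out the student, label the states it
    # visits with the expert, aggregate, and refit.
    dataset = list(initial_dataset)
    weight = fit_linear_policy(dataset)
    for _ in range(iterations):
        visited = rollout(weight, start_state=1.0, steps=20, rng=rng)
        dataset.extend((s, expert_action(s)) for s in visited)
        weight = fit_linear_policy(dataset)
    return weight, dataset


rng = random.Random(0)
# Fixed dataset collected by rolling out the expert itself.
expert_states = rollout(-0.5, 1.0, 20, rng)
fixed_dataset = [(s, expert_action(s)) for s in expert_states]

offline_weight = train_offline(fixed_dataset)
dagger_weight, dagger_dataset = train_dagger(fixed_dataset, iterations=3, rng=rng)
```

In this toy case the expert is linear, so both variants recover the same weight. The only difference shown is where the training states come from: the fixed dataset in the offline case, versus the student's own rollouts in the DAgger case.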