-
The policy is given the last recurrent state from the replay buffer and isn't reset at episode boundaries. In my case, the number of updates is set to the episode length, so I've added `rollou…
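For context, here is a minimal sketch of the masking pattern many recurrent PPO implementations use to handle this (hypothetical code, not necessarily this repo's): the hidden state carried over from the buffer is multiplied by a per-step mask, so it is zeroed exactly where an episode ended.

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """Illustrative sketch: a GRU policy whose hidden state is reset by masks."""

    def __init__(self, obs_dim, hidden_dim):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden_dim)

    def forward(self, obs, hxs, masks):
        # obs:   (seq_len, batch, obs_dim)
        # hxs:   (1, batch, hidden_dim) -- last hidden state from the buffer
        # masks: (seq_len, batch, 1)    -- 0.0 on the step after a terminal state
        outputs = []
        for t in range(obs.size(0)):
            # Zeroing the hidden state here is what "resets" the policy at an
            # episode boundary; skipping it leaks state across episodes.
            hxs = hxs * masks[t].unsqueeze(0)
            out, hxs = self.gru(obs[t : t + 1], hxs)
            outputs.append(out)
        return torch.cat(outputs, dim=0), hxs
```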
-
Hi, first of all, thank you for sharing your code.
I've been trying to implement GAIL using expert demonstrations from your Google Drive. I used the hyper-parameters from gail_experts/readme and I …
-
While training, the number of frames used so far is computed as
`total_num_steps = (j + 1) * args.num_processes * args.num_steps`
Shouldn't this be multiplied by the number of stacked frames (def…
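For reference, this is the loop structure the formula assumes (a minimal sketch with assumed variable names): each update `j` collects `num_steps` transitions from each of `num_processes` parallel environments. Note that frame stacking changes only the shape of each observation, not the number of environment steps taken.

```python
# Minimal sketch of the counting the formula assumes (variable names assumed).
num_processes = 8   # parallel environments
num_steps = 128     # env steps collected per process per update

for j in range(10):
    # ... collect num_steps transitions in each of the num_processes envs ...
    total_num_steps = (j + 1) * num_processes * num_steps
    print(f"after update {j}: {total_num_steps} environment steps")
```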
-
Hi, how am I supposed to save expert demos in the PPO main script?
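One possible approach, as a hedged sketch (hypothetical code, not the repo's actual script, and the dict keys below are an assumption; check what the repo's expert-data loader actually expects): roll out the trained policy and save its trajectories with `torch.save`.

```python
import torch

def save_expert_demos(env, policy, num_episodes, path):
    """Hypothetical sketch: collect trajectories from a trained policy."""
    states, actions, rewards, lengths = [], [], [], []
    for _ in range(num_episodes):
        ep_states, ep_actions, ep_rewards = [], [], []
        obs, done = env.reset(), False
        while not done:
            with torch.no_grad():
                action = policy(torch.as_tensor(obs, dtype=torch.float32))
            obs_next, reward, done, _ = env.step(action.numpy())
            ep_states.append(obs)
            ep_actions.append(action.numpy())
            ep_rewards.append(reward)
            obs = obs_next
        states.append(ep_states)
        actions.append(ep_actions)
        rewards.append(ep_rewards)
        lengths.append(len(ep_states))
    # Assumed file layout; adjust keys to match the GAIL loader being used.
    torch.save({"states": states, "actions": actions,
                "rewards": rewards, "lengths": lengths}, path)
```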
-
**Description**:
The RL and IRL algorithms need tuning to perform well (especially the Adversarial ones). We need to put in some time to tune them and see if they can perform well if we want to use the…
-
Updates from:
- https://github.com/jacobhilton/deep_learning_curriculum (focus on transformers)
- Raschka book
1. Math prerequisites
Taking a derivative to find a point of minimum or maxim…
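As a quick refresher for that prerequisite, a standard worked example (illustrative, not from the original notes):

```latex
% Worked example: find the minimum of f(x) = x^2 - 4x + 1.
% Set the first derivative to zero, then check the second derivative.
\[
f(x) = x^2 - 4x + 1, \qquad
f'(x) = 2x - 4 = 0 \;\Rightarrow\; x = 2, \qquad
f''(x) = 2 > 0,
\]
% so x = 2 is a minimum, with f(2) = 4 - 8 + 1 = -3.
```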
-
I noticed that, across many implementations of actor-critic policies, the Rollout/Buffer/Trajectories object is inconsistent: some authors send the arrays to the device as tensors during insertio…
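To make the contrast concrete, here is a minimal sketch (hypothetical class and method names) of one of the two conventions: keep the buffer as NumPy arrays on the CPU and convert to device tensors only at sample time. The alternative is to allocate torch tensors on the device up front and write into them during insertion.

```python
import numpy as np
import torch

class RolloutBuffer:
    """Sketch: CPU-side NumPy storage, device conversion deferred to sampling."""

    def __init__(self, capacity, obs_dim, act_dim, device):
        self.obs = np.zeros((capacity, obs_dim), dtype=np.float32)
        self.actions = np.zeros((capacity, act_dim), dtype=np.float32)
        self.rewards = np.zeros(capacity, dtype=np.float32)
        self.device = device
        self.ptr = 0

    def insert(self, obs, action, reward):
        # Insertion stays on the CPU: plain NumPy writes, no device transfer.
        self.obs[self.ptr] = obs
        self.actions[self.ptr] = action
        self.rewards[self.ptr] = reward
        self.ptr += 1

    def sample(self, batch_size):
        # Only the sampled minibatch is moved to the device, as tensors.
        idx = np.random.randint(0, self.ptr, size=batch_size)
        to = lambda x: torch.as_tensor(x, device=self.device)
        return to(self.obs[idx]), to(self.actions[idx]), to(self.rewards[idx])
```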
-
Noting these down for the [NeurIPS BBO challenge](http://bbochallenge.com/leaderboard)
- idea 1: generate more suggestions and only send the top `n_suggestions` ranked by value (see the sketch after this list).
- idea 2: gener…
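A minimal sketch of idea 1 (function and parameter names are assumptions, not the challenge's API): oversample candidate points, score them with a surrogate model, and submit only the top `n_suggestions` by predicted value.

```python
import numpy as np

def suggest(surrogate, sample_candidate, n_suggestions, oversample=10):
    """Sketch of idea 1: oversample candidates, keep the best-scoring ones.

    surrogate:        callable mapping a candidate to a predicted value
    sample_candidate: callable drawing one random point from the search space
    """
    candidates = [sample_candidate() for _ in range(oversample * n_suggestions)]
    scores = np.array([surrogate(c) for c in candidates])
    best = np.argsort(scores)[-n_suggestions:]  # highest predicted value
    return [candidates[i] for i in best]
```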
-
#2
Goal:
- Extract expert trajectories from PPO/etc. / From Interaction dataset
- Build / debug system
- Testing environment
- Do tutorials
  - https://bark-simulator.readthedocs.io/en/latest/a…
-
## Bug description
The RewardNet `predict_processed` method requires the `state`, `action`, `next_state`, and `done` arguments, even when the network was trained using only `state` and `action`.
For example, the [BasicRewardN…
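A short reproduction of the mismatch (assuming the `imitation` library's API; argument and parameter names may differ across versions): the net is configured to ignore `next_state` and `done`, yet callers must still pass placeholders for them.

```python
import numpy as np
import gym
from imitation.rewards.reward_nets import BasicRewardNet

env = gym.make("CartPole-v1")
# Configure the net to use only state and action.
net = BasicRewardNet(
    env.observation_space,
    env.action_space,
    use_next_state=False,
    use_done=False,
)

obs = np.array([env.observation_space.sample()])
act = np.array([env.action_space.sample()])

# predict_processed still demands next_state and done, so we pass
# placeholder values even though this net never looks at them.
reward = net.predict_processed(
    state=obs,
    action=act,
    next_state=obs,                       # placeholder, unused by this net
    done=np.zeros(len(obs), dtype=bool),  # placeholder, unused by this net
)
```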