policy-gradient Search Results

1000+ results
for policy-gradient


XuhanLiu/DrugEx #5

AttributeError: 'int' object has no attribute 'predict_proba…

When I run agent.py, there was an error and I didn't debug it. Could you give me some advice? Thank you. Traceback (most recent call last): File "agent.py", line 159, in main() Fi…

a919224757 updated 5 years ago
2
hjsuh94/score_po #2

Baselines & Selling Points

1. Why Model-Based? - It's possible to be more data efficient although model-free might have better asymptotic performance - Models allow easily injecting inductive biases 2. What about other ge…

hjsuh94 updated 1 year ago
1
GuessWhatGame/guesswhat #26

NotFoundError when running train_qgen_reinforce.py

Hi, In train_qgen_reinforce.py code, when I try to restore qgen in line 163 I am given the following error: NotFoundError (see above for traceback): Key qgen/rl_baseline/baseline_hidden/W not found …

n-askarian updated 6 years ago
2
mlflow/mlflow #13575

[BUG] mlflow.exceptions.RestException: BAD_REQUEST: (psycopg…

### Issues Policy acknowledgement - [X] I have read and agree to submit bug reports in accordance with the [issues policy](https://www.github.com/mlflow/mlflow/blob/master/ISSUE_POLICY.md) ### Where…

mcerman updated 5 days ago
4
huggingface/trl #2022

Negative Entropy in TRL PPOv2Trainer TLDR Example

### System Info - `transformers` version: 4.44.0 - Platform: Linux-5.4.0-162-generic-x86_64-with-glibc2.31 - Python version: 3.11.9 - Huggingface_hub version: 0.23.4 - Safetensors version: 0.4.…

RylanSchaeffer updated 1 month ago
3
epignatelli/helx #55

Implement DeepRL agents

`Agent`s are entities with a `sample_action` and `update` method, in potence. We exclude from the list exploration strategies and curricula. _Implement_ means either to produce new code from the pape…

epignatelli updated 1 year ago
1
adventuresinML/adventures-in-ml-code #27

Policy Gradient Issue: ValueError: Shapes (20, 1) and (20, 2…

Hi. The code [Code](https://github.com/adventuresinML/adventures-in-ml-code/blob/master/policy_gradient_reinforce_tf2.py ) is not working with this line: `loss = network.train_on_batch(states, discou…

danisch-khurshid-creator updated 4 years ago
1
rlcode/reinforcement-learning #79

Pong Policy Gradient-important error in the definition of th…

I tried to run Pong Policy Gradient for 2000 episodes on the original file with no results whatsoever. Then boosted reward for positive points (points scored by the learner(right side) to 20 and got t…

TomaszRem updated 6 years ago
1
qingwen-guan/writeups #150

Reading paper: A Policy Gradient Method with Variance Reduction for…

qingwen-guan updated 4 years ago
1
junxnone/tio #830

RL - PPO

# Reference - 07/2017 [Proximal policy optimization algorithms](https://arxiv.org/abs/1707.06347) # Brief - Based on policy gradient (PG)

junxnone updated 2 years ago
1
