-
Hello, I think this place in your code is wrong:
`cvpo-safe-rl/safe_rl/policy/cvpo.py`, lines 454-457:
def critic_loss():
obs, act, reward, obs_next, done = to_tensor(data['obs']), to_tens…
-
## Bug description
There are two standard ways to deal with variable-horizon environments: 1) turn them into fixed-length environments, or 2) turn them into infinite-horizon environments.
…
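For context, a minimal sketch (PyTorch; the `terminated`/`truncated` flag names are hypothetical and not taken from the CVPO code) of what option 2 usually implies for a critic target: bootstrap through time-limit truncations and cut the return only on true terminations.

```python
import torch

def td_target(reward: torch.Tensor,
              q_next: torch.Tensor,
              terminated: torch.Tensor,  # true environment termination
              truncated: torch.Tensor,   # time-limit cutoff
              gamma: float = 0.99) -> torch.Tensor:
    # Infinite-horizon view: only a true termination zeroes the bootstrap.
    # `truncated` is deliberately unused: a time-limit cutoff should still
    # bootstrap from Q(s', a') rather than be treated as a real "done".
    return reward + gamma * (1.0 - terminated.float()) * q_next
```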
-
Let's say I want to replace the reward of the last batch in the buffer. I tried:
```
buffer[len(buffer)-1].rew = new_reward
```
But it turns out the change didn't apply at all; maybe I did it wron…
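If this is Tianshou's `ReplayBuffer` (an assumption; the excerpt doesn't name the library), the likely cause is that indexing the buffer builds a fresh `Batch` out of the internal arrays, so mutating the result never writes back. A minimal sketch of the usual fix, writing through the underlying array (`buffer` and `new_reward` are the names from the question):

```python
# Assumption: buffer is a tianshou.data.ReplayBuffer that already holds data.
last = len(buffer) - 1

# buffer[last] copies values out of the internal storage into a new Batch,
# so this assignment only mutates that temporary copy:
buffer[last].rew = new_reward  # silently has no effect on the stored data

# Write through the buffer's underlying numpy array instead:
buffer.rew[last] = new_reward
assert buffer[last].rew == new_reward
```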
-
Note: this issue was created automatically with the bugzilla2github tool.
Original bug ID: BZ#3225
From: @JasonGross
Reported version: 8.5
CC: @cpitclaudel, @forestjulien, @Matafou
See also: #455…
-
-
```
Epoch #1: 0%| | 1/5000 [00:00
```
-
- [X] I have marked all applicable categories:
+ [ ] exception-raising bug
+ [X] RL algorithm bug
+ [ ] documentation request (i.e. "X is missing from the documentation.")
+ [ ] ne…
-
**High Level Description**
I want to determine the cause of the early termination of the training process.
**Desired SMARTS version**
0.6.1
**Operating System**
Ubuntu 20.04.5 LTS
**Prob…
-
### ❓ Question
It confuses me a lot that statistics of the discounted returns are used to rescale a different quantity, the raw reward. That seems to be the default choice for PPO. Is there any intuition for interpreting th…
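For reference, a minimal sketch of the scheme the question seems to describe (the common PPO default, as popularized by OpenAI Baselines' `VecNormalize`): keep a running discounted return and divide each reward by that return's standard deviation. The class and variable names here are hypothetical.

```python
class RewardScaler:
    """Divide rewards by the running std of the discounted return."""

    def __init__(self, gamma: float = 0.99, eps: float = 1e-8):
        self.gamma, self.eps = gamma, eps
        self.ret = 0.0                                # running discounted return
        self.count, self.mean, self.m2 = 0, 0.0, 0.0  # Welford's statistics

    def __call__(self, reward: float, done: bool) -> float:
        # The statistics are tracked over the discounted *return* ...
        self.ret = self.gamma * self.ret + reward
        self.count += 1
        delta = self.ret - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (self.ret - self.mean)
        std = (self.m2 / max(self.count - 1, 1)) ** 0.5
        if done:
            self.ret = 0.0
        # ... but what gets rescaled is the *reward* itself, which is
        # exactly the mismatch the question asks about.
        return reward / (std + self.eps)
```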
-
![cmd-20220904-1550-rp9](https://user-images.githubusercontent.com/9929511/188303314-f50a5cb1-8ccf-476c-97e2-eae92d3a7ea2.png)