philtabor / Youtube-Code-Repository

Repository for most of the code from my YouTube channel

Issue #55

Open · Zaibali9999 opened this issue 1 year ago

Zaibali9999 commented 1 year ago

(array([-0.02680779,  0.00466264, -0.02511859, -0.04842809], dtype=float32), {})
Traceback (most recent call last):
  File "main.py", line 31, in <module>
    action, prob, val = agent.choose_action(observation)
  File "D:\AI\PPO\agent.py", line 41, in choose_action
    state = tf.convert_to_tensor([observation],dtype=tf.float32)
  File "C:\Users\Buster.conda\envs\PPO\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\Buster.conda\envs\PPO\lib\site-packages\tensorflow\python\framework\constant_op.py", line 102, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Can't convert non-rectangular Python sequence to Tensor.

Zaibali9999 commented 1 year ago

(array([ 0.0047165 , -0.04676152, -0.03735694, -0.0472385 ], dtype=float32), {})
D:\AI\PPO\torch\ppo_torch.py:137: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\utils\tensor_new.cpp:233.)
  state = T.tensor([observation], dtype=T.float).to(self.actor.device)
Traceback (most recent call last):
  File "D:\AI\PPO\torch\main.py", line 31, in <module>
    action, prob, val = agent.choose_action(observation)
  File "D:\AI\PPO\torch\ppo_torch.py", line 137, in choose_action
    state = T.tensor([observation], dtype=T.float).to(self.actor.device)
ValueError: expected sequence of length 4 at dim 2 (got 0)
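Both tracebacks start by printing a tuple of the form (array([...], dtype=float32), {}), which suggests that the value wrapped in [observation] is actually the (observation, info) pair returned by reset() in newer gym/gymnasium versions, not the bare NumPy array. A minimal snippet reproducing both failures under that assumption (the names here are illustrative, not the repo's code):

    import numpy as np
    import torch as T

    # On gym >= 0.26 / gymnasium, env.reset() returns (observation, info), not just the array.
    obs_and_info = (np.zeros(4, dtype=np.float32), {})

    # Wrapping the whole tuple gives a ragged nested sequence [(array_of_4, {})]:
    # torch raises "expected sequence of length 4 at dim 2 (got 0)",
    # and tf.convert_to_tensor raises "Can't convert non-rectangular Python sequence to Tensor."
    try:
        T.tensor([obs_and_info], dtype=T.float)
    except (ValueError, TypeError) as err:
        print(err)

    # Unpacking the array first produces the expected (1, 4) state tensor.
    observation = obs_and_info[0]
    state = T.tensor([observation], dtype=T.float)
    print(state.shape)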

Zaibali9999 commented 1 year ago

Same issue with the torch agent.

rafayaamirgull commented 1 year ago

Hi, I'm getting the same error:

  state = T.tensor([observation], dtype=T.float).to(self.actor.device)
Traceback (most recent call last):
  File "/home/rafay/RL/ReinforcementLearning/PolicyGradient/PPO/torch/main.py", line 33, in <module>
    action, prob, val = agent.choose_action(observation)
  File "/home/rafay/RL/ReinforcementLearning/PolicyGradient/PPO/torch/ppo_torch.py", line 142, in choose_action
    state = T.tensor([observation], dtype=T.float).to(self.actor.device)
ValueError: expected sequence of length 4 at dim 2 (got 0)

@Zaibali9999 if you have solved the issue, can you please help me out? @philtabor please guide.

Thanks

tuan124816 commented 1 month ago

I'm having the same issue. I'm looking into the output type and the library itself, since there might be a difference between versions. I will update my status if I find something new.

Update: I found that I need to change env.reset() to env.reset()[0], since it outputs a tuple and we need to access the NumPy array inside it:

(method) def reset(*, seed: int | None = None, options: dict[str, Any] | None = None) -> tuple[Any, dict[str, Any]]
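A minimal sketch of that fix in the training loop, assuming the Gymnasium-style API where reset() returns (observation, info) and step() returns five values; the loop below is illustrative, not the repo's exact main.py, and uses a random action as a stand-in for agent.choose_action(observation):

    import gymnasium as gym  # the same unpacking applies to gym >= 0.26

    env = gym.make("CartPole-v1")

    # reset() returns (observation, info); unpack it (or use env.reset()[0])
    # so only the NumPy array is passed on to the tensor conversion.
    observation, info = env.reset()

    done = False
    score = 0
    while not done:
        # observation is now a flat np.ndarray of shape (4,), so
        # T.tensor([observation], dtype=T.float) builds a (1, 4) tensor as intended.
        action = env.action_space.sample()  # stand-in for agent.choose_action(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        done = terminated or truncated
        score += reward

    print("score:", score)

Note that step() in these versions also changed: it returns terminated and truncated separately instead of a single done flag, so loops written against the old API need the `done = terminated or truncated` line as well.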