aluminum-ice / pwnagotchi

(⌐■_■) - Deep Reinforcement Learning instrumenting bettercap for WiFi pwning.
https://pwnagotchi.ai/

[BUG] Not saving Brain #27

Open · do-ki opened 10 months ago

do-ki commented 10 months ago

Describe the bug: Pwnagotchi went into AI mode in just a few minutes, but upon checking, it does not save the brain in the /root directory.

To Reproduce: Steps to reproduce the behavior:

  1. Reboot and start pwnagotchi in AUTO mode.
  2. Wait until it enters AI mode.
  3. Check the /root directory to see if the brain is already there.

Expected behavior: brain.nn and brain.json should be in the /root directory.
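For reference, a quick way to confirm whether the files were written; a minimal sketch assuming the default /root save paths:

    # Minimal check for the expected brain files (default /root paths assumed).
    import os

    for path in ("/root/brain.nn", "/root/brain.json"):
        if os.path.exists(path):
            print(f"{path}: {os.path.getsize(path)} bytes")
        else:
            print(f"{path}: missing")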

Screenshots: [three screenshots attached]

I also found this error in the logs:

[2023-05-03 03:23:26,000] [INFO] [ai] learning for 50 epochs ...
[2023-05-03 03:23:26,032] [ERROR] [ai] error while training (could not broadcast input array from shape (428,) into shape (1,503))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/pwnagotchi/ai/train.py", line 177, in _ai_worker
    self._model.learn(total_timesteps=epochs_per_episode, callback=self.on_ai_training_step)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 263, in learn
    rollout = self.runner.run(callback)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/base_class.py", line 794, in runner
    self._runner = self._make_runner()
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 110, in _make_runner
    return A2CRunner(self.env, self, n_steps=self.n_steps, gamma=self.gamma)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 338, in __init__
    super(A2CRunner, self).__init__(env=env, model=model, n_steps=n_steps)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/runners.py", line 31, in __init__
    self.obs[:] = env.reset()
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 62, in reset
    self._save_obs(env_idx, obs)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 92, in _save_obs
    self.buf_obs[key][env_idx] = obs
ValueError: could not broadcast input array from shape (428,) into shape (1,503)
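The ValueError itself is a plain NumPy shape mismatch: the model's observation buffer was sized for 503 features, while the environment's reset() returned a 428-feature vector. A minimal standalone reproduction of the same failure:

    import numpy as np

    # Buffer shaped for a 503-feature observation (1 env x 503 features).
    obs_buffer = np.zeros((1, 503))

    # The environment, however, returns only 428 features. Raises:
    # ValueError: could not broadcast input array from shape (428,)
    # into shape (1,503)
    obs_buffer[:] = np.zeros(428)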

Environment:

aluminum-ice commented 10 months ago

I don't see this behavior. I don't have an RPi3A+ so it might be specific to that board.

do-ki commented 10 months ago

I will investigate more; this error seems to be the cause of the brain not saving (see the sketch after the log below).

[2023-05-04 13:06:51,433] [INFO] [ai] creating model ...
[2023-05-04 13:06:54,651] [INFO] [epoch 0] duration=00:00:48 slept_for=00:00:00 blind=0 sad=0 bored=0 inactive=1 active=0 peers=0 tot_bond=0.00 avg_bond=0.00 hops=0 missed=0 deauths=0 assocs=0 handshakes=0 cpu=25% mem=40% temperature=45C reward=-0.2
[2023-05-04 13:08:00,563] [INFO] [epoch 1] duration=00:01:05 slept_for=00:00:30 blind=1 sad=0 bored=0 inactive=2 active=0 peers=0 tot_bond=0.00 avg_bond=0.00 hops=0 missed=0 deauths=0 assocs=0 handshakes=0 cpu=26% mem=50% temperature=49C reward=-0.35
[2023-05-04 13:08:05,525] [INFO] [ai] model created:
[2023-05-04 13:08:05,526] [INFO]       gamma: 0.99
[2023-05-04 13:08:05,527] [INFO]       n_steps: 1
[2023-05-04 13:08:05,528] [INFO]       vf_coef: 0.25
[2023-05-04 13:08:05,528] [INFO]       ent_coef: 0.01
[2023-05-04 13:08:05,529] [INFO]       max_grad_norm: 0.5
[2023-05-04 13:08:05,530] [INFO]       learning_rate: 0.001
[2023-05-04 13:08:05,530] [INFO]       alpha: 0.99
[2023-05-04 13:08:05,531] [INFO]       epsilon: 1e-05
[2023-05-04 13:08:05,531] [INFO]       verbose: 1
[2023-05-04 13:08:05,532] [INFO]       lr_schedule: constant
[2023-05-04 13:08:06,662] [INFO] [ai] learning for 50 epochs ...
[2023-05-04 13:08:06,693] [ERROR] [ai] error while training (could not broadcast input array from shape (428,) into shape (1,503))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/pwnagotchi/ai/train.py", line 177, in _ai_worker
    self._model.learn(total_timesteps=epochs_per_episode, callback=self.on_ai_training_step)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 263, in learn
    rollout = self.runner.run(callback)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/base_class.py", line 794, in runner
    self._runner = self._make_runner()
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 110, in _make_runner
    return A2CRunner(self.env, self, n_steps=self.n_steps, gamma=self.gamma)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 338, in __init__
    super(A2CRunner, self).__init__(env=env, model=model, n_steps=n_steps)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/runners.py", line 31, in __init__
    self.obs[:] = env.reset()
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 62, in reset
    self._save_obs(env_idx, obs)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 92, in _save_obs
    self.buf_obs[key][env_idx] = obs
ValueError: could not broadcast input array from shape (428,) into shape (1,503)
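That reading fits the traceback: the error is raised inside learn(), so any code that would serialize the brain after training never runs. A simplified sketch of that pattern; the names are illustrative, not the actual pwnagotchi/ai/train.py code:

    # Illustrative sketch of a train-then-save worker; not the real
    # pwnagotchi/ai/train.py, just the failure pattern it suggests.
    def ai_worker(model, epochs_per_episode, brain_path="/root/brain.nn"):
        try:
            # If this raises (e.g. the shape-mismatch ValueError above),
            # control jumps straight to the except block ...
            model.learn(total_timesteps=epochs_per_episode)
        except Exception as e:
            print(f"[ai] error while training ({e})")
            return  # ... and the save below never executes.
        # Only reached after a successful training pass.
        model.save(brain_path)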
do-ki commented 10 months ago

The same error shows up, and the brain is not saved, on version 1.7.6. Hardware used: Raspberry Pi 3A+ (same WiFi chip as the 3B+).

[2023-05-03 03:21:15,465] [INFO] waiting for 10s on channel 6 ...
[2023-05-03 03:21:25,630] [INFO] CHANNEL 1
[2023-05-03 03:21:25,688] [INFO] 3 access points on channel 1
[2023-05-03 03:21:25,833] [INFO] sending association frame to <hidden> (c6:70:ab:d8:47:0c ) on channel 1 [0 clients], -70 dBm...
[2023-05-03 03:21:26,033] [INFO] [ai] model created:
[2023-05-03 03:21:26,039] [INFO]       gamma: 0.99
[2023-05-03 03:21:26,045] [INFO]       n_steps: 1
[2023-05-03 03:21:26,051] [INFO]       vf_coef: 0.25
[2023-05-03 03:21:26,057] [INFO]       ent_coef: 0.01
[2023-05-03 03:21:26,063] [INFO]       max_grad_norm: 0.5
[2023-05-03 03:21:26,069] [INFO]       learning_rate: 0.001
[2023-05-03 03:21:26,075] [INFO]       alpha: 0.99
[2023-05-03 03:21:26,081] [INFO]       epsilon: 1e-05
[2023-05-03 03:21:26,087] [INFO]       verbose: 1
[2023-05-03 03:21:26,093] [INFO]       lr_schedule: constant
[2023-05-03 03:21:26,263] [INFO] [ai] learning for 50 epochs ...
[2023-05-03 03:21:26,290] [ERROR] [ai] error while training (could not broadcast input array from shape (428,) into shape (1,503))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/pwnagotchi/ai/train.py", line 177, in _ai_worker
    self._model.learn(total_timesteps=epochs_per_episode, callback=self.on_ai_training_step)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 263, in learn
    rollout = self.runner.run(callback)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/base_class.py", line 794, in runner
    self._runner = self._make_runner()
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 110, in _make_runner
    return A2CRunner(self.env, self, n_steps=self.n_steps, gamma=self.gamma)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 338, in __init__
    super(A2CRunner, self).__init__(env=env, model=model, n_steps=n_steps)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/runners.py", line 31, in __init__
    self.obs[:] = env.reset()
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 62, in reset
    self._save_obs(env_idx, obs)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 92, in _save_obs
    self.buf_obs[key][env_idx] = obs
ValueError: could not broadcast input array from shape (428,) into shape (1,503)
C4rdsh4rk commented 10 months ago

I have the same error on the Pi Zero 2. I checked that brain.nn and brain.json were both there, rebooted, and it went into AUTO mode. The pwnlog shows it creates an AI and overwrites the already-trained version with a new one, starting training from scratch.

Also, "sudo touch /root/.pwnagotchi-auto" doesn't work. The file is gone after a reboot... I used a brand-new SD card to test.

aluminum-ice commented 10 months ago

> I have the same error on the Pi Zero 2. I checked that brain.nn and brain.json were both there, rebooted, and it went into AUTO mode. The pwnlog shows it creates an AI and overwrites the already-trained version with a new one, starting training from scratch.
>
> Also, "sudo touch /root/.pwnagotchi-auto" doesn't work. The file is gone after a reboot... I used a brand-new SD card to test.

It does work for me. If you look at the code that controls going into AUTO mode, it checks multiple items. One of those is the presence of that file; if it finds the file, the file is deleted, but the device will still boot into AUTO next time. I consistently see my 2W boot into AUTO once I create the file.
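In other words, the flag file is consumed on read, which is why it is gone after a reboot. A rough sketch of that consume-on-read pattern; hypothetical code, not the exact pwnagotchi source:

    import os

    AUTO_FLAG = "/root/.pwnagotchi-auto"  # the one-shot flag discussed above

    def should_boot_auto():
        # Hypothetical sketch: the flag's presence is only one of several
        # conditions checked, and it is deleted once seen.
        if os.path.exists(AUTO_FLAG):
            os.remove(AUTO_FLAG)
            return True
        # ... the remaining AUTO-mode conditions would be checked here.
        return False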

C4rdsh4rk commented 10 months ago

But the log still says it creates a new nn file, even though one is present. I will try to collect the installation steps to recreate this problem.

do-ki commented 10 months ago

Just tested 1.7.7 and I still encounter the same error; brain.nn and brain.json are also not saved in the /root directory. I think this is specific to the 3A+ model, since I don't encounter it on the Zero 2W. I have a 3B+ lying around; I'll try later to see whether I hit this error on that board, as they technically have the same WiFi chip. [screenshot attached]

MARPATdroid commented 5 months ago

The same bug presents in exactly the same way on a Pi 4... I have the build running on a Pi 3B right now to see what its output is. Will likely update later if the issue presents there.


[2024-02-11 19:14:13,525] [INFO] deleting /root/.pwnagotchi-recovery
[2024-02-11 19:14:13,532] [INFO] creating new websocket...
[2024-02-11 19:14:13,658] [INFO] [epoch 214] duration=00:00:00 slept_for=00:00:00 blind=0 sad=0 bored=0 inactive=1 active=0 peers=0 tot_bond=0.00 avg_bond=0.00 hops=0 missed=0 deauths=0 assocs=0 handshakes=0 cpu=20% mem=0% temperature=39C reward=-0.0009302325581395349
[2024-02-11 19:14:29,444] [INFO] [ai] creating model ...
[2024-02-11 19:14:57,906] [INFO] [epoch 215] duration=00:00:44 slept_for=00:00:30 blind=0 sad=0 bored=0 inactive=2 active=0 peers=0 tot_bond=0.00 avg_bond=0.00 hops=6 missed=0 deauths=0 assocs=0 handshakes=0 cpu=34% mem=10% temperature=39C reward=0.002433862433862434
[2024-02-11 19:15:25,089] [INFO] [ai] model created:
[2024-02-11 19:15:25,089] [INFO]       gamma: 0.99
[2024-02-11 19:15:25,089] [INFO]       n_steps: 1
[2024-02-11 19:15:25,089] [INFO]       vf_coef: 0.25
[2024-02-11 19:15:25,089] [INFO]       ent_coef: 0.01
[2024-02-11 19:15:25,090] [INFO]       max_grad_norm: 0.5
[2024-02-11 19:15:25,090] [INFO]       learning_rate: 0.001
[2024-02-11 19:15:25,090] [INFO]       alpha: 0.99
[2024-02-11 19:15:25,090] [INFO]       epsilon: 1e-05
[2024-02-11 19:15:25,090] [INFO]       verbose: 1
[2024-02-11 19:15:25,090] [INFO]       lr_schedule: constant
[2024-02-11 19:15:25,103] [INFO] [ai] learning for 50 epochs ...
[2024-02-11 19:15:25,105] [ERROR] [ai] error while training (could not broadcast input array from shape (428,) into shape (1,503))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/pwnagotchi/ai/train.py", line 177, in _ai_worker
    self._model.learn(total_timesteps=epochs_per_episode, callback=self.on_ai_training_step)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 263, in learn
    rollout = self.runner.run(callback)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/base_class.py", line 794, in runner
    self._runner = self._make_runner()
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 110, in _make_runner
    return A2CRunner(self.env, self, n_steps=self.n_steps, gamma=self.gamma)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 338, in __init__
    super(A2CRunner, self).__init__(env=env, model=model, n_steps=n_steps)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/runners.py", line 31, in __init__
    self.obs[:] = env.reset()
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 62, in reset
    self._save_obs(env_idx, obs)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 92, in _save_obs
    self.buf_obs[key][env_idx] = obs
ValueError: could not broadcast input array from shape (428,) into shape (1,503)
[2024-02-11 19:15:58,454] [INFO] [epoch 216] duration=00:01:00 slept_for=00:01:00 blind=0 sad=0 bored=0 inactive=3 active=0 peers=0 tot_bond=0.00 avg_bond=0.00 hops=7 missed=0 deauths=0 assocs=0 handshakes=0 cpu=0% mem=10% temperature=38C reward=0.002235023041474655
widhalmt commented 5 months ago

I have an RPi3 and I got similar output:

[2024-02-15 15:49:08,240] [ERROR] [ai] error while training (could not broadcast input array from shape (428,) into shape (1,503))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/pwnagotchi/ai/train.py", line 177, in _ai_worker
    self._model.learn(total_timesteps=epochs_per_episode, callback=self.on_ai_training_step)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 263, in learn
    rollout = self.runner.run(callback)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/base_class.py", line 794, in runner
    self._runner = self._make_runner()
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 110, in _make_runner
    return A2CRunner(self.env, self, n_steps=self.n_steps, gamma=self.gamma)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/a2c/a2c.py", line 338, in __init__
    super(A2CRunner, self).__init__(env=env, model=model, n_steps=n_steps)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/runners.py", line 31, in __init__
    self.obs[:] = env.reset()
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 62, in reset
    self._save_obs(env_idx, obs)
  File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 92, in _save_obs
    self.buf_obs[key][env_idx] = obs
ValueError: could not broadcast input array from shape (428,) into shape (1,503)
widhalmt commented 5 months ago

I dug a bit deeper into that error. Some people suspect it's due to the fact that some RPi3 boards support 5 GHz, and the extra channels break the AI calculation (a sketch of that mechanism follows below).

There were also some ideas about missing dependencies (not enforced during installation) or version mismatches. I tried everything I could find, but with no success. I still have the same error.
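That theory would account for the specific numbers: if part of the observation vector fed to the AI is built per WiFi channel, then hardware exposing a different channel set produces a different vector length than the one the model was sized for. A rough illustration with made-up feature counts, not pwnagotchi's actual featurizer:

    GLOBAL_FEATURES = 8         # hypothetical non-channel features
    FEATURES_PER_CHANNEL = 3    # hypothetical per-channel features

    def observation_size(num_channels):
        # Total observation length grows with the number of channels
        # the WiFi hardware reports.
        return GLOBAL_FEATURES + FEATURES_PER_CHANNEL * num_channels

    print(observation_size(14))  # 2.4 GHz-only board: 50 features
    print(observation_size(39))  # board that also lists 5 GHz channels: 125

    # A model whose buffers were sized for one length cannot ingest
    # observations of the other, which is exactly the kind of
    # "(428,) into (1,503)" broadcast error seen in the logs above.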