PWhiddy / PokemonRedExperiments

Playing Pokemon Red with Reinforcement Learning
MIT License
6.99k stars, 644 forks

Windows 10 Random Crash After Unknown Amount of Attempts #168

Open luni-moon opened 9 months ago

luni-moon commented 9 months ago

Error Logs

Traceback (most recent call last):
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 314, in _bootstrap
    self.run()
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\site-packages\stable_baselines3\common\vec_env\subproc_vec_env.py", line 35, in _worker
    observation, reward, terminated, truncated, info = env.step(data)
  File "F:\PKMNRED AI\baselines\red_gym_env.py", line 227, in step
    self.save_and_print_info(step_limit_reached, obs_memory)
  File "F:\PKMNRED AI\baselines\red_gym_env.py", line 404, in save_and_print_info
    plt.imsave(
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\pyplot.py", line 2200, in imsave
    return matplotlib.image.imsave(fname, arr, **kwargs)
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\site-packages\matplotlib\image.py", line 1689, in imsave
    image.save(fname, **pil_kwargs)
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\site-packages\PIL\Image.py", line 2429, in save
    fp = builtins.open(filename, "w+b")
PermissionError: [Errno 13] Permission denied: 'session_4c76892e\\curframe_87945d84.jpeg'
1404250  pyboy.pyboy                    INFO     ###########################
1404250  pyboy.pyboy                    INFO     # Emulator is turning off #
1404250  pyboy.pyboy                    INFO     ###########################
Traceback (most recent call last):
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\multiprocessing\connection.py", line 317, in _recv_bytes
    nread, err = ov.GetOverlappedResult(True)
BrokenPipeError: [WinError 109] The pipe has been ended

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "F:\PKMNRED AI\baselines\run_baseline_parallel_fast.py", line 82, in <module>
    model.learn(total_timesteps=(ep_length)*num_cpu*5000, callback=CallbackList(callbacks))
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\site-packages\stable_baselines3\ppo\ppo.py", line 308, in learn
    return super().learn(
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py", line 259, in learn
    continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py", line 178, in collect_rollouts
    new_obs, rewards, dones, infos = env.step(clipped_actions)
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\site-packages\stable_baselines3\common\vec_env\base_vec_env.py", line 197, in step
    return self.step_wait()
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\site-packages\stable_baselines3\common\vec_env\vec_transpose.py", line 95, in step_wait
    observations, rewards, dones, infos = self.venv.step_wait()
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\site-packages\stable_baselines3\common\vec_env\subproc_vec_env.py", line 130, in step_wait
    results = [remote.recv() for remote in self.remotes]
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\site-packages\stable_baselines3\common\vec_env\subproc_vec_env.py", line 130, in <listcomp>
    results = [remote.recv() for remote in self.remotes]
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\multiprocessing\connection.py", line 255, in recv
    buf = self._recv_bytes()
  File "C:\Users\Lukas\AppData\Local\Programs\Python\Python310\lib\multiprocessing\connection.py", line 326, in _recv_bytes
    raise EOFError
EOFError
PS F:\PKMNRED AI\baselines>

Does this happen because the training hit a hard-coded constraint that ends the run? The command I used is python run_baseline_parallel_fast.py in the \baselines directory. This has happened twice in a row, and I've only tried it twice.

My PC Specs:

Processor   AMD Ryzen 9 5950X 16-Core Processor 3.98 GHz
Installed RAM   128 GB
GPU   MSI VENTUS 3X PLUS OC GeForce RTX 3080 12GB LHR 12 GB Video Card
System type 64-bit operating system, x64-based processor

xinpw8 commented 9 months ago

It tried ( fp = builtins.open(filename, "w+b") ) to access the file 'session_4c76892e\curframe_87945d84.jpeg' but couldn't.

I'm just gonna have GPT type this:

Directory Permissions: The directory where the script is trying to save the image file doesn't have write permissions for the user under which the Python script is running. This is common in environments with restricted user permissions or when directories are created by other users or systems.

File is Open: If the file or the directory it resides in is open in another program (for example, an image viewer or an editor locked the directory), the OS might prevent writing to it.

Path Does Not Exist: The directory session_4c76892e might not exist or might not be accessible due to permission issues or path errors.

Running as a Non-Administrator: If you're on Windows, running your script without administrator privileges might restrict access to certain paths.

How to Fix:

Check Directory Permissions: Ensure that the directory session_4c76892e exists and your user account has write permissions to it. You can check and modify permissions via the file properties dialog in Windows or using chmod on Unix-like systems.

Close Other Programs: Make sure no other programs are using the file or directory you're trying to write to.

Ensure Directory Exists: Before saving the file, ensure the directory exists. You can use Python's os or pathlib modules to check and create directories:

python

import os

dir_name = "session_4c76892e"
# exist_ok=True avoids a check-then-create race when several worker processes start at once
os.makedirs(dir_name, exist_ok=True)

Run as Administrator: Try running your script as an administrator to rule out permission issues.

Use Absolute Paths: Sometimes, using relative paths can lead to issues depending on where the script is executed from. Try specifying an absolute path to see if it resolves the issue.

File Path in open: The failing call opens the file in binary write mode ("w+b"), which is the normal mode for writing an image file, so the mode itself is not the problem. Ensure you're using the correct path and that the path string is properly formatted.

If these steps don't solve the issue, you might want to double-check the path you're trying to write to and ensure there are no typos or incorrect path separators (\ for Windows, / for Unix-like systems).

If you need further help, feel free to ask in the discord help channel! 😊 https://discord.com/invite/rCVvDykK

luni-moon commented 9 months ago

This doesn't make sense, as it ran for 5-10 minutes beforehand without fail, writing images and such, so I don't know. I didn't have any other programme open that could interfere, and the programme created the path itself, so it should definitely work. I'll join the Discord when I get up. The only thing I can think of is that it's on a non-C:\ drive.

PWhiddy commented 9 months ago

Hey! I have seen this before on Windows. One potential cause: if you have the image folder open in File Explorer, the Windows thumbnail-generating process can lock the file for a short time and occasionally cause that error. A few possible solutions: (a) make sure that folder isn't open while training is running, (b) wrap the file write in the env with a try/except to ignore the error, or (c) comment out or disable the screenshot file writing. Hope that helps!
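
For anyone landing here later, option (b) could look roughly like the sketch below. This is only an illustration, not code from the repo: save_with_retry and its retries/delay parameters are hypothetical names, and the idea of retrying a couple of times before giving up is an assumption layered on top of the plain try/except suggested above. It catches only PermissionError, so other failures still surface.

```python
import tempfile
import time
from pathlib import Path


def save_with_retry(write_fn, path, retries=3, delay=0.1):
    """Call write_fn(path), tolerating a temporarily locked file.

    Returns True if the write succeeded, False if every attempt raised
    PermissionError. Any other exception is not swallowed.
    (Hypothetical helper, not part of the repo.)
    """
    for attempt in range(retries):
        try:
            write_fn(path)
            return True
        except PermissionError:
            # Another process (e.g. a thumbnail generator) may hold the file briefly.
            if attempt < retries - 1:
                time.sleep(delay)
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "curframe_demo.jpeg"
        # Stand-in for the real image write, e.g. lambda p: plt.imsave(p, frame)
        ok = save_with_retry(lambda p: Path(p).write_bytes(b"fake image bytes"), target)
        print("saved" if ok else "skipped this frame")
```

In the env, the existing plt.imsave(...) call in save_and_print_info would be what gets passed in as write_fn. Losing an occasional debug screenshot does not affect training, so returning False and moving on keeps the worker process alive instead of taking down the whole run.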