mrdbourke / pytorch-deep-learning

Materials for the Learn PyTorch for Deep Learning: Zero to Mastery course.
https://learnpytorch.io
MIT License

segmentation fault #978

Closed Luismbpr closed 2 weeks ago

Luismbpr commented 2 weeks ago

I have had this issue when trying to run the model locally in the terminal. It first appeared after training a model with more than 3 labels. I managed to get past it at the time, but I don't know what fixed it or what was causing the problem.

Then I trained another model with more labels and I am experiencing the same issue again. I would prefer not to re-train the model since it took a while.

When trying to run app.py in the terminal (after activating the virtual environment), it displays this error:

`zsh: segmentation fault  python app.py`

Has anyone had this issue before?
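
In case it helps with debugging: a segmentation fault from `python app.py` usually comes from a native extension (e.g. torch) rather than from Python code, so the standard-library `faulthandler` module can at least show where it happens. A minimal sketch, assuming the entry point is `app.py`:

```python
# Top of app.py -- minimal sketch, not the course's code.
# Enable faulthandler before importing torch so that a SIGSEGV prints a
# Python-level traceback instead of just "zsh: segmentation fault".
import faulthandler
faulthandler.enable()

import torch  # the crash often occurs during this import or during model loading
```

The same effect is available without editing the file by running `python -X faulthandler app.py`.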

OS: macOS Sonoma 14.5
Virtual environment(s): Several; the error occurs in all of them.

**Virtual Environment 01**
- Python 3.11.9
- CUDA/cuDNN version: None
- How you installed PyTorch (conda, pip, source): pip
- device: CPU
- gradio==4.36.1
- torch==2.3.1
- torchvision==0.18.1

**Virtual Environment 02**
- Python 3.12.3
- CUDA/cuDNN version: None
- How you installed PyTorch (conda, pip, source): pip
- device: CPU
- gradio==4.33.0
- torch==2.3.1
- torchvision==0.18.1

**Virtual Environment 03** (Google Colab, where the latest model was trained)
- Python 3.10.12
- CUDA/cuDNN version: None
- How you installed PyTorch (conda, pip, source): pip
- device: GPU (for training the model)
- torch==2.3.0+cu121
- torchvision==0.18.0+cu121

Since the model was trained on Python 3.10.12, should I create a new virtual environment with that Python version?
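
Before rebuilding the environment, it may be worth printing the local versions and comparing them against the training environment. A minimal sketch (the expected values are taken from the Colab environment listed above):

```python
# check_env.py -- minimal sketch for comparing the local runtime against the
# training environment (Python 3.10.12, torch 2.3.0, torchvision 0.18.0).
import sys
import torch
import torchvision

print("Python     :", sys.version.split()[0])
print("torch      :", torch.__version__)
print("torchvision:", torchvision.__version__)
```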

Update: Ran the app.py code and it seemed to work 'locally', apart from a few errors. 1) I will try to create a local virtual environment with the same Python version and run the model with it. 2) If that does not work, I will run all of the code on Colab.

Solution: The model was trained on Google Colab, where the Python version is 3.10.12. Creating a new local virtual environment with the same Python version as the training environment solved the issue.
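
For anyone hitting the same thing: a related way to make a trained model less sensitive to the Python/PyTorch version is to save and load only the `state_dict` rather than the whole pickled model, then rebuild the architecture in `app.py`. A minimal sketch, assuming a torchvision model and a placeholder weights file name (neither is taken from this issue):

```python
# Minimal sketch -- the model class and file name are placeholders, not from this issue.
import torch
import torchvision

model = torchvision.models.efficientnet_b0()               # rebuild the architecture locally
state_dict = torch.load("model.pth", map_location="cpu")   # load the saved weights onto the CPU
model.load_state_dict(state_dict)
model.eval()                                               # inference mode for the Gradio app
```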