Error in Colab training

SweetStripes74 commented 3 years ago

I'm using your annotated data and running training without changing any of the code, since I just wanted to check the steps, but I'm not getting any results with your sample data. A results folder appears in my Drive, but it is empty. Since I haven't changed anything, I'm not sure why this error occurs under step nine.

Frame will be saved in /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/
extracting frames from video...
processing /gdrive/sample_video.mp4
read failed!make sure that the video format is supported by cv2.VideoCapture
0% 0/300 [00:00<?, ?it/s]read frame failed!
0% 0/300 [00:00<?, ?it/s]
getting demo image:
CUDA_VISIBLE_DEVICES='0' python3 demo.py \
  --nClasses 4 \
  --indir /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/ \
  --outdir /gdrive/result_folder \
  --yolo_model_path /gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//backup/Trial/yolov3-mice_final.weights \
  --yolo_model_cfg /gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//cfg/yolov3-mice.cfg \
  --pose_model_path /gdrive/AlphaTracker/Tracking/AlphaTracker/train_sppe/exp/coco/Trial/model_10.pkl \
  --use_boxGT 0
Loading YOLO model..
not using ground truth box to do the eval.
Traceback (most recent call last):
  File "demo.py", line 60, in <module>
    det_loader = DetectionLoader(data_loader, batchSize=args.detbatch,use_boxGT=args.use_boxGT,gt_json=args.gt_json).start()
  File "/content/drive/My Drive/AlphaTracker/Tracking/AlphaTracker/dataloader.py", line 338, in __init__
    self.det_model.load_weights(opt.yolo_model_path)
  File "/content/drive/My Drive/AlphaTracker/Tracking/AlphaTracker/yolo/darknet.py", line 407, in load_weights
    fp = open(weightfile, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//backup/Trial/yolov3-mice_final.weights'

tracking pose:
python ./PoseFlow/tracker-general-fixNum-newSelect-noOrb.py \
  --imgdir /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/ \
  --in_json /gdrive/result_folder/alphapose-results.json \
  --out_json /gdrive/result_folder/alphapose-results-forvis-tracked.json \
  --visdir /gdrive/result_folder/pose_track_vis/ --vis 1\
  --image_format %s.png --max_pid_id_setting 2 --match 0 --weights 0 6 0 0 0 0 \
  --out_video_path /gdrive/result_folder/Trial_2_0_060000.mp4
Traceback (most recent call last):
  File "./PoseFlow/tracker-general-fixNum-newSelect-noOrb.py", line 215, in <module>
    with open(notrack_json) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/gdrive/result_folder/alphapose-results.json'

aneeshbal commented 3 years ago

Hi, thanks for reaching out again!

Can you confirm that there is a video called sample_video.mp4 directly under your My Drive folder in Google Drive?
Could you also attach the terminal output for the train.py step? It appears that the YOLO model was not saved, which may be another reason for the error.
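
Both checks can also be scripted in a Colab cell. The following is a minimal sketch, not part of AlphaTracker: the helper name `missing_paths` is hypothetical, and the paths are copied from the step 9 log in this thread, so they assume Drive is mounted at /gdrive.

```python
import os


def missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk, in order."""
    return [p for p in paths if not os.path.exists(p)]


# Paths copied verbatim from the step 9 log; adjust them if your setup differs.
REQUIRED = [
    "/gdrive/sample_video.mp4",
    "/gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//backup/Trial/yolov3-mice_final.weights",
    "/gdrive/AlphaTracker/Tracking/AlphaTracker/train_sppe/exp/coco/Trial/model_10.pkl",
]

if __name__ == "__main__":
    for path in missing_paths(REQUIRED):
        print("missing:", path)
```

Any path it prints needs to exist before step 9 can run. In particular, a missing .weights file would suggest that the YOLO training step did not finish.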

SweetStripes74 commented 3 years ago

1) The sample video is in the sample data folder on My Drive. I didn't see any instruction to take it out of that folder and put it in my base Drive.

2) I know this isn't what you're asking for, but this is the terminal output of step 7, which shows some other errors I noticed as well:

nvcc -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=[sm_50,compute_50] -gencode arch=compute_52,code=[sm_52,compute_52] -Iinclude/ -Isrc/ -DGPU -I/usr/local/cuda/include/ --compiler-options "-Wall -Wno-unused-result -Wno-unknown-pragmas -Wfatal-errors -fPIC -Ofast -DGPU" -c ./src/convolutional_kernels.cu -o obj/convolutional_kernels.o
nvcc fatal : Unsupported gpu architecture 'compute_30'
Makefile:92: recipe for target 'obj/convolutional_kernels.o' failed
make: *** [obj/convolutional_kernels.o] Error 1
Collecting package metadata (current_repodata.json): done
Solving environment: - The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

pytorch/linux-64::pytorch==1.4.0=py3.6_cuda10.1.243_cudnn7.6.3_0
pytorch/linux-64::torchvision==0.5.0=py36_cu10done
## Package Plan ##

environment location: /usr/local

added / updated specs:
- pytorch==1.4.0
- torchvision==0.5.0

The following packages will be downloaded:

package                    |            build
---------------------------|-----------------
ca-certificates-2021.1.19  |       h06a4308_1         118 KB
certifi-2020.12.5          |   py36h06a4308_0         140 KB
openssl-1.0.2u             |       h7b6447c_0         2.2 MB
pytorch-1.0.0              |py3.6_cuda9.0.176_cudnn7.4.1_1       498.6 MB  pytorch
torchvision-0.2.2          |             py_3          44 KB  pytorch
------------------------------------------------------------
                                       Total:       501.1 MB

The following packages will be REMOVED:

cudatoolkit-8.0-3

The following packages will be UPDATED:

ca-certificates 2019.1.23-0 --> 2021.1.19-h06a4308_1
certifi 2019.3.9-py36_0 --> 2020.12.5-py36h06a4308_0
openssl 1.0.2r-h7b6447c_0 --> 1.0.2u-h7b6447c_0

The following packages will be SUPERSEDED by a higher-priority channel:

torchvision pytorch/linux-64::torchvision-0.5.0-p~ --> pytorch/noarch::torchvision-0.2.2-py_3

The following packages will be DOWNGRADED:

pytorch 1.4.0-py3.6_cuda10.1.243_cudnn7.6.3_0 --> 1.0.0-py3.6_cuda9.0.176_cudnn7.4.1_1

Downloading and Extracting Packages
pytorch-1.0.0 | 498.6 MB | : 100% 1.0/1 [01:30<00:00, 90.58s/it]
certifi-2020.12.5 | 140 KB | : 100% 1.0/1 [00:00<00:00, 6.52it/s]
torchvision-0.2.2 | 44 KB | : 100% 1.0/1 [00:01<00:00, 1.12s/it]
openssl-1.0.2u | 2.2 MB | : 100% 1.0/1 [00:00<00:00, 5.99it/s]
ca-certificates-2021 | 118 KB | : 100% 1.0/1 [00:00<00:00, 15.62it/s]
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

3) Here is the terminal output for Step 8

training detector
train.sh: line 1: ./darknet: No such file or directory
training finished.

SweetStripes74 commented 3 years ago

After retraining and running tracking again with the video in the main folder, I believe I got exactly the same error (posted below).

Frame will be saved in /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/
extracting frames from video...
processing /gdrive/sample_video.mp4
100% 300/300 [01:05<00:00, 4.89it/s]
getting demo image:
CUDA_VISIBLE_DEVICES='0' python3 demo.py \
  --nClasses 4 \
  --indir /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/ \
  --outdir /gdrive/result_folder \
  --yolo_model_path /gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//backup/Trial/yolov3-mice_final.weights \
  --yolo_model_cfg /gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//cfg/yolov3-mice.cfg \
  --pose_model_path /gdrive/AlphaTracker/Tracking/AlphaTracker/train_sppe/exp/coco/Trial/model_10.pkl \
  --use_boxGT 0
Loading YOLO model..
not using ground truth box to do the eval.
Traceback (most recent call last):
  File "demo.py", line 60, in <module>
    det_loader = DetectionLoader(data_loader, batchSize=args.detbatch,use_boxGT=args.use_boxGT,gt_json=args.gt_json).start()
  File "/content/drive/My Drive/AlphaTracker/Tracking/AlphaTracker/dataloader.py", line 338, in __init__
    self.det_model.load_weights(opt.yolo_model_path)
  File "/content/drive/My Drive/AlphaTracker/Tracking/AlphaTracker/yolo/darknet.py", line 407, in load_weights
    fp = open(weightfile, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//backup/Trial/yolov3-mice_final.weights'

tracking pose:
python ./PoseFlow/tracker-general-fixNum-newSelect-noOrb.py \
  --imgdir /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/ \
  --in_json /gdrive/result_folder/alphapose-results.json \
  --out_json /gdrive/result_folder/alphapose-results-forvis-tracked.json \
  --visdir /gdrive/result_folder/pose_track_vis/ --vis 1\
  --image_format %s.png --max_pid_id_setting 2 --match 0 --weights 0 6 0 0 0 0 \
  --out_video_path /gdrive/result_folder/Trial_2_0_060000.mp4
Traceback (most recent call last):
  File "./PoseFlow/tracker-general-fixNum-newSelect-noOrb.py", line 215, in <module>
    with open(notrack_json) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/gdrive/result_folder/alphapose-results.json'

aneeshbal commented 3 years ago

I see the error now. It is primarily an error in the make step for YOLO: support for compute_30 appears to have been removed in newer CUDA versions, so the darknet binary was never built and the later steps had no weights to load. I will need to edit the code to account for that, and I will let you know when I have an updated version ready.
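
In the meantime, one possible workaround is to strip the unsupported architecture from the darknet Makefile before running make. This is a rough sketch rather than the eventual fix: the function name `drop_gencode_archs` is hypothetical, and it assumes the flags appear in the Makefile in the same `-gencode arch=compute_XX,code=...` form that the nvcc command in your log shows.

```python
import re


def drop_gencode_archs(makefile_text, archs=("30",)):
    """Remove '-gencode arch=compute_XX,code=...' flags for the given archs.

    Assumption: each flag is written as in the failing nvcc command line.
    Line-continuation backslashes are left in place.
    """
    for arch in archs:
        pattern = (
            r"-gencode\s+arch=compute_" + re.escape(arch) + r",code=[^\s\\]+[ \t]*"
        )
        makefile_text = re.sub(pattern, "", makefile_text)
    return makefile_text


# Example on a fragment shaped like the flags in the failing nvcc command.
sample = (
    "ARCH= -gencode arch=compute_30,code=sm_30 \\\n"
    "      -gencode arch=compute_35,code=sm_35 \\\n"
    "      -gencode arch=compute_50,code=[sm_50,compute_50]\n"
)
patched = drop_gencode_archs(sample)
print(patched)
```

To apply it, read the Makefile in the darknet directory, pass its text through the function, write the result back, and rerun the make step. If nvcc then rejects another architecture, add it to `archs`.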

Thanks!

SweetStripes74 commented 3 years ago

Gotcha; thank you!

ZexinChen / AlphaTracker

Error in Colab training #7
