Open KavitaHoude opened 10 months ago
Hello. Can you please provide the command that you are using?
Hello Sir, this is the command: !python train.py --model fasterrcnn_mobilenetv3_large_fpn --data data_configs/custom_data.yaml --epochs 10 --name fasterrcnn_mobilenetv3_large_fpn_noaug_40e --seed 42
Okay. If you are training on Colab and trying to save to Google Drive, please use the --project-dir argument instead of the --name argument for saving the project.
That did not work. I get the same error again.
Can you please let me know where the code files are? Are they being cloned to Colab, or are they somewhere on Google Drive? It may not work if they are on Google Drive.
It is cloned to Google Drive using the command !git clone https://github.com/sovit-123/fasterrcnn-pytorch-training-pipeline.git
Most probably it won't run from Google Drive. Please try cloning it directly to the Colab drive and running it from there.
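If the results still need to end up on Google Drive, one possible approach is to train on the local Colab disk and copy the finished run over afterwards. The snippet below is only a sketch: `copy_outputs` is a hypothetical helper, not part of the repository, and the example paths in the comments are assumptions about where the clone and the Drive mount live.

```python
import shutil
from pathlib import Path


def copy_outputs(local_dir: str, drive_dir: str) -> Path:
    """Copy a finished run directory into a Drive folder (hypothetical helper).

    Files are copied rather than linked, so the destination does not
    need to support symbolic links.
    """
    src = Path(local_dir)
    dst = Path(drive_dir) / src.name
    # symlinks=False copies the contents of any linked files instead of
    # recreating the links; dirs_exist_ok=True allows re-running the copy.
    shutil.copytree(src, dst, symlinks=False, dirs_exist_ok=True)
    return dst


# Example call with assumed paths (adjust to the actual run and Drive folders):
# copy_outputs(
#     "/content/fasterrcnn-pytorch-training-pipeline/outputs/training/my_run",
#     "/content/drive/MyDrive/Tree_Detect_Faster_RCNN",
# )
```

`dirs_exist_ok` requires Python 3.8 or newer; the traceback in this thread shows Python 3.10.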
I am getting the following error when trying to train the model on a custom dataset. The error occurs at the last epoch. Maybe it is a model saving error. Please suggest how to solve it.
SAVING BEST MODEL FOR EPOCH: 10
Traceback (most recent call last):
  File "/content/drive/MyDrive/Tree_Detect_Faster_RCNN/fasterrcnn-pytorch-training-pipeline/train.py", line 571, in <module>
    main(args)
  File "/content/drive/MyDrive/Tree_Detect_Faster_RCNN/fasterrcnn-pytorch-training-pipeline/train.py", line 566, in main
    wandb_save_model(OUT_DIR)
  File "/content/drive/MyDrive/Tree_Detect_Faster_RCNN/fasterrcnn-pytorch-training-pipeline/utils/logging.py", line 225, in wandb_save_model
    wandb.save(os.path.join(model_dir, 'best_model.pth'))
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/wandb_run.py", line 371, in wrapper_fn
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/wandb_run.py", line 361, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/wandb_run.py", line 1852, in save
    return self._save(glob_str, base_path, policy)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/wandb_run.py", line 1906, in _save
    os.symlink(abs_path, wandb_path)
OSError: [Errno 95] Operation not supported: '/content/drive/MyDrive/Tree_Detect_Faster_RCNN/fasterrcnn-pytorch-training-pipeline/outputs/training/fasterrcnn_mobilenetv3_large_fpn_noaug_40e/best_model.pth' -> '/content/drive/MyDrive/Tree_Detect_Faster_RCNN/fasterrcnn-pytorch-training-pipeline/wandb/offline-run-20240110_114154-ui6uadqd/files/outputs/training/fasterrcnn_mobilenetv3_large_fpn_noaug_40e/best_model.pth'
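The last frame of the traceback is `os.symlink` raising `[Errno 95] Operation not supported` while linking `best_model.pth` into the wandb run directory. That suggests the folder the code runs from does not allow symbolic links. A small probe like the one below can confirm this for a given folder. It is a sketch only: `supports_symlinks` is a hypothetical helper, and the two paths checked at the bottom are assumptions about a typical Colab session.

```python
import os
import tempfile


def supports_symlinks(directory: str) -> bool:
    """Return True if a symbolic link can be created inside `directory`."""
    # Work inside a throwaway subdirectory so nothing is left behind.
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        target = os.path.join(tmp, "target.txt")
        link = os.path.join(tmp, "link.txt")
        with open(target, "w") as f:
            f.write("probe")
        try:
            os.symlink(target, link)
        except OSError:
            # Same failure mode as in the traceback (for example Errno 95).
            return False
        return os.path.islink(link)


if __name__ == "__main__":
    # Assumed locations: the local Colab disk and a mounted Google Drive.
    for path in ("/content", "/content/drive/MyDrive"):
        if os.path.isdir(path):
            print(path, "supports symlinks:", supports_symlinks(path))
```

If the probe returns False for the folder that holds the clone, running the training from a folder where it returns True should avoid this particular `os.symlink` failure.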