dusty-nv / jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
https://developer.nvidia.com/embedded/twodaystoademo
MIT License

RuntimeError: CUDA error: too many resources requested for launch #1828

Open maaaxac opened 5 months ago

maaaxac commented 5 months ago

Hi @dusty-nv, I'm trying to run train_ssd.py on Open Images data that I downloaded with (python3 open_images_downloader.py --max-images=500 --class-names "Apple,Orange,Banana,Strawberry,Grape,Pear,Pineapple,Watermelon" --data=data/fruit)

This is the output I get. Can you tell me what's wrong? Thanks in advance.

python3 train_ssd.py --data=data/fruit --model-dir=models/fruit --batch-size=1 --num-workers=1 --epochs=1
2024-04-15 10:05:38 - Using CUDA...
2024-04-15 10:05:38 - Namespace(balance_data=False, base_net=None, base_net_lr=0.001, batch_size=1, checkpoint_folder='models/fruit', dataset_type='open_images', datasets=['data/fruit'], debug_steps=10, extra_layers_lr=None, freeze_base_net=False, freeze_net=False, gamma=0.1, log_level='info', lr=0.01, mb2_width_mult=1.0, milestones='80,100', momentum=0.9, net='mb1-ssd', num_epochs=1, num_workers=1, pretrained_ssd='models/mobilenet-v1-ssd-mp-0_675.pth', resolution=300, resume=None, scheduler='cosine', t_max=100, use_cuda=True, validation_epochs=1, validation_mean_ap=False, weight_decay=0.0005)
2024-04-15 10:06:45 - model resolution 300x300
2024-04-15 10:06:45 - SSDSpec(feature_map_size=19, shrinkage=16, box_sizes=SSDBoxSizes(min=60, max=105), aspect_ratios=[2, 3])
2024-04-15 10:06:45 - SSDSpec(feature_map_size=10, shrinkage=32, box_sizes=SSDBoxSizes(min=105, max=150), aspect_ratios=[2, 3])
2024-04-15 10:06:45 - SSDSpec(feature_map_size=5, shrinkage=64, box_sizes=SSDBoxSizes(min=150, max=195), aspect_ratios=[2, 3])
2024-04-15 10:06:45 - SSDSpec(feature_map_size=3, shrinkage=100, box_sizes=SSDBoxSizes(min=195, max=240), aspect_ratios=[2, 3])
2024-04-15 10:06:45 - SSDSpec(feature_map_size=2, shrinkage=150, box_sizes=SSDBoxSizes(min=240, max=285), aspect_ratios=[2, 3])
2024-04-15 10:06:45 - SSDSpec(feature_map_size=1, shrinkage=300, box_sizes=SSDBoxSizes(min=285, max=330), aspect_ratios=[2, 3])
2024-04-15 10:06:51 - Prepare training datasets.
2024-04-15 10:06:51 - loading annotations from: data/fruit/sub-train-annotations-bbox.csv
2024-04-15 10:06:52 - annotations loaded from: data/fruit/sub-train-annotations-bbox.csv num images: 404
2024-04-15 10:06:54 - Dataset Summary:Number of Images: 404
Minimum Number of Images for a Class: -1
Label Distribution:
    Apple: 261
    Banana: 113
    Grape: 136
    Orange: 599
    Pear: 191
    Pineapple: 47
    Strawberry: 550
    Watermelon: 50
2024-04-15 10:06:54 - Stored labels into file models/fruit/labels.txt.
2024-04-15 10:06:54 - Train dataset size: 404
2024-04-15 10:06:54 - Prepare Validation datasets.
2024-04-15 10:06:54 - loading annotations from: data/fruit/sub-test-annotations-bbox.csv
2024-04-15 10:06:54 - annotations loaded from: data/fruit/sub-test-annotations-bbox.csv num images: 73
2024-04-15 10:06:55 - Dataset Summary:Number of Images: 73
Minimum Number of Images for a Class: -1
Label Distribution:
    Apple: 11
    Banana: 9
    Grape: 21
    Orange: 62
    Pear: 6
    Pineapple: 10
    Strawberry: 73
    Watermelon: 11
2024-04-15 10:06:55 - Validation dataset size: 73
2024-04-15 10:06:55 - Build network.
2024-04-15 10:06:58 - Init from pretrained SSD models/mobilenet-v1-ssd-mp-0_675.pth
2024-04-15 10:07:01 - Took 2.97 seconds to load the model.
2024-04-15 10:07:02 - Learning rate: 0.01, Base net learning rate: 0.001, Extra Layers learning rate: 0.01.
2024-04-15 10:07:02 - Uses CosineAnnealingLR scheduler.
2024-04-15 10:07:02 - Start training from epoch 0.
/usr/local/lib/python3.6/dist-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
  warnings.warn(warning.format(ret))
Traceback (most recent call last):
  File "train_ssd.py", line 406, in <module>
    train(train_loader, net, criterion, optimizer, device=DEVICE, debug_steps=args.debug_steps, epoch=epoch)
  File "train_ssd.py", line 149, in train
    loss.backward()
  File "/usr/local/lib/python3.6/dist-packages/torch/_tensor.py", line 255, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py", line 149, in backward
    allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
RuntimeError: CUDA error: too many resources requested for launch
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.