fizyr / keras-retinanet

Keras implementation of RetinaNet object detection.
Apache License 2.0
4.38k stars, 1.96k forks

Multiple values for keyword argument 'training' #1415

Closed: divyanshmishra19 closed this issue 4 years ago

divyanshmishra19 commented 4 years ago

I keep getting this error when training on my own dataset: "TypeError: type object got multiple values for keyword argument 'training'"

I got this when I had multiple annotations for the same image in the CSV files. After that, I modified my dataset to have one annotation per image, but I'm still getting this error. In the comments under the YouTube tutorial for this, a lot of people got the exact same error, but there appears to be no fix. Since I'm only starting out with ML, any help would be appreciated.

This is the code I ran to train my own model:

!keras_retinanet/bin/train.py \
    --freeze-backbone \
    --random-transform \
    --weights {PRETRAINED_MODEL} \
    --batch-size 8 \
    --steps 500 \
    --epochs 10 \
    csv "/content/drive/My Drive/Colab Notebooks/annotation.csv" "/content/drive/My Drive/Colab Notebooks/classes.csv"

Here is the full error:

Creating model, this may take a second...
2020-07-12 08:50:45.446567: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-12 08:50:45.499292: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-12 08:50:45.499910: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: pciBusID: 0000:00:04.0 name: Tesla P100-PCIE-16GB computeCapability: 6.0 coreClock: 1.3285GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-07-12 08:50:45.500313: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-12 08:50:45.728251: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-12 08:50:45.864421: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-12 08:50:45.887211: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-12 08:50:46.177624: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-12 08:50:46.213348: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-12 08:50:46.740295: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-12 08:50:46.740491: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-12 08:50:46.741202: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-12 08:50:46.741736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-12 08:50:46.742272: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-12 08:50:46.748249: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2200000000 Hz
2020-07-12 08:50:46.748493: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1758bc0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-12 08:50:46.748528: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-07-12 08:50:46.862641: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-12 08:50:46.863498: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1758d80 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-07-12 08:50:46.863532: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P100-PCIE-16GB, Compute Capability 6.0
2020-07-12 08:50:46.864576: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-12 08:50:46.865145: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: pciBusID: 0000:00:04.0 name: Tesla P100-PCIE-16GB computeCapability: 6.0 coreClock: 1.3285GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-07-12 08:50:46.865221: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-12 08:50:46.865272: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-12 08:50:46.865297: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-12 08:50:46.865322: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-12 08:50:46.865342: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-12 08:50:46.865362: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-12 08:50:46.865383: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-12 08:50:46.865457: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-12 08:50:46.866077: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-12 08:50:46.866667: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-12 08:50:46.866740: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-12 08:50:46.868050: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-12 08:50:46.868080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-07-12 08:50:46.868091: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-07-12 08:50:46.868220: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-12 08:50:46.868831: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-12 08:50:46.869359: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2020-07-12 08:50:46.869418: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15056 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
Traceback (most recent call last):
  File "keras_retinanet/bin/train.py", line 547, in <module>
    main()
  File "keras_retinanet/bin/train.py", line 507, in main
    config=args.config
  File "keras_retinanet/bin/train.py", line 117, in create_models
    model = model_with_weights(backbone_retinanet(num_classes, num_anchors=num_anchors, modifier=modifier, pyramid_levels=pyramid_levels), weights=weights, skip_mismatch=True)
  File "keras_retinanet/bin/../../keras_retinanet/models/resnet.py", line 38, in retinanet
    return resnet_retinanet(*args, backbone=self.backbone, **kwargs)
  File "keras_retinanet/bin/../../keras_retinanet/models/resnet.py", line 99, in resnet_retinanet
    resnet = keras_resnet.models.ResNet50(inputs, include_top=False, freeze_bn=True)
  File "/usr/local/lib/python3.6/dist-packages/keras_resnet/models/_2d.py", line 188, in ResNet50
    return ResNet(inputs, blocks, numerical_names=numerical_names, block=keras_resnet.blocks.bottleneck_2d, include_top=include_top, classes=classes, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/keras_resnet/models/_2d.py", line 66, in ResNet
    x = keras_resnet.layers.BatchNormalization(axis=axis, epsilon=1e-5, freeze=freeze_bn, name="bn_conv1")(x)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 922, in __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py", line 265, in wrapper
    raise e.ag_error_metadata.to_exception(e)
TypeError: in user code:

/usr/local/lib/python3.6/dist-packages/keras_resnet/layers/_batch_normalization.py:17 call  *
    return super(BatchNormalization, self).call(training=(not self.freeze), *args, **kwargs)

TypeError: type object got multiple values for keyword argument 'training'
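
The failing line in the traceback hard-codes `training=` while also forwarding `**kwargs`. The following is a minimal sketch of how that pattern can raise this error in plain Python, with no TensorFlow involved. The `Base` and `FrozenLayer` classes are hypothetical stand-ins, not the real `keras_resnet` code, and the assumption that the calling framework itself supplies `training` is inferred from the traceback rather than confirmed in this thread.

```python
class Base:
    """Hypothetical stand-in for a layer whose call() accepts a `training` flag."""

    def call(self, inputs, training=None):
        return inputs, training


class FrozenLayer(Base):
    """Hypothetical stand-in mirroring the shape of the line in the traceback."""

    def __init__(self, freeze):
        self.freeze = freeze

    def call(self, *args, **kwargs):
        # `training` is fixed here, so it collides whenever the caller
        # also passes training=... through **kwargs.
        return super().call(training=(not self.freeze), *args, **kwargs)


layer = FrozenLayer(freeze=True)

# The caller does not pass `training`: the call goes through.
ok_result = layer.call("x")
print(ok_result)

# The caller passes `training` as well (assumed to be what the newer
# framework version does): Python rejects the duplicate keyword.
try:
    layer.call("x", training=False)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The exact prefix of the message varies between Python versions, but the "got multiple values for keyword argument 'training'" part is the same as in the log above.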

stvogel commented 4 years ago

I ran into the same error. It seems that you found the cause on your side, @divyanshmishra19. Could you please leave a note on what you found and how you solved it? It would be much appreciated.

divyanshmishra19 commented 4 years ago

Yeah, you need to downgrade the Keras version if I remember right. It works perfectly after that. I don’t remember the exact version but you should check in the solved issues section.

stvogel commented 4 years ago

You're right, it was the wrong version of Keras; there are already some other issues referencing the same subject. `pip install --upgrade keras==2.3.0` is the fix.
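
Before reinstalling, it can help to confirm which Keras version the environment actually has. This is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+). The helper names `installed_version` and `matches_pin` are made up for illustration, and `2.3.0` is simply the version quoted in this comment.

```python
from importlib import metadata

PINNED_KERAS = "2.3.0"  # version quoted in this thread


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(found, pinned):
    """True only when a version was found and equals the pinned one."""
    return found is not None and found == pinned


found = installed_version("keras")
if matches_pin(found, PINNED_KERAS):
    print(f"keras {found} matches the pinned version")
else:
    print(f"keras is {found!r}; the thread suggests pinning {PINNED_KERAS}")
```

In a Colab notebook the runtime usually needs a restart after the `pip install` before the new version is picked up.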

divyanshmishra19 commented 4 years ago

Glad you were able to fix it.
