Open SalehShmali opened 3 years ago
Are you doing this without modifying anything or are you trying to implement your own code?
without modifying anything
Probably a tf version mismatch. Try tf 2.0 rather than any other version.
My tf version is 2.3.1. Is that a mismatch? Also, I am working on CPU, not GPU.
Yes this doesn’t work in anything besides tf 2.0
Also, you're going to have a hell of a long time training it on CPU. On my powerful GPU, training on my custom dataset takes roughly a day and a half.
I tried tf 2.0, but then tensorflow datasets raised an error like this: google.protobuf.json_format.ParseError: Message type "tensorflow_datasets.DatasetInfo" has no field named "downloadSize". Available Fields(except extensions): ['name', 'description', 'version', 'citation', 'sizeInBytes', 'location', 'downloadChecksums', 'schema', 'splits', 'supervisedKeys', 'redistributionInfo']
Which version of tfds is suitable?
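When asking which versions fit together, it helps to post the exact versions that are installed. A minimal sketch, assuming Python 3.8+ and that the packages were installed under the distribution names `tensorflow` and `tensorflow-datasets` (variants such as `tensorflow-cpu` would need to be added to the list). The helper name `installed_versions` is made up for this example.

```python
from importlib import metadata


def installed_versions(packages=("tensorflow", "tensorflow-datasets")):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so the result can be pasted into the issue.
for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

This only reports what is installed. It does not say which tfds release matches tf 2.0.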
Hi, I resolved this issue by using a batch size of 1. I'm looking to make it work with batch size > 1, and I'll let you know if I figure out how. I'm using tf 2.4.
The batch_size problem with tf 2.4 is the result of a change in the huber loss function: https://github.com/tensorflow/tensorflow/blob/v2.4.0/tensorflow/python/keras/losses.py#L1426 https://github.com/tensorflow/tensorflow/blob/v2.0.0/tensorflow/python/keras/losses.py#L885
With tf 2.0:
tf.losses.Huber(reduction=tf.losses.Reduction.NONE)(tf.ones(shape=(2,16500,4)),tf.ones(shape=(2,16500,4))).shape
TensorShape([2, 16500, 4])
With tf 2.4:
tf.losses.Huber(reduction=tf.losses.Reduction.NONE)(tf.ones(shape=(2,16500,4)),tf.ones(shape=(2,16500,4))).shape
TensorShape([2, 16500])
Fixing the bug for tf 2.4 is easy: Remove the additional reduce_sum in the reg_loss in line 218. https://github.com/FurkanOM/tf-faster-rcnn/blob/master/utils/train_utils.py#L218
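If the same code has to run on both versions, one option is to reduce over the coordinate axis only when it is still there. The sketch below uses NumPy stand-ins rather than the repository's code: `huber_elementwise` mimics the older behaviour (no reduction), `huber_last_axis_mean` mimics the newer one (mean over the last axis), and `per_box_loss` is a hypothetical helper that checks the rank before reducing.

```python
import numpy as np


def huber_elementwise(y_true, y_pred, delta=1.0):
    """Element-wise Huber loss; output has the same shape as the inputs."""
    abs_error = np.abs(y_pred - y_true)
    quadratic = np.minimum(abs_error, delta)
    linear = abs_error - quadratic
    return 0.5 * quadratic ** 2 + delta * linear


def huber_last_axis_mean(y_true, y_pred, delta=1.0):
    """Huber loss averaged over the last axis; output drops that axis."""
    return huber_elementwise(y_true, y_pred, delta).mean(axis=-1)


def per_box_loss(raw_loss, target_rank):
    """Sum over the coordinate axis only if the loss still has it."""
    if raw_loss.ndim == target_rank:
        return raw_loss.sum(axis=-1)
    return raw_loss


y_true = np.ones((2, 16500, 4))
y_pred = np.zeros((2, 16500, 4))

old_style = per_box_loss(huber_elementwise(y_true, y_pred), y_true.ndim)
new_style = per_box_loss(huber_last_axis_mean(y_true, y_pred), y_true.ndim)

print(old_style.shape, new_style.shape)
```

Both results have one value per box, but they are not numerically identical: the older path sums the four coordinates while the newer one averages them, so the values differ by a factor of 4 here. Whether that matters for training is a separate question.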
Does anyone know how to handle this with TensorFlow 2.8.0? The error now seems to have changed to:
ValueError: in user code:
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1021, in train_function *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1010, in step_function **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1000, in run_step **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 861, in train_step
        self._validate_target_and_loss(y, loss)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 819, in _validate_target_and_loss
        'Target data is missing. Your model was compiled with '
    ValueError: Target data is missing. Your model was compiled with loss=ListWrapper([None, None, None, None, None, None, None, None, None]), and therefore expects target data to be provided in 'fit()'
Hi @ruman1609, Try the solution I proposed in #17 by changing the format of inputs and targets. It seems you don't give target data to the fit function. Otherwise, downgrade your tf version.
Hi @colindecourt, I already did what you and @wr0112358 did, but the error still occurred. The error log I sent before was from TF 2.8.0, which is why I downgraded to TF 2.1.0, the version I am using right now.
I get two errors:
1- In faster rcnn: AssertionError: Could not compute output Tensor("rpn_reg_loss/truediv:0", shape=(), dtype=float32)
2- In rpn alone: Incompatible shapes: [8,9216] vs. [8] [[node gradient_tape/reg_loss/mul/BroadcastGradientArgs (defined at e:/faster_rcnn/rpn_trainer.py:60) ]] [Op:__inference_train_function_12539]
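The shapes in the second error ([8,9216] vs. [8]) look like what happens when a per-anchor loss is reduced once too often and then multiplied by a per-anchor mask. This is a guess based on the shapes alone, not a confirmed diagnosis. A small NumPy sketch under that assumption, with made-up names (`pos_mask`, `per_anchor_loss`), also shows why a batch size of 1 would hide the problem:

```python
import numpy as np

batch, anchors = 8, 9216

# Hypothetical stand-ins: a per-anchor mask and a loss that already has
# one value per anchor (the coordinate axis is gone).
pos_mask = np.ones((batch, anchors))
per_anchor_loss = np.full((batch, anchors), 0.5)

# An extra sum over the last axis collapses the anchors as well.
over_reduced = per_anchor_loss.sum(axis=-1)  # shape (8,)

try:
    over_reduced * pos_mask  # (8,) vs (8, 9216) cannot broadcast
    mismatch = False
except ValueError:
    mismatch = True

# With batch size 1 the same mistake broadcasts silently: (1,) vs (1, 9216).
single = np.full((1, anchors), 0.5).sum(axis=-1)
silent_shape = (single * np.ones((1, anchors))).shape

print(mismatch, silent_shape)
```

In the batch-size-1 case the product has a valid shape, but every anchor is scaled by the total loss, so the lack of an error there does not mean the loss is correct.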