Open NielsRogge opened 1 year ago
cool !
Hi,
Would it be possible to get a detailed script for training from a pre-trained model? Following the tutorial used for DETR causes problems.
Can you clarify which issues you had?
I have changed DetrForObjectDetection to DeformableDetrForObjectDetection in the DETR class, and I have also changed DetrFeatureExtractor to AutoFeatureExtractor. Finally, I set the batch size to 1 in the dataloaders. Training runs correctly, but when evaluating I get the following error:
IndexError: max(): Expected reduction dim 2 to have non-zero size.
Looking at the inference script, I see that AutoFeatureExtractor is used to prepare the input and pass it to the model, so I am not sure what is going wrong.
I attach the notebook I am using so that you can see it more easily: Prueba_train_deformable_detr.zip
Is there any script to draw attention map of deformable detr in powerful HuggingFace?
You can just follow this notebook where I show how to visualize attention maps of the decoder. Make sure to replace DetrForObjectDetection by DeformableDetrForObjectDetection.
I think the way to draw the attention map of Deformable DETR is quite different from DETR, since it uses reference points and sampling offsets.
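To make that difference concrete: in deformable attention a query attends to a handful of sampled points, each located at a reference point plus an offset, instead of to every pixel. The following is only a rough, library-independent sketch of turning such points into a coarse map. The function name, the nearest-cell rounding (the real operator interpolates bilinearly) and all numbers are made up for illustration and are not part of the Transformers API.

```python
def sparse_attention_map(reference_point, offsets, weights, height, width):
    """Accumulate attention weights at sampled locations on a height x width grid.

    reference_point: (x, y), normalized to [0, 1]
    offsets:         list of (dx, dy), in feature-map pixels
    weights:         one attention weight per sampled point
    """
    heat = [[0.0] * width for _ in range(height)]
    ref_x, ref_y = reference_point
    for (dx, dy), weight in zip(offsets, weights):
        # Offsets are divided by the feature-map size so that they live in
        # the same normalized space as the reference point.
        loc_x = ref_x + dx / width
        loc_y = ref_y + dy / height
        # Simplification: snap to the nearest cell instead of interpolating.
        col = min(max(int(loc_x * width), 0), width - 1)
        row = min(max(int(loc_y * height), 0), height - 1)
        heat[row][col] += weight
    return heat


# Hypothetical query: one reference point, four sampled points.
heat = sparse_attention_map(
    reference_point=(0.5, 0.5),
    offsets=[(0, 0), (1, 0), (-1, -1), (0, 1)],
    weights=[0.4, 0.3, 0.2, 0.1],
    height=4,
    width=4,
)
for row in heat:
    print(row)
```

Most cells stay at zero, which is why a DETR-style dense attention plot does not carry over directly.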
Hi, I'm new to Hugging Face so I might be missing something obvious, but when I try to import DeformableDetrForObjectDetection from transformers (I've checked and I have the latest version) I get an ImportError.
Also the feature extractor doesn't work as feature_extractor_class_from_name('DeformableDetrFeatureExtractor') so that feature_extractor = AutoFeatureExtractor.from_pretrained("SenseTime/deformable-detr") fails with AttributeError: 'NoneType' object has no attribute 'from_dict'
Solved it by installing with pip install -q git+https://github.com/huggingface/transformers.git instead of from PyPI.
Hi,
Thanks for reporting. We indeed fixed Deformable DETR's feature extractor, as seen in #19140. It will be included in the next PyPI release.
Alright, good to hear! By the way kudos for the great work.
One last question: do you know if anyone has tried to convert the model to TensorRT? I was able to export an ONNX model using opset=16, but TensorRT doesn't have any implementation of GridSample, so I could not proceed further.
@NielsRogge Hi, and thanks for the great work. While implementing the code for fine-tuning Deformable DETR on my dataset, I realized that len(train_dataset) and len(val_dataset) are smaller than the real number of training and validation files. The dataset is fine, since I have successfully fine-tuned DETR on it, but I'm guessing the issue arises when I use DeformableDetrFeatureExtractor. Would you by any chance know of a reason why this happens?
Hi,
Hmm normally that shouldn't be changed because of the feature extractor. Did you create a regular PyTorch dataset?
Yes, it's basically just a number of images and their annotations. I have tested the annotations file with multiple COCO viewers, and the same thing did not happen while fine-tuning DETR on the same dataset.
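The thread does not settle on a cause for the smaller dataset length. One way to narrow it down, assuming a COCO-format annotation file, is to count the image entries, the images that actually have annotations, and the annotations that point to unknown images, and then compare those numbers with len(train_dataset). The helper name and the tiny inline example below are hypothetical; in practice the dictionary would come from json.load on the real annotation file.

```python
from collections import Counter


def summarize_coco(coco: dict) -> dict:
    """Count images and annotations in a COCO-style dictionary."""
    image_ids = [img["id"] for img in coco["images"]]
    known_ids = set(image_ids)
    anns_per_image = Counter(ann["image_id"] for ann in coco["annotations"])
    return {
        "image_entries": len(image_ids),
        "unique_image_ids": len(known_ids),
        "images_with_annotations": sum(
            1 for image_id in known_ids if anns_per_image[image_id] > 0
        ),
        "annotations_for_unknown_images": sum(
            count for image_id, count in anns_per_image.items()
            if image_id not in known_ids
        ),
    }


# Tiny made-up example: image 3 has no annotations, and one annotation
# refers to an image id (99) that is not listed under "images".
example = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"id": 10, "image_id": 1},
        {"id": 11, "image_id": 1},
        {"id": 12, "image_id": 2},
        {"id": 13, "image_id": 99},
    ],
}
summary = summarize_coco(example)
print(summary)
```

If len(train_dataset) matches images_with_annotations rather than image_entries, images without annotations are probably being dropped somewhere in the pipeline; that is only one possible explanation.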
Regarding the ONNX export using opset=16: were you using the transformers library? I am trying to export to ONNX but it results in an error. Could you please share your script if you can?
Have you tried exporting the model to TorchScript format for inference in a C++ environment? I tried, but the exported model doesn't work, so I'd like to ask for your help. I had some problems converting and exporting the model; I finally got a model through the trace method, but running it fails with: RuntimeError: The size of tensor a (32) must match the size of tensor b (237) at non-singleton dimension 1
Could it be that you traced the model with fixed batch size of 237 and you are trying to run inference with a batch of size 32? @Zalways
Thank you for your reply! That does not seem to be the cause of the error: I tried passing a tensor of a specific shape and it still fails. I wonder whether the model was exported correctly (but I don't know how to check that my exported model is right). My current exported model can only run inference on the image I used for the export; for any other image or tensor it shows an error like: The size of tensor a (xxx) must match the size of tensor b (xxx) at non-singleton dimension 0. I've been confused by this for a long time.
are the other images the same shape as the one you used for tracing?
@andrearosasco Yes. When I use another image, it is preprocessed in the same way, and I also tried passing a tensor of the same shape to the model, but it doesn't work. It's very strange and I have no idea what causes it. I hope someone can help me with this issue.
When I try to export the Deformable DETR model to TorchScript, it shows this error message:
Could not export Python function call 'MSDeformAttnFunction'. Remove calls to Python functions before export. Did you forget to add @script or @script_method annotation? If this is a nn.ModuleList, add it to constants:
/root/autodl-tmp/project/deepsolo/adet/layers/ms_deform_attn.py(165): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/autodl-tmp/project/deepsolo/adet/layers/deformable_transformer.py(286): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/autodl-tmp/project/deepsolo/adet/layers/deformable_transformer.py(413): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/autodl-tmp/project/deepsolo/adet/layers/deformable_transformer.py(173): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/autodl-tmp/project/deepsolo/adet/modeling/model/detection_transformer.py(200): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/autodl-tmp/project/deepsolo/adet/modeling/text_spotter_v1.py(222): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/export/flatten.py(259):
Does anybody know how to solve it?
I tried the method:
It exported successfully, but the exported model doesn't work! When I use this exported model for inference, it shows this error message:
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py:1051: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
return forward_call(*input, **kwargs)
Traceback (most recent call last):
File "/root/autodl-tmp/project/deploy/export_model.py", line 264, in
return (_0, _1, _2, _3, _4, _5, _6)
File "code/__torch__/adet/modeling/text_spotter.py", line 23, in forward
batched_imgs = torch.unsqueeze_(_7, 0)
x0 = torch.contiguous(batched_imgs)
_8, _9, _10, _11, = (_0).forward(x0, image_size, )
~~~~~~~~~~~ <--- HERE
_12 = torch.softmax(_9, -1)
prob = torch.sigmoid(torch.mean(_8, [-2]))
File "code/__torch__/adet/modeling/model/detection_transformer.py", line 50, in forward
_29 = getattr(self.input_proj, "1")
_30 = getattr(self.input_proj, "0")
_31 = (self.backbone).forward(x, image_size, )
~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_32, _33, _34, _35, _36, _37, _38, _39, _40, _41, _42, _43, _44, _45, _46, _47, _48, _49, _50, _51, _52, _53, _54, _55, _56, _57, = _31
_58 = (_30).forward(_32, )
File "code/__torch__/adet/modeling/text_spotter.py", line 104, in forward
image_size: Tensor) -> Tuple[Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, Tensor]:
_61 = getattr(self, "1")
_62 = (getattr(self, "0")).forward(x, image_size, )
~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_63, _64, _65, _66, _67, _68, _69, = _62
pos_embed = torch.to((_61).forward(_63, ), 6)
File "code/__torch__/adet/modeling/text_spotter.py", line 143, in forward
_92 = torch.slice(torch.slice(_91, 0, 0, 125), 1, 0, 138)
_93 = torch.view(CONSTANTS.c2, annotate(List[int], []))
_94 = torch.copy_(_92, torch.expand(_93, [125, 138]))
~~~~~~~~~~~ <--- HERE
masks_per_feature_level0 = torch.ones([_85, _86, _87], dtype=11, layout=None, device=torch.device("cpu"), pin_memory=False)
_95 = torch.select(masks_per_feature_level0, 0, 0)
Traceback of TorchScript, original code (most recent call last):
/root/autodl-tmp/project/adet/modeling/text_spotter.py(60): mask_out_padding
/root/autodl-tmp/project/adet/modeling/text_spotter.py(43): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/autodl-tmp/project/adet/modeling/text_spotter.py(21): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/autodl-tmp/project/adet/modeling/model/detection_transformer.py(168): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/autodl-tmp/project/adet/modeling/text_spotter.py(220): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/export/flatten.py(259): <lambda>
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/export/flatten.py(294): forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1039): _slow_forward
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/nn/modules/module.py(1051): _call_impl
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/jit/_trace.py(952): trace_module
/root/miniconda3/envs/deepsolo/lib/python3.8/site-packages/torch/jit/_trace.py(735): trace
/root/autodl-tmp/project/deploy/export_model.py(125): export_tracing
/root/autodl-tmp/project/deploy/export_model.py(224): <module>
/root/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py(18): execfile
/root/.pycharm_helpers/pydev/pydevd.py(1496): _exec
/root/.pycharm_helpers/pydev/pydevd.py(1489): run
/root/.pycharm_helpers/pydev/pydevd.py(2177): main
/root/.pycharm_helpers/pydev/pydevd.py(2195): <module>
RuntimeError: The size of tensor a (50) must match the size of tensor b (125) at non-singleton dimension 0
So I think the problem may occur in the export step: Could not export Python function call 'MSDeformAttnFunction'
Looking forward to your reply!
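The TorchScript code in the traceback above contains literal sizes (125 and 138), which is what tracing produces when a shape-dependent Python value is recorded as a constant. Whether that is the whole story for this model is not confirmed in the thread. The effect itself can be reproduced with a minimal sketch, using a hypothetical weight_rows function rather than any code from the project above:

```python
import torch


def weight_rows(x: torch.Tensor) -> torch.Tensor:
    # int() turns the traced size into a plain Python number, so the tracer
    # stores it as a constant (PyTorch emits a TracerWarning at this point).
    num_rows = int(x.shape[0])
    weights = torch.arange(num_rows, dtype=x.dtype).unsqueeze(1)
    return x * weights


# Trace with a 3 x 4 example input; the value 3 is now baked into the graph.
traced = torch.jit.trace(weight_rows, torch.ones(3, 4))

# An input with the traced shape works.
same_shape = traced(torch.ones(3, 4))

# An input with a different number of rows hits a size mismatch.
try:
    traced(torch.ones(5, 4))
    failed = False
except RuntimeError as err:
    failed = True
    print(err)
```

Searching the traced graph (for example via traced.code) for literal sizes that match the example image is a quick way to spot this kind of constant.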
@NielsRogge
I've been training the Hugging Face models DetrForObjectDetection and DeformableDetrForObjectDetection with PyTorch Lightning. I saw that DETR trains about 5 times faster (in terms of batch processing) than Deformable DETR.
Is this expected behaviour?
Hmm pinging @qubvel here. He just added official example scripts for object detection: https://github.com/huggingface/transformers/tree/main/examples/pytorch/object-detection. Works with DETR, Deformable DETR among other models.
Hi @JannikZgraggenTR, I also observed similar behavior: I would say Deformable DETR takes 3x more time to process a batch during training, but it converges faster and to a better optimum. Both models were trained for 100 epochs on the cppe-5 dataset; the X-axis is time here.
You can replicate results with examples from HF, but it uses Trainer and Accelerate, not Lightning
@qubvel were you leveraging the custom CUDA kernel for the deformable attention operator?
Not sure I did it unless it is used by default. Is there any reference on how to enable it?
It looks like the kernels are already enabled by default: https://github.com/huggingface/transformers/blob/3802e786ef64b13bef5e8669dcb96e291d2c5317/src/transformers/models/deformable_detr/configuration_deformable_detr.py#L195
Just checked the trained model config: disable_custom_kernels is false.
In terms of FLOPs, the Deformable DETR paper reports ~2x compared to DETR, but less training time due to faster convergence.
@qubvel thanks for the great insight, around 2x is now observed by me as well.
version_162 is normal DETR (8.8 seconds)
version_166 is deform_detr (18.4 seconds)
version_165 is Deformable DETR with two-stage and box refinement (58.8 seconds)
According to Table 1 in the Conditional DETR paper, however, iterative box refinement should not add computational cost? (@qubvel)
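For reference, the timings quoted above translate into the following slowdowns relative to plain DETR. This is simple arithmetic on the three numbers from the comment; the dictionary keys are shorthand labels, and what exactly each "seconds" figure measures is not stated in the thread.

```python
# Timings in seconds, as reported in the comment above.
timings = {
    "DETR (version_162)": 8.8,
    "Deformable DETR (version_166)": 18.4,
    "Deformable DETR, two-stage + box refinement (version_165)": 58.8,
}

baseline = timings["DETR (version_162)"]
ratios = {name: round(seconds / baseline, 2) for name, seconds in timings.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
```

So the plain Deformable DETR run is roughly 2.1x slower than DETR, and the two-stage variant with box refinement is roughly 6.7x slower.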
@NielsRogge thanks a lot for your response. I have disable_custom_kernels = False for the Deformable DETR models. I originally became interested in Deformable DETR because I saw that TableTransformerForObjectDetection (DETR) struggles with tables with many small rows, most likely due to how its attention works. What was the motivation for training DETR on PubTables-1M rather than Deformable DETR? (Wouldn't Deformable DETR be strictly superior?)
Hi,
Deformable DETR is now available in 🤗 Transformers: https://huggingface.co/docs/transformers/main/en/model_doc/deformable_detr.
All checkpoints are on the hub: https://huggingface.co/models?other=deformable_detr.
The implementation supports both CPU and GPU (and you can choose to use the custom kernel or not when running on GPU). 🥳
Inference
For inference, I refer to the example code snippet in the docs.
Fine-tuning on custom data
For fine-tuning, I refer to this demo notebook, illustrating how to fine-tune the model. Fine-tuning Deformable DETR is equivalent to fine-tuning DETR (just replace DetrForObjectDetection in the notebook by DeformableDetrForObjectDetection).