Open deshwalmahesh opened 2 years ago
Any solution to this?
No solution to this yet. For now, you can use Detectron2, as the official DiT object detection code does.
Yes, for the moment you need to use Detectron2 if you want to use DiT + Mask R-CNN.
However, I'm working on adding support for it in Transformers.
Hi @NielsRogge, any update on this? I assume it's probably lower priority for you. Just curious.
Downgrading `transformers` to version 4.32.0 worked for me.
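For anyone trying the same workaround, the pin can be installed with `pip install transformers==4.32.0`. Below is a minimal sketch for confirming which version an environment actually picked up. It uses only the standard library; `installed_version` and `matches_pin` are hypothetical helper names, not part of Transformers, and 4.32.0 is simply the version reported to work in this thread.

```python
from importlib.metadata import PackageNotFoundError, version

# Version reported to work in this thread (an anecdote, not an official requirement).
PINNED = "4.32.0"


def installed_version(package="transformers"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(installed, pinned=PINNED):
    """True only when the installed version string equals the pinned one."""
    return installed is not None and installed == pinned


if __name__ == "__main__":
    found = installed_version()
    print(f"transformers installed: {found}, matches pin: {matches_pin(found)}")
```

Running this before training makes it obvious when a notebook kernel is still using a newer release than the one that was just installed.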
I want to fine-tune DiT for object detection (text and diagram detection only) on my own dataset. I have been searching the web for quite some time but could not find anything on fine-tuning a Transformers backbone for object detection.
Your GitHub answer on using a custom backbone with DETR describes how to change the backbone, and you said that ANY model from the `timm` library can be used. There are almost 890 models in it but, unfortunately, not DiT. Hugging Face supports feature extraction for the model via `BeitFeatureExtractor.from_pretrained("microsoft/dit-large")`, so I think it could be used as a backbone, but I found nothing on this either. I tried changing the code in your tutorial on training DETR on custom data by replacing the code in Cell 8, but while running the code in Cell 11 it gave me this error:
Can you please help me with the problem at hand?
Thank you :)