Open QiuSYang opened 5 years ago
This kind of task is not currently supported. There is a WIP PR for a single bounding box, which you may want to use as a temporary workaround if you have a single object to detect. Otherwise, adding an output feature for object detection wouldn't be that complicated, so you may consider contributing it.
Please leave this open. It's a reminder of a feature request.
@w4nderlust Hello. If I'm doing a multi-object detection task, can I solve the problem just by adding an output feature? How should I set up the RPN network and the ROI pooling layer? Do I need to add new encoder and decoder modules?
@QiuSYang thank you for your interest in working on this! I believe the first step would be to create a new type of feature, something like BoundingBoxSet or Bounding Boxes. There is an open PR for a single bounding box that you can take inspiration from: https://github.com/uber/ludwig/pull/344 Once that is tested and working, I would take a model, Faster-RCNN for instance, and split it into an encoding part and a decoding part. The decoding part becomes one of the optional decoders of the BoundingBoxSet output feature, while the encoding part goes into a new encoder for the Image input feature. I can help you out with this. I'm going to be traveling next week to present at conferences, but when I'm back I can take a stab at how to split the model into encoder and decoder so that it is easier for you. How does this sound?
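To make the encoder/decoder split described above more concrete, here is a minimal, framework-free sketch of the shape such a split might take. Everything in it is an assumption for illustration: `ToyImageEncoder`, `ToyBoundingBoxSetDecoder`, `BoundingBox`, and `detect` are hypothetical names, not Ludwig classes, and the average-pooling and thresholding logic is a toy stand-in, not Faster-RCNN (there is no RPN or ROI pooling here). It only shows the contract: the image side produces a feature map, and the output-feature side turns that map into a variable-length set of boxes.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[float]]       # grayscale image as rows of pixel intensities
FeatureMap = List[List[float]]  # coarse grid produced by the encoder


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical box record: pixel corners plus a confidence score."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int
    score: float


class ToyImageEncoder:
    """Stand-in for the backbone half: image -> feature map (average pooling)."""

    def __init__(self, cell: int = 2):
        self.cell = cell

    def __call__(self, image: Image) -> FeatureMap:
        rows = len(image) // self.cell
        cols = len(image[0]) // self.cell
        area = self.cell * self.cell
        features = []
        for r in range(rows):
            row = []
            for c in range(cols):
                # Average the pixels inside one cell x cell patch.
                total = sum(
                    image[r * self.cell + dy][c * self.cell + dx]
                    for dy in range(self.cell)
                    for dx in range(self.cell)
                )
                row.append(total / area)
            features.append(row)
        return features


class ToyBoundingBoxSetDecoder:
    """Stand-in for the detection-head half: feature map -> set of boxes."""

    def __init__(self, cell: int = 2, threshold: float = 0.5):
        self.cell = cell
        self.threshold = threshold

    def __call__(self, features: FeatureMap) -> List[BoundingBox]:
        boxes = []
        for r, row in enumerate(features):
            for c, activation in enumerate(row):
                # Emit one box per grid cell whose activation passes the threshold.
                if activation >= self.threshold:
                    boxes.append(BoundingBox(
                        c * self.cell, r * self.cell,
                        (c + 1) * self.cell, (r + 1) * self.cell,
                        activation,
                    ))
        return boxes


def detect(image: Image, encoder, decoder) -> List[BoundingBox]:
    """Compose the two halves the way an input/output feature pair would."""
    return decoder(encoder(image))


image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
boxes = detect(image, ToyImageEncoder(), ToyBoundingBoxSetDecoder())
for box in boxes:
    print(box)
```

In a real implementation the encoder would be the detector's convolutional backbone and the decoder would hold the proposal and box-regression heads, but the point where the two meet (a feature map handed from the Image input feature to the BoundingBoxSet output feature) would presumably look like the `detect` function above.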
That would be great. Thank you very much
Just adding more context: this issue is for single object detection: https://github.com/uber/ludwig/issues/331 This WIP PR starts implementing it: https://github.com/uber/ludwig/pull/344
Any updates about this feature?
@gustavorps unfortunately no, we have focused on other aspects in v0.3 and are now focusing on data preprocessing for v0.4. This feature will likely come in v0.5, but we would gladly accept contributions for this, which may accelerate the process.
If I need to train a Faster-RCNN model, how should I design my CSV file and YAML file?