Train a SegFormer on a Custom Dataset

roboflow / notebooks

Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like Grounding DINO and SAM.

https://roboflow.com/models

Train a SegFormer on a Custom Dataset #99

Closed jhayle21 closed 1 year ago

jhayle21 commented 1 year ago

Search before asking

[X] I have searched the Roboflow Notebooks issues and found no similar bug report.

Notebook name

Roboflow How to Train SegFormer.ipynb

Bug

REDUCE CODE PL TRAINER

I just copied the code and tried to run it, but it won't work.

Environment

- Google Colab

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

[X] Yes I'd like to help by submitting a PR!

SkalskiP commented 1 year ago

@Jacobsolawetz could you take a look at this bug report?

mazatov commented 1 year ago

I had the same error just now. Were you able to figure it out, @jhayle21 ?

robmarkcole commented 1 year ago

This is because the pytorch-lightning dependency is not pinned, so a fresh run now installs '2.0.1.post0', which has breaking changes from 1.x. In 2.x, Lightning discovers the available devices itself, so the GPU argument is no longer needed.

On commenting out the GPU arg, you can run but get error:

NotImplementedError: Support for `validation_epoch_end` has been removed in v2.0.0. `SegformerFinetuner` implements
this method. You can use the `on_validation_epoch_end` hook instead. To access outputs, save them in-memory as 
instance attributes.

Note that there is also an attribute error from feature_extractor.reduce_labels = False.

Therefore, this notebook needs updating for Lightning 2.0.

mazatov commented 1 year ago

@robmarkcole, thank you, I 100% agree. The same applies to test_epoch_end, which becomes on_test_epoch_end.
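
As a rough illustration of the migration discussed above, the sketch below moves the epoch-level aggregation into `on_validation_epoch_end` and `on_test_epoch_end` and keeps the step outputs in instance attributes, as the error message suggests. The class name `FinetunerSketch`, the attribute names, and the placeholder float "loss" are assumptions made for illustration, not the notebook's actual code. The fallback to `object` is only there so the snippet also runs where Lightning is not installed.

```python
try:
    import pytorch_lightning as pl
    _Base = pl.LightningModule
except ImportError:  # assumption: fall back so the sketch runs without Lightning
    _Base = object


class FinetunerSketch(_Base):
    """Hypothetical stand-in for the notebook's SegformerFinetuner."""

    def __init__(self):
        super().__init__()
        # Lightning 2.x no longer passes step outputs to an *_epoch_end method,
        # so they are collected here as instance attributes.
        self.validation_step_outputs = []
        self.test_step_outputs = []
        self.last_val_loss = None
        self.last_test_loss = None

    @staticmethod
    def _placeholder_loss(batch):
        # Placeholder: a real step would run the model and return its loss.
        return float(sum(batch)) / len(batch)

    @staticmethod
    def _mean(values):
        return sum(values) / len(values) if values else None

    def validation_step(self, batch, batch_idx):
        loss = self._placeholder_loss(batch)
        self.validation_step_outputs.append(loss)
        return loss

    def on_validation_epoch_end(self):
        # Replaces the 1.x hook validation_epoch_end(self, outputs).
        self.last_val_loss = self._mean(self.validation_step_outputs)
        self.validation_step_outputs.clear()  # free memory for the next epoch

    def test_step(self, batch, batch_idx):
        loss = self._placeholder_loss(batch)
        self.test_step_outputs.append(loss)
        return loss

    def on_test_epoch_end(self):
        # Replaces the 1.x hook test_epoch_end(self, outputs).
        self.last_test_loss = self._mean(self.test_step_outputs)
        self.test_step_outputs.clear()
```

In a real module the aggregated values would typically be logged rather than stored, and the lists would hold whatever the step methods need to compute metrics.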

SkalskiP commented 1 year ago

Hi @jhayle21, @mazatov and @robmarkcole 👋🏻! I just pushed a fixed version of our SegFormer notebook. It is not ideal, but it works as expected. What I did was:

- pytorch-lightning<2.0.0: pin PyTorch Lightning to the latest version below 2.0.0. In the future, we need to update the API used in the notebook, but for now it should be fine.
- allow the latest version of roboflow, to enable the new authentication API
- update the transformers API: reduce_labels -> do_reduce_labels

I'm closing the issue but feel free to reopen it in the future if needed.
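
For anyone reusing the notebook code elsewhere, a small guard like the one below can flag an environment where the pin was not applied (in Colab, the pin itself would normally be something like `pip install "pytorch-lightning<2.0.0"`). This is only a sketch: the helper names are hypothetical, and it assumes the package is installed under the distribution name `pytorch-lightning`.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '2.0.1.post0'."""
    return int(version_string.split(".")[0])


def uses_legacy_epoch_end_hooks(version_string):
    """True for Lightning 1.x, where validation_epoch_end is still supported."""
    return major_version(version_string) < 2


def installed_lightning_version():
    """Return the installed pytorch-lightning version, or None if it is absent."""
    try:
        return version("pytorch-lightning")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_lightning_version()
    if found is None:
        print("pytorch-lightning is not installed")
    elif uses_legacy_epoch_end_hooks(found):
        print(f"pytorch-lightning {found}: the 1.x hooks in the notebook should work")
    else:
        print(f"pytorch-lightning {found}: expect the validation_epoch_end error")
```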

mazatov commented 1 year ago

Thanks @SkalskiP, I was able to run it. I wonder if you can provide some input on the outputs I'm getting. I'm training a model on the SoccerNet dataset with segmented soccer pitch lines. The output of the model is a bit strange.

While it's obvious that it is learning something, it seems to struggle to learn the background class. I wonder if I have to do something special for that, since background is the largest class in the dataset. Currently it's labeled as 0 and has a special name in the class list.
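
One way to put a number on how dominant the background is would be to count pixels per class id over the label masks. The sketch below is a diagnostic only, not a fix. The function name and the toy mask are hypothetical, and real masks would be loaded from the dataset rather than written as nested lists.

```python
from collections import Counter


def class_pixel_fractions(masks):
    """Return {class_id: fraction of all pixels} over an iterable of 2D masks.

    Each mask is a list of rows, and each row is a list of integer class ids.
    """
    counts = Counter()
    for mask in masks:
        for row in mask:
            counts.update(row)
    total = sum(counts.values())
    return {cls: n / total for cls, n in sorted(counts.items())}


if __name__ == "__main__":
    # Toy 2x4 mask: class 0 is background, classes 1 and 2 are thin lines.
    toy_masks = [[[0, 0, 0, 1],
                  [0, 0, 2, 0]]]
    print(class_pixel_fractions(toy_masks))
```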

SkalskiP commented 1 year ago

Hi, @mazatov 👋🏻!

This looks like a very interesting project. How large is the dataset? How long did you train?

mazatov commented 1 year ago

The training set is about 16,000 images. I trained for only 10 epochs, hoping to see some progress before moving away from Colab to something where it won't get interrupted. I was surprised to see this type of output. Other networks usually give something quite reasonable very quickly.

SkalskiP commented 1 year ago

The main problem here is that the objects you are trying to detect are thin. In the past, I trained two models on objects like that, and I managed to do it with Mask R-CNN and YOLOv7.

mazatov commented 1 year ago

I've tried YOLOv7 and YOLOv8 and they do all right. I was just hoping to get better results with SegFormer or Mask2Former.