mindee / doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
https://mindee.github.io/doctr/
Apache License 2.0
3.6k stars 420 forks

[transforms] Extends the list of supported data augmentations #730

Closed fg-mindee closed 4 months ago

fg-mindee commented 2 years ago

As discussed in #654, artefact detection needs to become more robust. To get there and prevent overfitting, I would suggest gradually extending the list of our supported transformations:

Ideally, a given transform should be implemented in doctr/transforms/modules so that with the corresponding backend, we can do:

transfo = ...
pil_img = ...
augmented_img = transfo(pil_img)

for transformations that only change the image.

and:

transfo = ...
pil_img = ...
target = {...}
augmented_img, augmented_target = transfo(pil_img, target)

for transformations that alter the target. That way, they will work nicely with our Dataloaders :+1:
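To make the two calling conventions above concrete, here is a minimal sketch of a transform that supports both. It is only an illustration, not docTR's implementation: the class name `RandomHorizontalFlipSketch`, the `"boxes"` target key, and the relative `(xmin, ymin, xmax, ymax)` box format are assumptions, and a plain list of pixel rows stands in for a PIL image or tensor so that the example has no dependencies.

```python
import random
from typing import Any, Dict, List, Optional, Tuple, Union

# Stand-in for a PIL image / tensor: a list of rows of pixel values (assumption).
Image = List[List[int]]
# Hypothetical target layout: {"boxes": [(xmin, ymin, xmax, ymax), ...]} in relative coords.
Target = Dict[str, Any]


class RandomHorizontalFlipSketch:
    """Hypothetical transform showing the image-only and image+target call styles."""

    def __init__(self, p: float = 0.5) -> None:
        self.p = p  # probability of applying the flip

    def __call__(
        self, img: Image, target: Optional[Target] = None
    ) -> Union[Image, Tuple[Image, Target]]:
        flip = random.random() < self.p
        # Mirror each row when flipping, otherwise return a copy of the input.
        out_img = [row[::-1] if flip else row[:] for row in img]
        if target is None:
            # Image-only convention: augmented_img = transfo(pil_img)
            return out_img

        # Image+target convention: the boxes must follow the image.
        out_target = dict(target)
        boxes = target.get("boxes", [])
        if flip:
            # Mirroring x in relative coordinates: x -> 1 - x, and min/max swap roles.
            boxes = [(1 - xmax, ymin, 1 - xmin, ymax) for xmin, ymin, xmax, ymax in boxes]
        out_target["boxes"] = list(boxes)
        return out_img, out_target


if __name__ == "__main__":
    transfo = RandomHorizontalFlipSketch(p=1.0)
    img = [[1, 2, 3], [4, 5, 6]]
    print(transfo(img))
    print(transfo(img, {"boxes": [(0.0, 0.0, 0.5, 1.0)]}))
```

Keeping `target` optional lets the same object be used for transformations that only change the image and for those that must also update the annotations, which is what a dataset or dataloader wrapper would need.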

felixdittrich92 commented 2 years ago

@frgfm is only @SiddhantBahuguna's PR left, or do you have other augmentations in mind? :)

frgfm commented 1 year ago

For now, I think the random perspective would be the last one required yeah :+1:

felixdittrich92 commented 1 year ago

Hi @SiddhantBahuguna, any chance you could finish your draft PR in the near future? :) Or should we leave it free for someone else to take?

SiddhantBahuguna commented 1 year ago

> Hi @SiddhantBahuguna, any chance you could finish your draft PR in the near future? :) Or should we leave it free for someone else to take?

Greetings @felixdittrich92, I am really sorry for the delay. I will definitely get it done this week :) Sorry again!

felixdittrich92 commented 1 year ago

Sounds nice :hugs: :+1:

felixdittrich92 commented 7 months ago

@SiddhantBahuguna Do you think you could finish your PR so that we can close this? :)

felixdittrich92 commented 4 months ago

@odulcy-mindee I think we can close this? I don't see any issues with different perspectives (detection models); your dataset already seems to capture this well. Or do you think we should add RandomPerspective as the last one for the moment?

odulcy-mindee commented 4 months ago

Yeah, this issue can be closed, that's fine