Kohulan / DECIMER-Image_Transformer

DECIMER: Deep Learning for Chemical Image Recognition using Efficient-Net V2 + Transformer
MIT License

Backbone CNN Model #75

Closed: qbKhanh closed this issue 1 year ago

qbKhanh commented 1 year ago

Issue Type

Questions

Source

GitHub (source)

DECIMER Image Transformer Version

Not relevant

OS Platform and Distribution

Not relevant

Python version

Not relevant

Current Behaviour?

First, I want to thank you for your hard work on this awesome open-source project. I have a naive question: why did you choose EfficientNet as the backbone CNN model rather than ResNet or another architecture?

Thanks for reading! I hope you can reply soon.

Which images caused the issue? (This is mandatory for images related issues)

No response

Standalone code to reproduce the issue

Not relevant

Relevant log output

No response

Kohulan commented 1 year ago

@qbKhanh ,

Thank you for your interest in our research. We tested several convolutional neural networks (CNNs) in our experiments and found that EfficientNetV2 significantly improves the tool's performance. EfficientNetV2 also substantially speeds up training on Tensor Processing Units (TPUs). I hope this information helps.

Kind regards, -Kohulan

qbKhanh commented 1 year ago

@Kohulan, Thanks for your quick support. I have another question: how much data do you think is enough for fine-tuning?

Kohulan commented 1 year ago

@qbKhanh , I would recommend trial and error in this regard. As a starting point, I would recommend 1 million images.
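
One way to organize that kind of trial and error is a learning-curve sweep: fine-tune on progressively larger subsets and stop once the validation score no longer improves by a meaningful margin. The following is only an illustrative sketch and is not part of DECIMER. The names `subset_sizes`, `learning_curve`, `train_and_score`, and `min_gain` are hypothetical, and `train_and_score` stands in for whatever fine-tuning and evaluation routine is actually used.

```python
import random
from typing import Callable, List, Sequence, Tuple


def subset_sizes(total: int, start: int, factor: int = 2) -> List[int]:
    """Return a geometric schedule of subset sizes, ending at `total`."""
    if total <= 0 or start <= 0 or factor < 2:
        raise ValueError("total and start must be positive, factor must be >= 2")
    sizes = []
    n = start
    while n < total:
        sizes.append(n)
        n *= factor
    sizes.append(total)  # always finish with the full dataset
    return sizes


def learning_curve(
    data: Sequence,
    train_and_score: Callable[[Sequence], float],  # hypothetical: fine-tune, return validation score
    start: int,
    factor: int = 2,
    min_gain: float = 0.005,
    seed: int = 0,
) -> List[Tuple[int, float]]:
    """Fine-tune on growing subsets; stop when the score gain drops below `min_gain`."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # fixed seed so subsets are reproducible
    history: List[Tuple[int, float]] = []
    for n in subset_sizes(len(shuffled), start, factor):
        score = train_and_score(shuffled[:n])
        plateaued = bool(history) and (score - history[-1][1]) < min_gain
        history.append((n, score))
        if plateaued:
            break  # more data did not help enough at this step
    return history
```

Because each subset is a prefix of the same shuffled list, larger runs always contain the smaller ones, so differences in score reflect the added data rather than a different sample. The stopping threshold `min_gain` is an assumption and would need to be chosen to match the metric in use.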

qbKhanh commented 1 year ago

@Kohulan, Thank you, but 1 million images would be a big deal!