robinreni96 / Font_Recognition-DeepFont

It's an implementation of DeepFont: Identify Your Font from an Image, using Keras
MIT License
211 stars, 72 forks

Does it work on the natural image test set? #1

Closed (exnx closed this 4 years ago)

exnx commented 4 years ago

Hi, I've tried to implement the DeepFont paper multiple times but I'm unable to reproduce the results. I was curious whether you or anyone else was able to get the font classifier to work on real / natural data, as opposed to the synthetically generated data. I was able to get good accuracy on the synthetic data.

robinreni96 commented 4 years ago

Hi @exnx, I tried a lot of ways to collect real / natural data, but I didn't get enough of it to try DeepFont, so I implemented things with a synthetic dataset. If you have a link to any open-source real / natural data, kindly share it and let me have a try.

exnx commented 4 years ago

@robinreni96 but Adobe provides some (limited) natural data for their test set, right?

DeepFont dataset: https://www.dropbox.com/sh/o320sowg790cxpe/AADDmdwQ08GbciWnaC20oAmna?dl=0

robinreni96 commented 4 years ago

Yes @exnx. It's just a test set; they didn't release the train set. I even tried to contact the author for the dataset, but no luck.

exnx commented 4 years ago

They did release the train set. In their paper, they state that training takes two steps. The first is unsupervised feature learning with an autoencoder, using both the synthetic data (released under BCF Format/VFR_syn_train) and the natural data. This step uses no labels. In the second step, they do supervised training for classification on the synthetic data only, reusing the previously trained encoder and throwing away the decoder.

Finally, they test on the synthetic data. The whole thing is presented as a domain adaptation technique; however, it did not work for me.
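
For anyone following along, here is a rough Keras sketch of the two training steps described above. It is only an illustration under stated assumptions, not the paper's architecture or this repo's code: the layer sizes, `PATCH`, `NUM_FONTS`, and the helper names `build_encoder` / `build_decoder` are made up, and the arrays are random placeholders standing in for the synthetic and natural patches.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

PATCH = 32       # toy patch size (assumption, not the paper's value)
NUM_FONTS = 10   # toy number of font classes (assumption)

def build_encoder():
    # Hypothetical small convolutional encoder: 32x32x1 -> 8x8x32
    return keras.Sequential([
        keras.Input(shape=(PATCH, PATCH, 1)),
        layers.Conv2D(16, 3, strides=2, padding="same", activation="relu"),
        layers.Conv2D(32, 3, strides=2, padding="same", activation="relu"),
    ], name="encoder")

def build_decoder():
    # Hypothetical decoder mirroring the encoder: 8x8x32 -> 32x32x1
    return keras.Sequential([
        keras.Input(shape=(PATCH // 4, PATCH // 4, 32)),
        layers.Conv2DTranspose(16, 3, strides=2, padding="same", activation="relu"),
        layers.Conv2DTranspose(1, 3, strides=2, padding="same", activation="sigmoid"),
    ], name="decoder")

# Random placeholder data; real use would load the synthetic and natural patches.
rng = np.random.default_rng(0)
syn_x = rng.random((64, PATCH, PATCH, 1)).astype("float32")   # labeled synthetic
syn_y = rng.integers(0, NUM_FONTS, size=64)
real_x = rng.random((32, PATCH, PATCH, 1)).astype("float32")  # natural, labels unused

encoder = build_encoder()
decoder = build_decoder()

# Step 1: unsupervised reconstruction on synthetic + natural patches (no labels).
autoencoder = keras.Sequential([encoder, decoder], name="autoencoder")
autoencoder.compile(optimizer="adam", loss="mse")
unlabeled = np.concatenate([syn_x, real_x])
autoencoder.fit(unlabeled, unlabeled, epochs=1, batch_size=16, verbose=0)

# Step 2: drop the decoder, freeze the encoder, and train a classifier
# head on the labeled synthetic data only.
encoder.trainable = False
classifier = keras.Sequential([
    encoder,
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_FONTS, activation="softmax"),
], name="classifier")
classifier.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
classifier.fit(syn_x, syn_y, epochs=1, batch_size=16, verbose=0)

# Predict font probabilities for the natural patches.
probs = classifier.predict(real_x, verbose=0)
```

Freezing the encoder in step 2 is one reading of "leveraging the previously trained encoder"; fine-tuning it with a small learning rate would be another option to try.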

robinreni96 commented 4 years ago

Sorry, I confused this with my other research work. Adobe claims the VFR dataset is used for the DeepFont work, but I don't know how only 4,384 labeled real-world images are enough for this kind of system. I tried the same as you, but the result was not as expected. I read a blog (I don't remember the author) claiming that Adobe uses additional real data for training to make things work. For the public they released only a minimal version of it; I contacted the author about that but got no response. So I continued my work with the synthetic dataset.

I am eager to know where things are failing for you in this research.