Closed exnx closed 4 years ago
Hi @exnx, I tried a lot of ways to collect real / natural data, but I didn't get enough samples to try DeepFont, so I implemented things with a synthetic dataset. If you have a link to any open-source real / natural data, kindly share it and I'll give it a try.
@robinreni96 but Adobe provides some (limited) natural data for their test set, right?
DeepFont dataset: https://www.dropbox.com/sh/o320sowg790cxpe/AADDmdwQ08GbciWnaC20oAmna?dl=0
Yes @exnx. It's just a test set; they didn't release the train set. I even tried to contact the author for the dataset, but no luck.
They did release the train set. In their paper, they state that training has two steps. The first is unsupervised: a feature encoder is trained as part of an autoencoder, using both the synthetic data (released under BCF Format/VFR_syn_train) and the natural data. This step uses no labels. The second is supervised training for classification on the synthetic data only, reusing the previously trained encoder and throwing away the decoder.
Finally, they test on the synthetic data. The whole thing claims to be a domain adaptation technique; however, it did not work for me.
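For readers following the thread, the two-step recipe described in this comment can be sketched roughly as below. This is only a toy illustration, not the paper's implementation: the 2-D points stand in for image patches, a one-unit linear autoencoder stands in for the convolutional one, and every name (`make_samples`, `encode`, `recon_loss`, `accuracy`, the `shift` domain gap) is hypothetical. Keeping the encoder frozen in step 2 is also an assumption of this sketch.

```python
import math
import random

random.seed(0)

def make_samples(n, shift):
    """Toy 2-D 'images': class 0 near (-1, -1), class 1 near (+1, +1).
    `shift` is a made-up offset that mimics a synthetic/natural domain gap."""
    data = []
    for i in range(n):
        label = i % 2
        center = 1.0 if label else -1.0
        x = [center + shift + random.gauss(0, 0.3) for _ in range(2)]
        data.append((x, label))
    return data

synthetic = make_samples(200, shift=0.0)  # labels are used in step 2
natural = make_samples(200, shift=0.3)    # labels are never used for training

enc = [0.1, -0.1]  # encoder weights: 2-D input -> 1-D code
dec = [0.1, 0.1]   # decoder weights: 1-D code -> 2-D reconstruction
lr = 0.01

def encode(x):
    return enc[0] * x[0] + enc[1] * x[1]

def recon_loss(points):
    total = 0.0
    for x in points:
        h = encode(x)
        total += sum((dec[j] * h - x[j]) ** 2 for j in range(2))
    return total / len(points)

# Step 1: unsupervised autoencoder on synthetic + natural inputs, no labels.
unlabeled = [x for x, _ in synthetic + natural]
loss_before = recon_loss(unlabeled)
for _ in range(50):
    for x in unlabeled:
        h = encode(x)
        err = [dec[j] * h - x[j] for j in range(2)]
        grad_h = sum(2 * err[j] * dec[j] for j in range(2))
        for j in range(2):
            dec[j] -= lr * 2 * err[j] * h
            enc[j] -= lr * grad_h * x[j]
loss_after = recon_loss(unlabeled)

# Step 2: discard the decoder and train a logistic classifier on the
# frozen encoder's output, using the labeled synthetic data only.
w, b = 0.0, 0.0
for _ in range(50):
    for x, y in synthetic:
        h = encode(x)
        p = 1.0 / (1.0 + math.exp(-(w * h + b)))
        w -= lr * (p - y) * h
        b -= lr * (p - y)

def accuracy(data):
    hits = sum(((w * encode(x) + b) > 0) == (y == 1) for x, y in data)
    return hits / len(data)

print("reconstruction loss:", loss_before, "->", loss_after)
print("synthetic accuracy:", accuracy(synthetic))
print("natural accuracy:", accuracy(natural))
```

Comparing `accuracy(synthetic)` with `accuracy(natural)` is the kind of check discussed in this thread. The toy gap here is far milder than the one between rendered and photographed text, so the numbers say nothing about DeepFont itself.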
Sorry, I confused this with my other research work. Adobe claims the VFR dataset was used for the DeepFont work, but I don't know how only 4,384 labeled real-world images can be enough for this kind of system. I tried the same approach as you, but the result was not as expected. I read a blog post (I don't remember the author) claiming that Adobe uses additional real data for training to make things work. For the public, they released only a minimal version of it. I contacted the author about that but got no response, so I continued my work with the synthetic dataset.
I am eager to know where things are failing for you in this research.
Hi, I've tried to implement the DeepFont paper multiple times, but I'm unable to reproduce the results. I was curious whether you or anyone else was able to get the font classifier to work on real / natural data, as opposed to the synthetically generated data. I was able to get good accuracy on the synthetic data.