priya-dwivedi / Deep-Learning

MIT License
3.35k stars 2.5k forks

handwriting_recognition data set format #36

Open parkjh688 opened 6 years ago

parkjh688 commented 6 years ago

Hi.

I'm trying to execute your handwriting_recognition/English_Writer_Identification.ipynb Jupyter notebook, but I think your data set is in a quite different format from the IAM data set.

Did you reorganize the data? If you didn't change the data, how can I run the notebook code properly with it?

priya-dwivedi commented 6 years ago

The data should consist of images, each paired with its writer. After that, the code should run fine. You may need to resize all images to the same dimensions before feeding them into the CNN.
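To illustrate the resizing step mentioned above, here is a minimal pure-Python sketch of nearest-neighbour resizing on a grayscale image stored as a list of rows. The function name `resize_nearest` and the list-of-rows representation are assumptions made for illustration only, not code from the notebook; in practice an image library would normally do this.

```python
def resize_nearest(image, new_height, new_width):
    """Resize a 2-D grid of pixel values with nearest-neighbour sampling.

    `image` is a non-empty list of equal-length rows (hypothetical
    representation chosen for this sketch).
    """
    old_height = len(image)
    old_width = len(image[0])
    resized = []
    for r in range(new_height):
        # Map each output row back to the source row it samples from.
        src_r = r * old_height // new_height
        row = []
        for c in range(new_width):
            # Same mapping for columns.
            src_c = c * old_width // new_width
            row.append(image[src_r][src_c])
        resized.append(row)
    return resized


# Two images of different sizes end up with the same 4x4 shape.
small = [[1, 2], [3, 4]]
large = [[v for v in range(8)] for _ in range(6)]
batch = [resize_nearest(img, 4, 4) for img in (small, large)]
```

Every image in `batch` now has the same height and width, which is what a CNN with a fixed input shape requires.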



parkjh688 commented 6 years ago

Thanks.

I'd like to ask my question in more detail. The data-loading part needs a directory named data_subset. I downloaded the IAM data from http://www.fki.inf.unibe.ch/databases/iam-handwriting-database, but I couldn't find a data_subset directory in it. Should I create that directory and organize the IAM data into it myself?

import glob
import os

# Create array of file names and corresponding target writer names
tmp = []
target_list = []
path_to_files = os.path.join('data_subset', '*')
for filename in sorted(glob.glob(path_to_files)):
    tmp.append(filename)
    # os.path.basename uses the platform's path separator, unlike split('/')
    image_name = os.path.basename(filename)
    file, ext = os.path.splitext(image_name)

priya-dwivedi commented 6 years ago

Yes, please make your own.
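One possible way to build such a directory is sketched below. It rests on assumptions not confirmed in this thread: that the downloaded images are `.png` files whose names begin with a two-part form ID (for example `a01-000u-...`), and that you already have a dictionary mapping form IDs to writer IDs. The names `build_data_subset`, `list_images_and_writers`, and `writer_by_form` are hypothetical.

```python
import glob
import os
import shutil


def form_id_of(path):
    """Return the assumed form ID: the first two hyphen-separated name parts."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return "-".join(stem.split("-")[:2])


def build_data_subset(source_dir, writer_by_form, target_dir="data_subset"):
    """Copy every .png under source_dir with a known writer into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    copied = []
    pattern = os.path.join(source_dir, "**", "*.png")
    for path in sorted(glob.glob(pattern, recursive=True)):
        if form_id_of(path) not in writer_by_form:
            continue  # skip images whose writer is unknown
        destination = os.path.join(target_dir, os.path.basename(path))
        shutil.copy(path, destination)
        copied.append(destination)
    return copied


def list_images_and_writers(target_dir, writer_by_form):
    """Return parallel lists: image paths and the writer ID for each image."""
    files, writers = [], []
    for path in sorted(glob.glob(os.path.join(target_dir, "*"))):
        files.append(path)
        writers.append(writer_by_form[form_id_of(path)])
    return files, writers
```

Calling `build_data_subset("path/to/iam_images", writer_by_form)` once would populate `data_subset`, after which `list_images_and_writers("data_subset", writer_by_form)` gives the images and writers in the paired form described earlier in the thread.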


parkjh688 commented 6 years ago

Thanks!

supersaiyan12 commented 6 years ago

Hi Divya,

I'm executing your handwriting_recognition/English_Writer_Identification.ipynb, but my dataset consists of images in which some letters are handwritten and the rest are computer-generated. Do I need to tweak the handwritten data, or do I need to make changes to the "data_subset" data?

emanuelevivoli commented 6 years ago

Hi @supersaiyan12, as you can see I'm not @priya-dwivedi, but maybe I can help you. What do you want to do with your dataset? Do you want to separate the handwritten images from the computer-generated ones?