Open axaygaid opened 4 years ago
Hi @axaygaid Thank you for bringing this to my attention.
Could you try replacing the following cell:
ds = IAMDataset("word", output_data="text")
plot_image_with_text(ds)
with
ds = IAMDataset("word", output_data="text", root="../../dataset/iamdataset")
plot_image_with_text(ds)
and see if it solves your issue?
Hi jonomon,
Thanks for the answer. It didn't work; I get the same issue. I think the trainset.txt file can't be downloaded, though I don't know why. Because of a security restriction I downloaded the different files manually and put them in the dataset/iamdataset folder, so I have the whole IAM dataset but, I think, not the other files such as trainset.txt?
You should place trainset.txt (as well as testset.txt, etc.) in data/iamdataset/subject.
Please let me know if this works.
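A quick way to confirm the files ended up where the loader looks is to check each one for existence and size. This is only a sketch: `check_split_files` is a hypothetical helper, not part of the repo, and the two file names are the ones mentioned in this thread (the "etc." files are not listed here, so add them yourself). A zero-byte file is reported separately because a parser has nothing to read from it.

```python
from pathlib import Path


def check_split_files(subject_dir, names=("trainset.txt", "testset.txt")):
    """Return a dict mapping each expected file name to 'missing', 'empty' or 'ok'."""
    subject_dir = Path(subject_dir)
    status = {}
    for name in names:
        path = subject_dir / name
        if not path.is_file():
            status[name] = "missing"      # not there at all (or not a regular file)
        elif path.stat().st_size == 0:
            status[name] = "empty"        # placeholder file with no content
        else:
            status[name] = "ok"
    return status
```

For example, `check_split_files("data/iamdataset/subject")` run from the repository root should report "ok" for both files once they are in place; adjust the path if your notebook uses a different root.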
But what kind of data do I have to put in the two txt files? I tried that before and the error was:
EmptyDataError: No columns to parse from file
So I have to put some data in them.
Thanks!
You should download the files here http://www.fki.inf.unibe.ch/DBs/iamDB/tasks/largeWriterIndependentTextLineRecognitionTask.zip
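If the zip has to be downloaded by hand (for example behind a restricted network), the text files inside it can be unpacked with the standard library. A minimal sketch, assuming the archive is already on disk; `extract_split_files` is a hypothetical helper and the flattening of any sub-folders inside the zip is my assumption about what the loader wants, not something confirmed in this thread.

```python
import zipfile
from pathlib import Path


def extract_split_files(zip_path, subject_dir):
    """Copy every .txt member of the archive into subject_dir, ignoring sub-folders."""
    subject_dir = Path(subject_dir)
    subject_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            if member.is_dir() or not member.filename.endswith(".txt"):
                continue
            # Keep only the base name so files land directly in subject_dir.
            target = subject_dir / Path(member.filename).name
            target.write_bytes(archive.read(member))
            extracted.append(target.name)
    return sorted(extracted)


# Example call (paths are illustrative):
# extract_split_files("largeWriterIndependentTextLineRecognitionTask.zip",
#                     "data/iamdataset/subject")
```

The function returns the names it wrote, so you can see at a glance whether trainset.txt and testset.txt were actually in the archive.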
Hey jonomon,
Thanks for the help, it was helpful. Now running test_ds = IAMDataset("form_original", train=False) "works", but when I try to plot an image I get nothing, as if nothing was read. And when I simply run len(ds) to check whether there is anything in it, it returns 0. I'm checking the source to see if something is missing in my setup, but if someone has any idea...
Thanks a lot!
Hi @axaygaid,
It is hard for me to debug the issue without any information. What are the contents of data/iamdataset?
Hi @jonomon
A simple example is that when I run:
ds = IAMDataset("word", output_data="text")
that gives this: <_io.TextIOWrapper name='/home/roo/sf_workspace/Image Médecine douce/handwritting model/handwritting notebook/ocr/utils/credentials.json' mode='r' encoding='UTF-8'>
so the path is correct,
and
len(ds)
the output is 0.
And in the iamdataset folder, os.listdir("/home/roo/sf_workspace/Image Médecine douce/handwritting model/handwritting notebook/dataset/IAMDataset/") gives:
['.ipynb_checkpoints', 'ascii.gz', 'forms.txt', 'formsA-D.tgz', 'formsE-H.tgz', 'formsI-Z.tgz', 'image_data-form_original-text0.plk', 'image_data-form_original-text1.plk', 'image_data-form_original-text2.plk', 'image_data-form_original-text3.plk', 'image_data-word-text0.plk', 'image_data-word-text1.plk', 'image_data-word-text2.plk', 'image_data-word-text3.plk', 'largeWriterIndependentTextLineRecognitionTask.zip', 'lines.tgz', 'lines.txt', 'sentences.tgz', 'sentences.txt', 'subject', 'untitled.txt', 'words.tgz', 'words.txt', 'xml', 'xml.tgz']
I don't know if it's clear now? ><
It seems like the contents are missing a bunch of folders. See the example here.
The IAMDataset class should automatically download the IAM dataset and process the files. Was there something wrong with that step?
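When the archives were fetched manually, one thing worth checking is whether each .tgz actually has an extracted folder next to it. The sketch below is an assumption-heavy illustration: both helpers are hypothetical, and naming the target folder after the archive (words.tgz to words/) is my guess, since the exact layout the class expects is not spelled out in this thread.

```python
import tarfile
from pathlib import Path


def archives_without_folder(root):
    """List .tgz archives in root that have no sibling folder named after them."""
    root = Path(root)
    return sorted(p.name for p in root.glob("*.tgz") if not (root / p.stem).is_dir())


def extract_archive(archive_path):
    """Extract one .tgz into a sibling folder named after the archive.

    Only use this on archives you trust, since extractall writes whatever
    paths the archive contains.
    """
    archive_path = Path(archive_path)
    target = archive_path.with_suffix("")   # e.g. words.tgz -> words
    target.mkdir(exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target)
    return target
```

For the listing posted above, and assuming `xml` and `subject` are folders, `archives_without_folder` would report every archive except xml.tgz, which matches the observation that folders are missing.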
I downloaded the different datasets (words, forms, etc.) manually because my network is protected and I can't directly download big files such as the IAM dataset; I think that's why the processing wasn't done. After the download I extracted every file and put it in a folder (all the .png of the forms in form...). I thought that would be enough.
That's the iamdataset folder... maybe I have to preprocess it myself if it doesn't work. I still have the same problem, meaning the pipeline doesn't recognize the different pictures :/
Thanks if you have an idea :)
So you are not using the IAMDataset? If that's the case, you would have to customise the Gluon Dataset to your dataset.
This documentation provides information for it https://mxnet.apache.org/api/python/docs/tutorials/packages/gluon/data/datasets.html
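For reference, a custom Gluon dataset only needs `__getitem__` and `__len__`. The sketch below is illustrative and not the repo's code: `FolderTextDataset` and its `labels` dict are hypothetical, it returns the image path instead of loading pixels, and it falls back to a plain Python class when MXNet is not installed so the example can still be run.

```python
from pathlib import Path

try:
    # Real base class when MXNet is installed.
    from mxnet.gluon.data import Dataset
except ImportError:
    # Fallback so the sketch can be read and run without MXNet.
    Dataset = object


class FolderTextDataset(Dataset):
    """Hypothetical dataset pairing each .png under image_dir with a transcription."""

    def __init__(self, image_dir, labels):
        # labels: dict mapping an image file stem (name without .png) to its text.
        # Images without a label are skipped.
        self._items = [
            (path, labels[path.stem])
            for path in sorted(Path(image_dir).rglob("*.png"))
            if path.stem in labels
        ]

    def __getitem__(self, idx):
        path, text = self._items[idx]
        # A real implementation would load the image here instead of returning its path.
        return str(path), text

    def __len__(self):
        return len(self._items)
```

With something like this, `len(ds)` returning 0 tells you directly that no image under `image_dir` matched a label, which is easier to debug than an empty plot.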
If anybody has run it on Google Colab, please share the edited iam_dataset.py with me: mahinqureship1@gmail.com
Please share the iam_dataset.py file with me. (to use in colab). jpremnath06@outlook.com
Hey @jonomon, first off, thanks a lot for this repo.
There seems to be an issue accessing the largeWriterIndependentTextLineRecognitionTask.zip file at http://www.fki.inf.unibe.ch/DBs/iamDB/tasks/largeWriterIndependentTextLineRecognitionTask.zip (it returns a 404 error).
Could you update the link in the code, or point us to the file so it can be downloaded manually?
Thanks, Sambbhav
Hi @sambbhavgarg, you can download it here: https://fki.tic.heia-fr.ch/static/zip/largeWriterIndependentTextLineRecognitionTask.zip
Regards, Jonathan
Hello guys, I think I have a simple problem: when I launch test_iam_dataset I get this error:
FileNotFoundError: [Errno 2] File /home/roo/sf_workspace/Image Médecine douce/handwritting model/handwritting notebook/ocr/utils/../../dataset/iamdataset/subject/trainset.txt does not exist: '/home/roo/sf_workspace/Image Médecine douce/handwritting model/handwritting notebook/ocr/utils/../../dataset/iamdataset/subject/trainset.txt'
I don't know what kind of file it is.
If someone has an idea, thanks a lot!