ravidreams closed this issue 8 years ago
I think the issue arose because of Internet connectivity. When I reran do_ocr.py, all pages were converted successfully and the text was uploaded to ta.wikisource.
When Google cannot OCR a few of the pages, run the following command:
python create_dummy_files.py
This will create dummy text files for the PDF files that were left incomplete.
Then run the OCR script again to complete all the pending work:
python do_ocr.py
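For anyone wondering what the dummy-file step amounts to, here is a minimal sketch. It is not the actual create_dummy_files.py; it assumes each page is stored as page_NNNNN.pdf with its OCR text next to it as page_NNNNN.txt (the .pdf pairing and the function name create_placeholders are my assumptions for illustration).

```python
from pathlib import Path


def create_placeholders(work_dir, pdf_glob="page_*.pdf"):
    """Create an empty .txt file for every page PDF that has no OCR text yet.

    Hypothetical helper: the file layout (page_NNNNN.pdf next to
    page_NNNNN.txt) is an assumption, not taken from the real script.
    """
    created = []
    for pdf in sorted(Path(work_dir).glob(pdf_glob)):
        txt = pdf.with_suffix(".txt")
        if not txt.exists():
            txt.touch()  # empty placeholder, like `touch page_00001.txt`
            created.append(txt.name)
    return created


if __name__ == "__main__":
    # Report which placeholders were created in the current directory.
    for name in create_placeholders("."):
        print("created", name)
```

Running it a second time creates nothing, because existing text files are left untouched.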
Mediawiki_uploader.py does not run if there is a missing page.
I tried creating this page manually and also tried the following commands:
touch page_00001.txt
touch page_00001.upload
This is a recurring problem for many files. Google won't OCR these pages, and the process gets stuck when we try running do_ocr.py again.
Logged in to https://ta.wikisource.org
INFO:root:Checking for bot access rights
INFO:root:The user Ravidreamsbot has bot access.
INFO:root: Done. Uploaded all text files to wiki source
mv: cannot stat ‘all_textfor’: No such file or directory
mv: cannot stat ‘OCR_’: No such file or directory
mv: cannot stat ‘upload-*’: No such file or directory
mv: cannot stat ‘செந்தமிழ்ப்_பெட்டகம்-2.pdf’: No such file or directory
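"cannot stat" is what mv prints when a source path does not exist at the time it runs. As a rough sketch only (this is not how the project's cleanup step is written, and the name move_existing and its arguments are hypothetical), a cleanup step could move whatever matches and report the patterns that matched nothing instead of failing on them:

```python
import glob
import shutil
from pathlib import Path


def move_existing(patterns, dest_dir):
    """Move files matching each glob pattern into dest_dir.

    Returns (moved, missing): the source paths that were moved and the
    patterns that matched no file. Hypothetical helper for illustration.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved, missing = [], []
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        if not matches:
            missing.append(pattern)  # nothing to move for this pattern
            continue
        for src in matches:
            shutil.move(src, str(dest / Path(src).name))
            moved.append(src)
    return moved, missing
```

The list of unmatched patterns makes it easier to see whether a file was never produced or had already been moved by an earlier run.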