aedocw / epub2tts-edge

epub2tts-edge uses Microsoft Edge cloud-based TTS to create a full featured audiobook m4b from an epub or text file
GNU General Public License v3.0
95 stars 14 forks

UnicodeEncodeError triggered by special characters #20

Closed brokenbulb8 closed 6 months ago

brokenbulb8 commented 6 months ago

Error created while converting .epub to .txt. Is there a fast way of detecting such characters and removing these errors? Is Calibre the best app for fixing issues like this?

Namespace(sourcefile='C:\Users\user\test.epub', speaker='en-US-AndrewNeural', cover=None)
Cover image saved to C:\Users\user\test.png
Exporting C:\Users\user\test.txt
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\user\.conda\envs\P311\Scripts\epub2tts-edge.exe\__main__.py", line 7, in <module>
  File "C:\Users\user\.conda\envs\P311\Lib\site-packages\epub2tts_edge\epub2tts_edge.py", line 373, in main
    export(book, args.sourcefile)
  File "C:\Users\user\.conda\envs\P311\Lib\site-packages\epub2tts_edge\epub2tts_edge.py", line 129, in export
    file.write(f"{clean}\n\n")
  File "C:\Users\user\.conda\envs\P311\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\u2015' in position 92: character maps to <undefined>

Thank you for making this software though.
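On the question of detecting such characters quickly: the traceback shows the text being encoded with cp1252, so one rough approach is to scan the text for characters that codec cannot represent. The sketch below is only an illustration; `find_unencodable` and the sample string are hypothetical and not part of epub2tts-edge.

```python
def find_unencodable(text, encoding="cp1252"):
    """Return (position, character) pairs that `encoding` cannot represent.

    Hypothetical helper for illustration; not part of epub2tts-edge.
    """
    problems = []
    for pos, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            problems.append((pos, ch))
    return problems


# Made-up sample containing U+2015 (HORIZONTAL BAR), the character from the traceback.
sample = "He paused \u2015 then spoke."

for pos, ch in find_unencodable(sample):
    # ascii() keeps the report printable even on a cp1252 console.
    print(f"position {pos}: U+{ord(ch):04X} {ascii(ch)}")
```

Running the same scan over the extracted book text would list every character that would trip the cp1252 writer, which can help decide whether to replace them or switch the output encoding.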

aedocw commented 6 months ago

I have not run into any epubs that trigger this yet, but I sometimes clean up the epub first with Calibre, which might be worth trying in your case. Here's what I do:

Let me know if that solves the issue. I have been reluctant to build in similar functionality to epub2tts-edge because it can turn into a real rabbit hole trying to decide how many changes to make.

brokenbulb8 commented 6 months ago

Thanks. Worked. I only converted utf-8 to utf-8, that's why it didn't work.
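For anyone who hits the same traceback: the failure happens inside `encodings\cp1252.py`, which suggests the output file was opened with the Windows default encoding rather than UTF-8. A minimal sketch of the general workaround, assuming the text is written through Python's built-in `open()`, is to pass `encoding="utf-8"` explicitly. The helper name `write_text_utf8` and the temporary path are hypothetical; this is not the project's actual code.

```python
import os
import tempfile


def write_text_utf8(path, text):
    """Write text as UTF-8 regardless of the platform's default encoding.

    Hypothetical helper for illustration; not part of epub2tts-edge.
    """
    with open(path, "w", encoding="utf-8") as f:
        f.write(f"{text}\n\n")


# U+2015 cannot be encoded as cp1252, but UTF-8 handles it.
sample = "He paused \u2015 then spoke."
out_path = os.path.join(tempfile.mkdtemp(), "test.txt")
write_text_utf8(out_path, sample)

with open(out_path, encoding="utf-8") as f:
    round_trip = f.read()
```

Without touching any code, Python's UTF-8 mode (setting the environment variable `PYTHONUTF8=1`, or running with `-X utf8`) changes the default encoding of `open()` to UTF-8 and may avoid the error as well.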