Closed Hawley-Griffin closed 4 years ago
Hey,
So I have had a look into this and there is more than one issue going on:
The file is in a Unicode format that I had not seen before, so it needs to be decoded using that particular format.
Your lists use a non-standard list style, which means the items are not being parsed correctly at each level.
There are some line formatting issues where part of a line was being cut off.
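For the encoding issue, here is a minimal sketch of one possible fallback, not the fix that will actually ship. It assumes the Word-exported HTML is saved as Windows-1252 (cp1252), where byte 0xb7 is the middle dot often used for bullets; that is a guess based on the error message and has not been confirmed against the attached file. The names `CANDIDATE_ENCODINGS`, `decode_html_bytes` and `read_html_file` are hypothetical and do not exist in the addon.

```python
from pathlib import Path

# Hypothetical fallback order: strict UTF-8 first, then Windows-1252,
# which is an assumption about what Word used for this export.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252")


def decode_html_bytes(raw: bytes, encodings=CANDIDATE_ENCODINGS) -> str:
    """Return the text from the first candidate encoding that decodes cleanly."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            # This encoding does not fit the bytes; try the next one.
            continue
    # Last resort: keep going, replacing undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")


def read_html_file(file_path: str) -> str:
    """Read the file as raw bytes so that no codec is applied implicitly."""
    return decode_html_bytes(Path(file_path).read_bytes())
```

The resulting string could then be handed to `BeautifulSoup(text, 'html.parser')` in place of the open file object, so the decode no longer happens inside the implicit `read()` that raised the error.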
I will do my best to fix the issues with Word, as I do want to support it in the long term.
If you would like a workaround, I would suggest using LibreOffice instead. It is a free alternative to Word and outputs much more consistent HTML documents, so it is probably more stable.
Are multi-level lists from .docx files supported? If so, are there any formatting or other limitations? Or some other requirements?
Currently running into an error when I try to import into Anki.
Raw text of the file you tried to upload
Word and HTML files: Anki Addon (Lists to Anki) Error Report.zip
Error report from the popup
The error was 'utf-8' codec can't decode byte 0xb7 in position 53017: invalid start byte.
Error report:
Traceback (most recent call last):
  File "C:\Users\Hawley Griffin\AppData\Roaming\Anki2\addons21\1029306148\__init__.py", line 52, in importNewFile
    parseAndUploadOrgFile(filePath, embedded=True)
  File "C:\Users\Hawley Griffin\AppData\Roaming\Anki2\addons21\1029306148\org_to_anki\main.py", line 29, in parseAndUploadOrgFile
    _parseAndUpload(filePath, embedded)
  File "C:\Users\Hawley Griffin\AppData\Roaming\Anki2\addons21\1029306148\org_to_anki\main.py", line 45, in _parseAndUpload
    deck = parseData.parse(filePath)
  File "C:\Users\Hawley Griffin\AppData\Roaming\Anki2\addons21\1029306148\org_to_anki\org_parser\parseData.py", line 17, in parse
    formatedData = convertBulletPointsDocument(filePath)
  File "C:\Users\Hawley Griffin\AppData\Roaming\Anki2\addons21\1029306148\org_to_anki\converters\BulletPointHtmlConverter.py", line 27, in convertBulletPointsDocument
    documentType = checkDocumentType(filePath)
  File "C:\Users\Hawley Griffin\AppData\Roaming\Anki2\addons21\1029306148\org_to_anki\converters\BulletPointHtmlConverter.py", line 37, in checkDocumentType
    soup = BeautifulSoup(htmlFile, 'html.parser')
  File "lib\site-packages\bs4\__init__.py", line 245, in __init__
  File "C:\Program Files\Python36\lib\codecs.py", line 700, in read
  File "C:\Program Files\Python36\lib\codecs.py", line 503, in read
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb7 in position 53017: invalid start byte
What is your operating system
Windows 10
What was the original file type
Word File (.docx) converted to .html