Open bmix opened 6 years ago
As an addition, it seems the conversion results in lots of HTML elements with the same id. The id attribute must be unique within the document. It may be better to use the class attribute.
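To make the duplicate id problem easy to reproduce, here is a minimal sketch that lists id values occurring more than once in an exported file. It uses only Python's standard html.parser module; the names IdCollector and duplicate_ids and the sample markup are made up for illustration and are not part of the add-on.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects the value of every id attribute seen in the markup."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser lower-cases names.
        for name, value in attrs:
            if name == "id" and value is not None:
                self.ids.append(value)


def duplicate_ids(markup):
    """Return a sorted list of id values that occur more than once."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical sample, not taken from a real export.
sample = '<div id="container">a</div><div id="container">b</div><p id="x">c</p>'
print(duplicate_ids(sample))
```

Running this against the exported file's text should show which ids would have to become classes or be made unique.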
Hi @bmix ,
Could you please test the new version, which you can download from here (GitHub)? It is enough to drop the py file into Anki's add-ons folder (it can be opened from Anki -> Tools -> Add-ons -> Open add-ons folder). Could you then comment here with your findings?
Thank you in advance Péter
Hello @peterborkuti , I find the following issues with the current revision:
<link type="text/css" rel="stylesheet" href="custom.css"> is not closed. It must be <link type="text/css" rel="stylesheet" href="custom.css"/>
div without a surrounding p or pre element.
Attribute value "container" of type ID must be unique within the document. It may be better to use a class attribute here (as long as you don't want to calculate unique IDs).
There is a mismatch between the SYSTEM and PUBLIC doctype.
The System Identifier http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd declares the XHTML 1.0 Transitional document type, but the associated Public Identifier -/W3C//DTD XHTML 1.0 Transitional//EN does not match this document type.
There are many places where element tags do not get closed. They are mainly container elements for text.
upon validation I also get this error message:
document type does not allow element "pre" here; missing one of "object", "applet", "map", "iframe", "button", "ins", "del" start-tag
which I do not really understand, since the <pre> element should be allowed at that place, according to my knowledge. It may be that the XML (since we produce XHTML, we deal with XML) gets mixed up here with the deck's contents. The reason is that my input file is a card deck that teaches XML and therefore has a lot of source code examples, which seem to get transported uncleanly (sometimes special chars like < are escaped, sometimes the tags are written literally). That may be an Anki issue, however.
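To illustrate what consistent escaping of the source code examples could look like, here is a small sketch using html.escape from Python's standard library. It assumes the text to embed is raw and not yet escaped, which may not match how Anki stores card fields; pre_block is a hypothetical helper, not a function of the add-on.

```python
from html import escape


def pre_block(source_text):
    """Wrap raw (unescaped) source text in a <pre> element, escaping it exactly once."""
    # quote=False escapes only &, < and >, which is enough for element content.
    return "<pre>" + escape(source_text, quote=False) + "</pre>"


# Hypothetical card content: an XML snippet that should be shown, not interpreted.
print(pre_block('<note lang="en">Tom & Jerry</note>'))
# prints: <pre>&lt;note lang="en"&gt;Tom &amp; Jerry&lt;/note&gt;</pre>
```

If the field text is already partly escaped, applying this would escape it a second time, so the sketch only helps once it is clear which form the exporter receives.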
I wanted to attach the two decks I used for this report (combined in a ZIP archive), but for some reason GitHub does not accept the archive. If you want, I will make them available via other means.
It seems that creating a clean XHTML export solution may become a major task. If you went for HTML5 only, it might be simpler, but I am not sure. My interest in your project was getting an XML export of the Anki deck (since XHTML is XML), but it may be more interesting to express the whole Anki deck dataset in its own XML dialect. I started defining one, but had to postpone it because of other projects I am working on.
Hi @bmix ,
I fixed the link closing issue and changed the doctype to HTML5. In my export there is no id attribute at all. I tried to find out how to tell which deck is being exported, so that I could set the document title accordingly, but I did not find any information and I have no idea how to do it.
Originally this was not my project: I took an existing Anki deck to HTML converter plugin and modified its source according to my needs.
I have now checked the source, and I understand that it takes the output of Anki's default deck to txt converter, does some search/replace on that output, and adds opening/closing HTML and some styles around the whole file.
To adapt it to your needs, I would have to understand how the txt converter works and/or how Anki stores decks. I am not using Anki at the moment, so updating this is not high on my priority list, but feel free to fork it, modify it, and/or send me pull requests.
Péter
Hi @bmix ,
I found an easy way to make the ids unique: I replaced every id with a randomized string. I hope it helps. Could you try it?
Thank you in advance Péter
Now I get an empty file and this:
Traceback (most recent call last):
File "aqt\exporting.py", line 116, in accept
File "anki\exporting.py", line 19, in exportInto
File "C:\Users\bmix\AppData\Roaming\Anki2\addons\Export_html_glossary.py", line 140, in doExport
out += '<div class="Question">\n' + esc(c.q()) + "\n</div>\n"
File "C:\Users\bmix\AppData\Roaming\Anki2\addons\Export_html_glossary.py", line 120, in esc
return convertSound(self.escapeText(randomizeId(s)))
File "C:\Users\bmix\AppData\Roaming\Anki2\addons\Export_html_glossary.py", line 131, in randomizeId
return re.sub(r' +id *= *[\'"]*([^ \'">]+)[\'"]*', getRandomId, s, 0, re.IGNORECASE)
File "re.py", line 151, in sub
File "C:\Users\bmix\AppData\Roaming\Anki2\addons\Export_html_glossary.py", line 128, in getRandomId
return ' id="' + ''.join([random.choice(string.ascii_letters + string.digits) for n in xrange(32)])+'"'
NameError: global name 'random' is not defined
Hi @bmix ,
Sorry, I forgot to commit the imports. It is fixed now. Unfortunately, the previous version did not raise an exception for me, probably because my cards don't contain ids, so please be patient: I could not test it. I hope it will work.
Thank you Péter
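For reference, the id randomization visible in the traceback can be sketched as a self-contained snippet with the imports it needs (random, re, string). This is a Python 3 adaptation, not the add-on's actual code: it uses range instead of xrange, and it starts each id with a letter, an added assumption so that the value is also a valid XML name. The names get_random_id and randomize_id only mirror the ones in the traceback.

```python
import random
import re
import string

# Same pattern as in the traceback: an id attribute with or without quotes.
ID_ATTR = re.compile(r' +id *= *[\'"]*([^ \'">]+)[\'"]*', re.IGNORECASE)


def get_random_id(match):
    """Build a replacement id attribute holding a random 32-character value."""
    # Assumption: start with a letter, because an XML ID may not begin with a digit.
    first = random.choice(string.ascii_letters)
    rest = "".join(random.choice(string.ascii_letters + string.digits) for _ in range(31))
    return ' id="' + first + rest + '"'


def randomize_id(s):
    """Replace every id attribute in s with a freshly generated random one."""
    return ID_ATTR.sub(get_random_id, s)


print(randomize_id('<div id="container">a</div><div id=container>b</div>'))
```

Note that any stylesheet rule or link that refers to the original id (for example #container) would stop matching after this replacement, which is one reason a class attribute may be the simpler fix.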
While the export happens in the XHTML namespace, the output is not valid! Since XHTML is XML, all attribute values must be quoted and all elements must be closed.
So, instead of:
it should read:
This should be dead easy to fix. The alternative would be to have the output be HTML4 or HTML5, which, however, would be a shame, since I have not yet found any Anki2XML export, and this is the closest to it. One can simply write an XSLT stylesheet, which would take the XHTML (it's XML!) and transform it into whatever format one wants.
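Whether an export meets the XML requirements described above (quoted attribute values, closed elements) can be checked with a short sketch based on Python's standard xml.etree.ElementTree. It tests well-formedness only and does not validate against the XHTML DTD; is_well_formed and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical fragments: the first closes the link element, the second does not.
good = '<div><link type="text/css" rel="stylesheet" href="custom.css"/></div>'
bad = '<div><link type="text/css" rel="stylesheet" href="custom.css"></div>'
print(is_well_formed(good), is_well_formed(bad))
# prints: True False
```

An export that passes this check can at least be loaded by an XML toolchain, which is the precondition for an XSLT transformation.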