jgm / djot.js

JavaScript implementation of djot
MIT License

Djot CLI: Can not convert stand-alone documents via Pandoc #34

Closed tbdalgaard closed 1 year ago

tbdalgaard commented 1 year ago

If one tries to convert a docx file or a stand-alone HTML document to Djot via Pandoc, it does not work at the moment. I presume that Djot can't handle the metadata blocks at the top of the HTML files or the ones stored somewhere in the docx container.

The interesting thing here is that Pandoc gladly converts from Djot to stand-alone HTML, but not the other way around.

jgm commented 1 year ago

When you say it "does not work," what do you mean? This gives me a djot document as output:

pandoc MANUAL.txt -t json -s | djot -f pandoc -t djot

Are you doing something different from that? The document won't contain metadata, because djot has no metadata yet.

tbdalgaard commented 1 year ago

Yes, I did something different. I tried the following:

I took a stand-alone HTML-document that I converted with Pandoc and ran:

pandoc myfile.html -f html -t json | djot -f pandoc -t djot > myfile.dj.txt

This gave some interesting errors.

If I convert from DOCX to Djot it fails too. I did:

pandoc myfile.docx -f docx -t json | djot -f pandoc -t djot > myfile.dj.txt

This produced errors too. I guess this is due to the fact that Djot cannot handle metadata which Pandoc supplies via the templates.

Sorry if I am stretching Djot too far, but I really like the syntax and wish to switch to it entirely from Markdown. I still like Markdown, I just like Djot much more.

jgm commented 1 year ago

This gave some interesting errors.

It's more helpful if you actually give the errors. I tried it and in this case it's

Error: TypeError: Cannot read properties of undefined (reading 'DefaultDelim')
TypeError: Cannot read properties of undefined (reading 'DefaultDelim')
    at PandocParser.fromPandocBlock (/opt/homebrew/lib/node_modules/@djot/djot/lib/pandoc.js:732:54)
    at /opt/homebrew/lib/node_modules/@djot/djot/lib/pandoc.js:605:33
    at Array.map (<anonymous>)
    at PandocParser.fromPandocBlock (/opt/homebrew/lib/node_modules/@djot/djot/lib/pandoc.js:604:43)
    at /opt/homebrew/lib/node_modules/@djot/djot/lib/pandoc.js:850:25
    at Array.map (<anonymous>)
    at PandocParser.fromPandoc (/opt/homebrew/lib/node_modules/@djot/djot/lib/pandoc.js:849:34)
    at fromPandoc (/opt/homebrew/lib/node_modules/@djot/djot/lib/pandoc.js:863:38)
    at Object.<anonymous> (/opt/homebrew/lib/node_modules/@djot/djot/lib/cli.js:175:43)
    at Module._compile (node:internal/modules/cjs/loader:1218:14)

Is that what you saw? That points to a bug in djot's pandoc conversion, which should be rather easy to fix.

I had no trouble converting from docx, but if you did stumble across an error, please report it with full details and ideally a minimal docx we can use to reproduce it.

None of this has to do with metadata, I think.
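
To narrow down which construct triggers the `DefaultDelim` error, one option is to inspect Pandoc's JSON output directly. The sketch below is not part of djot.js. It assumes the usual Pandoc JSON layout, in which an ordered list looks like `{"t":"OrderedList","c":[[start, {"t": style}, {"t": delim}], items]}`, and the helper name `findDefaultOrderedLists` is made up for illustration. It reports ordered lists whose style or delimiter is left at Pandoc's default value, which, as far as I know, is what the HTML reader emits for a plain `<ol>`.

```javascript
// Hypothetical helper, not part of djot.js: walk a Pandoc JSON AST and
// collect the attributes of ordered lists that use default values.
function findDefaultOrderedLists(node, found = []) {
  if (Array.isArray(node)) {
    for (const child of node) findDefaultOrderedLists(child, found);
  } else if (node !== null && typeof node === "object") {
    if (node.t === "OrderedList") {
      // Assumed layout: c[0] is [start, {t: style}, {t: delim}].
      const [start, style, delim] = node.c[0];
      if (style.t === "DefaultStyle" || delim.t === "DefaultDelim") {
        found.push({ start, style: style.t, delim: delim.t });
      }
    }
    // Keep descending so nested lists are found as well.
    for (const value of Object.values(node)) findDefaultOrderedLists(value, found);
  }
  return found;
}

// A tiny hand-written document in Pandoc's JSON shape (assumed layout).
const sampleDoc = {
  "pandoc-api-version": [1, 23],
  meta: {},
  blocks: [
    {
      t: "OrderedList",
      c: [
        [1, { t: "DefaultStyle" }, { t: "DefaultDelim" }],
        [[{ t: "Plain", c: [{ t: "Str", c: "one" }] }]],
      ],
    },
  ],
};

console.log(findDefaultOrderedLists(sampleDoc));
```

With a real file, the input could be the output of `pandoc myfile.html -t json`, parsed with `JSON.parse`, instead of `sampleDoc`.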

tbdalgaard commented 1 year ago

Sorry. Here are the steps with results from Terminal pasted.


Converting from Djot to stand-alone HTML works. I ran:
djot vers1.dj.txt -t pandoc | pandoc -f json -t html -s -o vers1.html

Converting this stand-alone HTML document back to Djot with:
pandoc vers1.html -t json | djot -f pandoc -t djot > vers2.dj.txt

gave this error:
Error: TypeError: Cannot read properties of undefined (reading 'replace')
TypeError: Cannot read properties of undefined (reading 'replace')
    at formatNumber (/usr/local/lib/node_modules/djot-js/lib/djot-renderer.js:96:32)
    at ordered_list (/usr/local/lib/node_modules/djot-js/lib/djot-renderer.js:252:34)
    at DjotRenderer.renderNode (/usr/local/lib/node_modules/djot-js/lib/djot-renderer.js:623:13)
    at DjotRenderer.renderChildren (/usr/local/lib/node_modules/djot-js/lib/djot-renderer.js:609:18)
    at section (/usr/local/lib/node_modules/djot-js/lib/djot-renderer.js:200:22)
    at DjotRenderer.renderNode (/usr/local/lib/node_modules/djot-js/lib/djot-renderer.js:623:13)
    at DjotRenderer.renderChildren (/usr/local/lib/node_modules/djot-js/lib/djot-renderer.js:609:18)
    at section (/usr/local/lib/node_modules/djot-js/lib/djot-renderer.js:200:22)
    at DjotRenderer.renderNode (/usr/local/lib/node_modules/djot-js/lib/djot-renderer.js:623:13)
    at DjotRenderer.renderChildren (/usr/local/lib/node_modules/djot-js/lib/djot-renderer.js:609:18)

If I do the same thing with DOCX instead of HTML, converting from Djot to DOCX works fine, but converting back to Djot gives the same errors as shown above.

I wonder if I have missed updating something here, or if it is related to the bug you mentioned.
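
The `reading 'replace'` failure inside `formatNumber` is what one would expect if the renderer is handed an ordered list without a usable style string, for example when a lookup keyed on Pandoc's style and delimiter names has no entry for the default values. The sketch below only illustrates that failure mode and one defensive alternative; it is not taken from djot.js. The two tables, the function name `listMarker`, and the choice of `1.` as the fallback are all assumptions.

```javascript
// Hypothetical mapping from Pandoc list attributes to a djot-style marker.
// "DefaultStyle", "Example" and "DefaultDelim" are deliberately absent, so
// a plain lookup would return undefined for them.
const STYLE_TO_NUMBER = {
  Decimal: "1",
  LowerAlpha: "a",
  UpperAlpha: "A",
  LowerRoman: "i",
  UpperRoman: "I",
};
const DELIM_TO_AFFIXES = {
  Period: ["", "."],
  OneParen: ["", ")"],
  TwoParens: ["(", ")"],
};

function listMarker(style, delim) {
  // Fall back to decimal numbering with a period when no entry exists.
  const number = STYLE_TO_NUMBER[style] ?? "1";
  const [open, close] = DELIM_TO_AFFIXES[delim] ?? ["", "."];
  return open + number + close;
}

console.log(listMarker("LowerRoman", "TwoParens")); // (i)
console.log(listMarker("DefaultStyle", "DefaultDelim")); // 1.
```

Without the `??` fallbacks, the second call would try to destructure `undefined` and throw a `TypeError` much like the ones reported in this thread.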

jgm commented 1 year ago

I suspect the fix I just pushed will take care of this issue, but it's hard to tell unless you can give us a file that suffices to reproduce it.

If you have a git clone of this repository you can try

$ git pull
$ npm install -g .

and see if that helps.

tbdalgaard commented 1 year ago

That did not fix it. The source file is a test document written in Danish. Would you like an English version instead? The HTML document that gives the error was converted from Djot via Pandoc version 2.19.2. Could that be the problem here?

jgm commented 1 year ago

A Danish version would be just fine.

tbdalgaard commented 1 year ago

Great. I have attached this sample file.

jgm commented 1 year ago

This converted without errors using

pandoc ~/Downloads/vers1.html -t json -s | djot -f pandoc -t djot

If it doesn't work for you, then probably you aren't running the development version of djot.

tbdalgaard commented 1 year ago

OK, I got Djot updated, and the document converted. Nice!

While reading the converted document I noticed that the result shows both attributes, like this:

{#top}
{#djot-test}
# Djot test

Why does the vers2 document include both attributes?

What could be wrong with the command above?

jgm commented 1 year ago

I'm guessing you have a <section> with an id, and the first element of the section is a heading like <h2> with an id. Both are given here, but the second will take precedence. Sections are implicit in djot.
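
One way to check that guess against a real file is to look for the pattern in Pandoc's JSON output. The sketch below rests on assumptions rather than on anything confirmed in this thread: that Pandoc's HTML reader turns `<section id="...">` into a `Div` block, that a `Div` is `{"t":"Div","c":[[id, classes, keyvals], blocks]}`, and that a heading is `{"t":"Header","c":[level, [id, classes, keyvals], inlines]}`. The helper name `findDoubleIds` is hypothetical.

```javascript
// Hypothetical helper: report Div blocks that carry an id and whose first
// child is a Header with its own id (the "{#top} {#djot-test}" pattern).
function findDoubleIds(blocks, found = []) {
  for (const block of blocks) {
    if (block.t !== "Div") continue;
    // Assumed layout: c[0] is the attribute triple, c[1] the child blocks.
    const [[divId], children] = block.c;
    const first = children[0];
    if (divId && first && first.t === "Header" && first.c[1][0]) {
      found.push({ sectionId: divId, headingId: first.c[1][0] });
    }
    // Sections can nest, so look inside this Div as well.
    findDoubleIds(children, found);
  }
  return found;
}

// Hand-written blocks mimicking a section with an id around a heading with an id.
const sampleBlocks = [
  {
    t: "Div",
    c: [
      ["top", ["section"], []],
      [
        { t: "Header", c: [1, ["djot-test", [], []], [{ t: "Str", c: "Djot" }]] },
        { t: "Para", c: [{ t: "Str", c: "Text" }] },
      ],
    ],
  },
];

console.log(findDoubleIds(sampleBlocks));
```

This only descends into `Div` blocks, so a section nested inside a block quote or list item would be missed; that seemed acceptable for a quick diagnostic.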

tbdalgaard commented 1 year ago

Ah, I see. But how can I then make explicit link references to headings, if a conversion may add extra id information? I thought I did this by typing:

{#top}
# djot tests

But instead I got:

{#top}
{#djot-test}
# djot test

I have experimented with the --width option, but cannot get it to work. If I type:

-w,0

as the CLI help shows, I get:

Unknown option

instead of a converted document with the original line length.