Closed judepayne closed 2 weeks ago
ping @krish777 @eddielao
ping @krish777 @eddielao
Eddie, please re-clone the project and re-run clj -X:test
and report any errors that remain on Windows. thx!
btw, hope this works :) The parse part of dictim is not for the faint hearted, being based on this EBNF grammar which I had to reverse engineer from the d2 docs. I try to avoid changing it!
["(* high level structure *)"
" <D2> = elements"
" elements = break* (element break)* element? | break* | element"
" <contained> = (element break)* element? | break* | element"
" <element> = list | elem"
""
" <elem> = classes | vars | ctr | attr | comment | conn | conn-ref"
""
" (* lists and comments *)"
" list = (elem <semi>+)+ elem <semi>*"
" comment = <s> <hash> lbl"
""
" (* containers - including shapes *)"
" ctr = (<s> ctr-key colon-label? (<curlyo> <break?> contained <s> <curlyc>)?) | composite-ctr"
" <composite-ctr> = <s> ctr-key <period> composite-ctr-attr"
" composite-ctr-attr = std-attr "
" ctr-key = !classes-lit !hash !vars-lit (ctr-key-part <period>)* ctr-key-part"
" ctr-key-part = !d2-keyword #'((?!(?:<-|->|<->|--))[^.;:\\r\\n{}*&<>])+'"
""
" (* vars, classes and attrs *)"
" (* vars *)"
" vars = <s> vars-lit <s> <colon> <s> vars-content"
" <vars-content> = <curlyo> <break?> the-vars <s> <break>? <s> <curlyc>"
" the-vars = (var <break>)* var"
" var = <s> ctr-key <colon> <s> (var-val | vars-content)"
" var-val = #'[^*{};\\r\\n]+' | inner-list"
""
" (* classes *)"
" classes = <s> classes-lit <s> <colon> <s> <curlyo> <at-sep*>"
" (class <at-sep*>)* <curlyc>"
" class = <s> ctr-key <s> <colon> <s> attrs"
""
" (* attrs *)"
" <at-sep> = break | semi"
" attr = std-attr | commented-attr"
" <std-attr> = (<s> attr-key <s> <colon> <s> (attr-val | attr-label? attrs))"
" commented-attr = <s> <hash> std-attr"
" attrs = <curlyo> <at-sep*> (attr <at-sep+>)* attr <at-sep*> <s> <curlyc>"
" attr-key = ((attr-key-part <period>)* (attr-key-last <period>)* attr-key-last) | globs"
" attr-key-part = &glob #'((?!(?:vars))[^.;:\\r\\n{}()])+'"
" attr-key-last = d2-keyword | amp d2-keyword"
" attr-label = label (* i.e. lbl, block or typescript *)"
" <item> = #'[^;\\[\\]\\r\\n]+'"
" inner-list = <'['> (item <semi> <s>)* item <']'>"
" <av> = !null #'[^*{};\\r\\n]+'"
" attr-val = av | substitution | <s> null <s> | inner-list"
""
" (* labels *)"
" <colon-label> = (<colon> <s> | <colon> <s> label)"
" label = lbl | block | typescript"
" <lbl> = (<s> null <s>) | normal-label | substitution"
" <normal-label> = !null #'[^;|{}\\r\\n]+'"
" <substitution> = <s> #'^(?!^\\s*$)([^;${\r"
"]*\\$\\{[^}]+\\})+[^{}\\r\\n;|]*'"
" block = <s> pipe #'[^|]+' pipe <s>"
" typescript = <s> ts-open #'[\\s\\S]+?(?=\\|\\|\\||`\\|)' ts-close <s>"
""
" (* connections *)"
" conn = <s> (conn-key dir)+ <s> conn-key colon-label? attrs?"
" conn-key = (conn-key-part <period>)* conn-key-part"
" conn-key-part = !d2-keyword #'((?!(?:<-|->|<->|--))[^.;:\\r\\n{}<>&()])+' (* greedy regex *)"
" dir = <contd?> <s> direction"
" contd = #'--\\\\"
"' | #'--\\\\\r\\n'"
" <direction> = '--' | '->' | '<-' | '<->'"
""
" (* conn-refs - a special type of connection *)"
" conn-ref = <s> conn-ref-key conn-ref-val"
" conn-ref-key = <'('> <s> crk <s> dir <s> crk <s> <')'> <'['> array-val <']'>"
" conn-ref-val = (conn-ref-attr-keys <s> <colon> <s> (attr-val | attrs) | <s> <colon> <s> null)"
" crk = #'((?!(?:<-|->|<->|--))[^.;:\\r\\n{}<>&()()])+'"
" conn-ref-attr-keys = (<period> d2-keyword)+"
" array-val = #'\\d' | globs"
""
" (* building blocks *)"
" <any> = #'.'"
" <d2-keyword> =('font-color'|'width'|'height'|'shape'|'class'|'grid-rows'|'tooltip'|'label'|'text-transform'|'stroke-width'|'vertical-gap'|'stroke-dash'|'shadow'|'3d'|'stroke'|'style'|'grid-columns'|'grid-gap'|'direction'|'fill'|'horizontal-gap'|'filled'|'font'|'animated'|'link'|'double-border'|'font-size'|'italic'|'fill-pattern'|'border-radius'|'source-arrowhead'|'target-arrowhead'|'comment'|'bold'|'underline'|'opacity'|'near'|'constraint'|'multiple'|'icon')"
" s = #' *'"
" globs = glob+"
" break = lr+"
" lr = #'[^\\S\\r\\n]*\\r?\\n'"
""
" (* literals *)"
" <single-quote> = '\\'' (* can't insert due to clojure/ instaparse escaping diffs *)"
" <double-quote> = '\\\"' (* ditto *)"
" <hyphen> = '-'"
" colon = ':'"
" <hash> = '#'"
" curlyc = '}'"
" null = 'null'"
" <r-arrow> = '>'"
" ts-open = '|||' | '|`'"
" <l-arrow> = '<'"
" vars-lit = 'vars'"
" pipe = '|'"
" classes-lit = 'classes'"
" <amp> = '&'"
" bracketc = ')'"
" ts-close = '|||' | '`|'"
" bracketo = '('"
" <period> = '.'"
" curlyo = '{'"
" <glob> = '*'"
" <semi> = ';'"
" "]
btw, hope this works :) The parse part of dictim is not for the faint hearted, being based on this EBNF grammar which I had to reverse engineer from the d2 docs. I try to avoid changing it!
["(* high level structure *)" " <D2> = elements" " elements = break* (element break)* element? | break* | element" " <contained> = (element break)* element? | break* | element" " <element> = list | elem" "" " <elem> = classes | vars | ctr | attr | comment | conn | conn-ref" "" " (* lists and comments *)" " list = (elem <semi>+)+ elem <semi>*" " comment = <s> <hash> lbl" "" " (* containers - including shapes *)" " ctr = (<s> ctr-key colon-label? (<curlyo> <break?> contained <s> <curlyc>)?) | composite-ctr" " <composite-ctr> = <s> ctr-key <period> composite-ctr-attr" " composite-ctr-attr = std-attr " " ctr-key = !classes-lit !hash !vars-lit (ctr-key-part <period>)* ctr-key-part" " ctr-key-part = !d2-keyword #'((?!(?:<-|->|<->|--))[^.;:\\r\\n{}*&<>])+'" "" " (* vars, classes and attrs *)" " (* vars *)" " vars = <s> vars-lit <s> <colon> <s> vars-content" " <vars-content> = <curlyo> <break?> the-vars <s> <break>? <s> <curlyc>" " the-vars = (var <break>)* var" " var = <s> ctr-key <colon> <s> (var-val | vars-content)" " var-val = #'[^*{};\\r\\n]+' | inner-list" "" " (* classes *)" " classes = <s> classes-lit <s> <colon> <s> <curlyo> <at-sep*>" " (class <at-sep*>)* <curlyc>" " class = <s> ctr-key <s> <colon> <s> attrs" "" " (* attrs *)" " <at-sep> = break | semi" " attr = std-attr | commented-attr" " <std-attr> = (<s> attr-key <s> <colon> <s> (attr-val | attr-label? attrs))" " commented-attr = <s> <hash> std-attr" " attrs = <curlyo> <at-sep*> (attr <at-sep+>)* attr <at-sep*> <s> <curlyc>" " attr-key = ((attr-key-part <period>)* (attr-key-last <period>)* attr-key-last) | globs" " attr-key-part = &glob #'((?!(?:vars))[^.;:\\r\\n{}()])+'" " attr-key-last = d2-keyword | amp d2-keyword" " attr-label = label (* i.e. lbl, block or typescript *)" " <item> = #'[^;\\[\\]\\r\\n]+'" " inner-list = <'['> (item <semi> <s>)* item <']'>" " <av> = !null #'[^*{};\\r\\n]+'" " attr-val = av | substitution | <s> null <s> | inner-list" "" " (* labels *)" " <colon-label> = (<colon> <s> | <colon> <s> label)" " label = lbl | block | typescript" " <lbl> = (<s> null <s>) | normal-label | substitution" " <normal-label> = !null #'[^;|{}\\r\\n]+'" " <substitution> = <s> #'^(?!^\\s*$)([^;${\r" "]*\\$\\{[^}]+\\})+[^{}\\r\\n;|]*'" " block = <s> pipe #'[^|]+' pipe <s>" " typescript = <s> ts-open #'[\\s\\S]+?(?=\\|\\|\\||`\\|)' ts-close <s>" "" " (* connections *)" " conn = <s> (conn-key dir)+ <s> conn-key colon-label? attrs?" " conn-key = (conn-key-part <period>)* conn-key-part" " conn-key-part = !d2-keyword #'((?!(?:<-|->|<->|--))[^.;:\\r\\n{}<>&()])+' (* greedy regex *)" " dir = <contd?> <s> direction" " contd = #'--\\\\" "' | #'--\\\\\r\\n'" " <direction> = '--' | '->' | '<-' | '<->'" "" " (* conn-refs - a special type of connection *)" " conn-ref = <s> conn-ref-key conn-ref-val" " conn-ref-key = <'('> <s> crk <s> dir <s> crk <s> <')'> <'['> array-val <']'>" " conn-ref-val = (conn-ref-attr-keys <s> <colon> <s> (attr-val | attrs) | <s> <colon> <s> null)" " crk = #'((?!(?:<-|->|<->|--))[^.;:\\r\\n{}<>&()()])+'" " conn-ref-attr-keys = (<period> d2-keyword)+" " array-val = #'\\d' | globs" "" " (* building blocks *)" " <any> = #'.'" " <d2-keyword> =('font-color'|'width'|'height'|'shape'|'class'|'grid-rows'|'tooltip'|'label'|'text-transform'|'stroke-width'|'vertical-gap'|'stroke-dash'|'shadow'|'3d'|'stroke'|'style'|'grid-columns'|'grid-gap'|'direction'|'fill'|'horizontal-gap'|'filled'|'font'|'animated'|'link'|'double-border'|'font-size'|'italic'|'fill-pattern'|'border-radius'|'source-arrowhead'|'target-arrowhead'|'comment'|'bold'|'underline'|'opacity'|'near'|'constraint'|'multiple'|'icon')" " s = #' *'" " globs = glob+" " break = lr+" " lr = #'[^\\S\\r\\n]*\\r?\\n'" "" " (* literals *)" " <single-quote> = '\\'' (* can't insert due to clojure/ instaparse escaping diffs *)" " <double-quote> = '\\\"' (* ditto *)" " <hyphen> = '-'" " colon = ':'" " <hash> = '#'" " curlyc = '}'" " null = 'null'" " <r-arrow> = '>'" " ts-open = '|||' | '|`'" " <l-arrow> = '<'" " vars-lit = 'vars'" " pipe = '|'" " classes-lit = 'classes'" " <amp> = '&'" " bracketc = ')'" " ts-close = '|||' | '`|'" " bracketo = '('" " <period> = '.'" " curlyo = '{'" " <glob> = '*'" " <semi> = ';'" " "]
I will test it after I get home from office this evening. I was thinking to test it with ? after r like line 218, to make \r
optional.
lr = #'[^\\S\\r\\n]*\\r?\\n'
So the same line of code can work for both Windows and Mac/Linux. Thank you @judepayne !
Hi @eddielao I saw your PR, but can you please verify whether my fix above also fixes the issue with Windows line endings, then I can think about which is better
Hi @eddielao I saw your PR, but can you please verify whether my fix above also fixes the issue with Windows line endings, then I can think about which is better
Hi @judepayne , I had to include your changes (my branch is with the latest main branch changes) because they fixed most of the failing tests except the one below.
Eddie@LZXTH1 MINGW64 ~/OneDrive/Dev/git/dictim (main) $ git log -2 commit f8b2ab18fccd5df06f1800ed4171f00b278a5ce8 (HEAD -> main, origin/main, origin/HEAD) Author: Jude Payne judepayne@yahoo.co.uk Date: Fri Aug 30 20:07:35 2024 +0100
untested fix for Windows line return
commit 93835345dc26ca9e25c5ee933fb80f4ad642ee65 Author: Jude Payne judepayne@yahoo.co.uk Date: Tue Aug 27 21:22:53 2024 +0100
tidy up of commented out attrs
Eddie@LZXTH1 MINGW64 ~/OneDrive/Dev/git/dictim (main) $ clj -X:test
Running tests in #{"test"}
Testing dictim.d2.compile-test
Testing dictim.d2.parse-test
FAIL in (textandcode) (parse_test.clj:254)
Text and Code
expected: (= dict (quote (["explanation" "|md\n # I can do headers\n - lists\n - lists\n\n And other normal markdown stuff\n|"] ["my_code" "|||ts\n declare function getSmallPet(): Fish | Bird;\n const works = (a > 1) || (b < 2)\n|||"] ["mycode" "|ts\n declare function getSmallPet(): Fish | Bird;\n const works = (a > 1) || (b < 2)\n
|"] ["amscd plugin" ["ex" "|tex\n\\begin{CD} B @>{\\text{very long label}}>> C S^{{\\mathcal{W}}\\Lambda}\\otimes T @>j>> T\\\\ @VVV V \\end{CD}\n|"]] ["multilines" ["ex" "|tex\n\\displaylines{x = a + b \\\\ y = b + c}\n\\sum{k=1}^{n} h{k} \\int{0}^{1} \\bigl(\\partial{k} f(x{k-1}+t h{k} e{k}) -\\partial{k} f(a)\\bigr) \\,dt\n|"]] ["title" "A winning strategy" {"shape" "text", "near" "top-center", "style" {"font-size" 55, "italic" true}}] ["poll the people" "->" "results"] ["results" "->" "unfavorable" "->" "poll the people"] ["results" "->" "favorable" "->" "will of the people"])))
actual: (not (= (["explanation" "|md\r\n # I can do headers\r\n - lists\r\n - lists\r\n\r\n And other normal markdown stuff\r\n|"] ["my_code" "|||ts\r\n declare function getSmallPet(): Fish | Bird;\r\n const works = (a > 1) || (b < 2)\r\n|||"] ["mycode" "|ts\r\n declare function getSmallPet(): Fish | Bird;\r\n const works = (a > 1) || (b < 2)\r\n
|"] ["amscd plugin" ["ex" "|tex\r\n\\begin{CD} B @>{\\text{very long label}}>> C S^{{\\mathcal{W}}\\Lambda}\\otimes T @>j>> T\\\\ @VVV V \\end{CD}\r\n|"]] ["multilines" ["ex" "|tex\r\n\\displaylines{x = a + b \\\\ y = b + c}\r\n\\sum{k=1}^{n} h{k} \\int{0}^{1} \\bigl(\\partial{k} f(x{k-1}+t h{k} e{k}) -\\partial{k} f(a)\\bigr) \\,dt\r\n|"]] ["title" "A winning strategy" {"shape" "text", "near" "top-center", "style" {"italic" true, "font-size" 55}}] ["poll the people" "->" "results"] ["results" "->" "unfavorable" "->" "poll the people"] ["results" "->" "favorable" "->" "will of the people"]) (["explanation" "|md\n # I can do headers\n - lists\n - lists\n\n And other normal markdown stuff\n|"] ["my_code" "|||ts\n declare function getSmallPet(): Fish | Bird;\n const works = (a > 1) || (b < 2)\n|||"] ["mycode" "|ts\n declare function getSmallPet(): Fish | Bird;\n const works = (a > 1) || (b < 2)\n
|"] ["amscd plugin" ["ex" "|tex\n\\begin{CD} B @>{\\text{very long label}}>> C S^{{\\mathcal{W}}\\Lambda}\\otimes T @>j>> T\\\\ @VVV V \\end{CD}\n|"]] ["multilines" ["ex" "|tex\n\\displaylines{x = a + b \\\\ y = b + c}\n\\sum{k=1}^{n} h{k} \\int{0}^{1} \\bigl(\\partial{k} f(x{k-1}+t h{k} e{k}) -\\partial{k} f(a)\\bigr) \\,dt\n|"]] ["title" "A winning strategy" {"shape" "text", "near" "top-center", "style" {"font-size" 55, "italic" true}}] ["poll the people" "->" "results"] ["results" "->" "unfavorable" "->" "poll the people"] ["results" "->" "favorable" "->" "will of the people"])))
Testing dictim.flat-test
Testing dictim.graphspec-test
Testing dictim.json-test
Testing dictim.template-test
Ran 83 tests containing 361 assertions. 1 failures, 0 errors. Execution error (ExceptionInfo) at cognitect.test-runner.api/test (api.clj:30). Test failures or errors occurred.
Full report at: C:\Users\Eddie\AppData\Local\Temp\clojure-8551768610252755755.edn
Eddie@LZXTH1 MINGW64 ~/OneDrive/Dev/git/dictim (main)
Thanks will look into.
@eddielao Please re-clone and run the tests again + let me know.
In d2, label can be in markdown form when surrounded by |md
and |
. Inside those blocks, it's standard markdown, including \n
or \r\n
on Windows. dictim does not attempt to parse these blocks, taking them verbatim. That's why the above test was failing. That test loads the d2 in test/dictim/d2/samples/textandcode.d2
(and parses it). On windows that file must have the \n
line endings replaced with \r\n
.
Correct fix
During development of the dictim parser (say we need to add new features because d2 adds new features) we're working at the repl and usually calling the parse-d2
function in the dictim.d2.parse
ns rather than the dictim
function. That's why I was keen to fix the \r
issues as much as possible in the parser (previous push) .. to support development on both Windows as well as other platforms.
However we do need your normalize-line-endings
fn where your PR put it - in the dictim function as well to capture this one case of markdown labels.
I will accept your PR. thank you.
Thank you, @judepayne ! I still have a lot to learn and really appreciate your detailed explanation!!
On Mac and Linux, line return is \n, whereas on Windows its \r\n
Dictim is not correctly accounting for Windows, and this is leading to a number of tests failing in the d2/parse-test namespace.
e.g.