tlienart / Franklin.jl

(yet another) static site generator. Simple, customisable, fast, maths with KaTeX, code evaluation, optional pre-rendering, in Julia.
https://franklinjl.org
MIT License

Jekyll/JuDoc compatibility comments #201

Closed cormullion closed 5 years ago

cormullion commented 5 years ago

Hi Thibaut! Out of curiosity I imported the official Julia language blog (https://julialang.org/blog/), a Jekyll-oriented blog, into JuDoc. You never know, one day they may go full Julia... :)

[Screenshot 2019-08-30 at 10 01 15]

I needed to edit 14 of the 94 Markdown files to get them accepted by the JuDoc server and on to a local static site, and I thought you'd be interested to know what the differences were. (It's cool that JuDoc was very assiduous: it even spotted some Markdown errors which are currently still visible on the official site. :))

Obviously there are some general changes that are needed to convert any Jekyll blog post to JuDoc, but these would be obvious and easy to do, given an hour or two, a supply of coffee, and some regular expressions; for example, the slugs at the top of each file, pathnames to images, etc. So I'll mention here just the more interesting ones.

1 Dollar signs

In Jekyll you can write "each awardee was presented with certificate of accomplishment and a cash prize of $1000." but this triggers JuDoc's LaTeX knowledge. Perhaps unmatched dollar signs could be left in place, I don't know...

People use dollar signs for shell prompts a lot too:

> $ julia 01-speech-blstm.jl

which all needed fixing.
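
For the currency case, a pre-processing pass over the Markdown might look like the sketch below. It assumes the target parser accepts `\$` as an escaped dollar sign, which nothing in this thread confirms, and `escape_currency_dollars` is a made-up helper name, not a JuDoc function.

```julia
# Hypothetical helper (not part of JuDoc): escape "$" when it is directly
# followed by a digit, as in "$1000", so it is not read as a maths delimiter.
# Assumption: the target parser treats "\$" as a literal dollar sign.
function escape_currency_dollars(md::AbstractString)
    # (?<!\\) skips dollars that are already escaped;
    # (?=\d) only matches when a digit comes next.
    return replace(md, r"(?<!\\)\$(?=\d)" => "\\\$")
end

sample = "a cash prize of \$1000."
println(escape_currency_dollars(sample))
```

Maths that starts with a digit, such as `$2x$`, would be escaped as well, so the output still needs a look by hand. Shell prompts are probably better moved into fenced code blocks than escaped.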

2 Double backticks

E.g. in this post here, these all need converting to three backticks. The original Markdown spec uses two backticks, I think.

3 Math issues

I managed to fix a number of math issues (surprising myself considering I rarely use LaTeX, MathJax, or KaTeX). Perhaps these could be listed somewhere, I don't know. For example, in the same post:

2. For \\(i = 1,\ldots,n\\), evaluate \\(f_i(x) \\) and update corresponding model \\( m_i \\).

triggered the "command used before defined" error for ldots, which I fixed by replacing the `\\(` and `\\)` delimiters with $. The same happens with many other backslash commands, such as alpha.
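
That manual fix can be sketched as a small Julia function. It assumes the Jekyll source writes inline maths as `\\(` and `\\)` (two backslashes, then a parenthesis), as in the line quoted above; `inline_math_to_dollars` is a hypothetical name.

```julia
# Hypothetical helper: turn Jekyll-style \\( ... \\) inline maths into $ ... $.
function inline_math_to_dollars(md::AbstractString)
    # In the regex, each "\\" is one literal backslash and "\(" / "\)" are
    # literal parentheses, so this matches exactly "\\(" or "\\)".
    return replace(md, r"\\\\\(|\\\\\)" => "\$")
end

# raw"..." keeps the backslashes and dollar signs exactly as typed.
sample = raw"For \\(i = 1,\ldots,n\\), evaluate \\(f_i(x) \\)."
println(inline_math_to_dollars(sample))
```

The pattern does not look at context, so delimiters inside code blocks would be rewritten too.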

The following popular syntax caused errors, and I enclosed the [[ ]] in $s to fix them.

by application of sigmoid functions. Specifically,  [[ML(x)=\sigma(W_{3}\cdot\sigma(W_{2}\cdot\sigma(W_{1}\cdot x)))]] is a three-layer deep

Sometimes it required only a terminating $ to fix (which annoyingly contradicts the dollar sign comment above! :) ):

$[[\text{rabbits tomorrow} = \text{Model}(\text{rabbits today}).]]

4 Julia code

Everyone used 4 space indenting to format their Julia code. This usually works, but not always. For example, I could only get Prof Johnson's post through JuDoc by converting one of the code blocks

m = GSL_Minimizer()
gsl_minimizer_set!(m, sin, -1, -3, 1)
while gsl_minimizer_xmax(m) - gsl_minimizer_xmin(m) > 1e-6
    println("iterating at x = $(gsl_minimizer_x(m))")
    gsl_minimizer_iterate!(m)
end
println("found minimum $(gsl_minimizer_f(m)) at x = $(gsl_minimizer_x(m))")

to use the ```julia ... code ``` multiline syntax. Probably the dollar signs again?

I also had problems with (ironically) my post there, which has this:

julia> Base.REPLCompletions.latex_symbols["\\pi"]
"π"

which had to be converted to triple-backtick syntax too, for some reason (some interaction between previously defined commands perhaps).
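
Converting indented blocks to fenced ones could be automated along these lines. This is only a sketch: it treats every run of lines starting with four spaces as code, does not handle blank lines inside a block or indented list continuations, and `fence_indented_code` is a hypothetical name.

```julia
# Hypothetical helper: wrap runs of 4-space-indented lines in a fenced block.
# Simplification: any line starting with four spaces counts as code; blank
# lines inside a block and indented list items are not handled.
function fence_indented_code(md::AbstractString; lang::AbstractString="julia")
    fence = "`"^3                       # three backticks, built at run time
    out = String[]
    in_code = false
    for line in split(md, '\n')
        if startswith(line, "    ")
            if !in_code
                push!(out, fence * lang)   # open the fence with a language tag
                in_code = true
            end
            push!(out, line[5:end])        # drop the 4-space indent
        else
            if in_code
                push!(out, fence)          # close the fence
                in_code = false
            end
            push!(out, line)
        end
    end
    in_code && push!(out, fence)           # close a block that ends the file
    return join(out, '\n')
end

src = "Text\n\n    m = GSL_Minimizer()\n    gsl_minimizer_iterate!(m)\n\nMore text"
println(fence_indented_code(src))
```

Once the code sits inside a fence, the dollar signs in it should no longer be candidates for maths parsing, if that was indeed the cause.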

Stefan's piping post

The most challenging post to convert is Stefan's post about piping and shell escapes, which is understandable given that it's all about backticks and escapes... :) For example:

The `"Hello\n"` after the `readall` command is a returned value, whereas the `Hello` after the `run` command is printed output.

displays as:

[Screenshot 2019-08-30 at 10 38 33]

I don't think the \n should be literal here?

The post is probably worth going through, because I struggled to know what was being escaped rightly or wrongly at times!

Anyway, whether they're differences in Markdown syntax, possible tweaks for your parsing, or just good material for a tips/advice section in the documents, I hope some of this is useful.

tlienart commented 5 years ago

Oh wow, this is amazing, thanks so much! I was planning to do a bit of coding on this project this weekend so this couldn't have come at a better time. I'll try to fix things as well as I can, and if I can't, I'll open issues ;-) thanks a lot!

(Also, I haven't looked at everything in detail yet, so I may make further comments here if there are things requiring discussion or feedback)

tlienart commented 5 years ago

Ok, I've now gone through your post in a bit more detail and opened an issue (#202) to maybe help users do exactly what you've done 😄

A few comments / questions where feedback would be very welcome:

asdf

```bash
> $ blah blah
```

in text: > $ julia 01-speech-blstm.jl

gives

-----

<img width="465" alt="Screen Shot 2019-09-01 at 20 18 50" src="https://user-images.githubusercontent.com/10897531/64080523-bbb89200-ccf5-11e9-891a-e8f1d4037236.png">

-----

as intended I would think

* I think fenced code blocks should be preferred over four-space-indented code blocks, but I take your point that a lot of people use indentation on the assumption that all the code on the page is in the same language (presumably Julia). I think a migration tool could help with this; I'll also have a think about whether I could include this in the parser.
* Thanks for reporting the double backticks thing, I definitely did not think about it. Is this spec'd somewhere that you'd be aware of? (There are notoriously many specs for Markdown, though.) Is this always meant to be inline code? So in current JuDoc syntax it would just be single backticks?

Thanks again for this!

cormullion commented 5 years ago

Hi! Sorry, yes, I did have a bit of trouble understanding what/if the problems were sometimes myself, so some of them may be false alarms (I was quickly making edits to make the error messages disappear but because there's no explicit link between error messages and source line number it was tricky sometimes...) I think that that m = GSL_Minimizer() indented code block I mentioned was rejected because a previous paragraph had started a $ and it was still current in some way. (?)

I think the original Markdown spec from Mr Gruber uses double backticks to escape something with single backticks, and triple backticks would then be needed if you wanted to literally include stuff with two backticks, and so on to backtick infinity and beyond.

If you want some test documents, how about running two of Stefan's posts (https://github.com/JuliaLang/www.julialang.org/blob/master/blog/_posts/2013-04-08-put-this-in-your-pipe.md and https://github.com/JuliaLang/www.julialang.org/blob/master/blog/_posts/2012-03-11-shelling-out-sucks.md) through a server session. Unpaired backslashes, dollar signs, and single tildes abound there... Probably a good test case. And he uses Ruby code as well as Julia, to make life more difficult :)

Last time I wrote a Markdown parser everything marked as code (indented or backticked) was immediately unavailable for any subsequent pattern matching. Is this still the case? - I haven't looked at Julia's Markdown support.

Here's another couple of oddities:

|   | Name |
| ------------- | ------------- |
|Hi | There |

doesn't form a table unless you insert a character such as a dot in the blank space:

| .  | Name |
| ------------- | ------------- |
|Hi | There |

And people are using these ampersand characters a lot - they're going to need escaping with ~~~, I suppose?

## &pi; in Julia

Not sure if this is a Julia Markdown vs Jekyll Markdown thing though.
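
If fencing turns out to be the way to go, the entities could be wrapped mechanically. The sketch below assumes that `~~~...~~~` passes its content through to the HTML untouched, as suggested above; `fence_html_entities` is a hypothetical name.

```julia
# Hypothetical helper: wrap HTML entities such as "&pi;" in ~~~ fences.
# Assumption: ~~~...~~~ hands its content to the HTML output unchanged.
function fence_html_entities(md::AbstractString)
    # named (&pi;), decimal (&#960;) and hexadecimal (&#x3C0;) entities
    entity = r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);"
    # the function on the right receives each matched entity as a string
    return replace(md, entity => m -> "~~~" * m * "~~~")
end

println(fence_html_entities("## &pi; in Julia"))
```

The pattern is context-free, so entities inside code blocks, or ones that are already fenced, would be wrapped again.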

I suspect a list of changes required to move from Jekyll or whatever to JuDoc would be easier to maintain at first; perhaps at a later date you could make something automated. (Machine learning? :))

tlienart commented 5 years ago

For the ampersand thing, I guess I could find all those that match something like this list or this one and fence them, I'll have a look. Maybe the easiest way would be to include a command \&... (so just precede it with a backslash) and it would preserve the expression and pass it to the HTML untouched. By the way, would you happen to know why people would use &pi instead of π? Is it just out of habit, or is it better to feed HTML with these &pi rather than the symbol because of font issues?

Thanks again for all the feedback!

cormullion commented 5 years ago

Not sure why people do things, really, probably just when something works you keep doing it that way... :)

I think people will generally be happiest if their usage of standard/Julia Markdown can be transferred without too many modifications, even if their Jekyll habits need to change. I'm not too knowledgeable about which feature is which, to be honest. :)

tlienart commented 5 years ago

Ok, good news, on the master branch (which is now very near v0.3 release) I believe I've fixed all issues mentioned here.

I've created a repository which has 5 more-or-less typical julia blog posts converted, @cormullion I would still be very happy to get extra feedback (maybe in a new issue); possibly you could start from a clone of that repository if you wanted to?

The results can be visualised here (note that this is using the sandbox template so the CSS is fairly bare-bones, but it doesn't really matter here)

A few notes

(these are also mentioned in the readme of the repo linked to above)

out_md = replace(in_md, r"\\\\\(|\\\\\)" => "\$")  # turn \\( and \\) into $

*[Metacharacter brittleness.](#Metacharacter+Brittleness)*

should become

*[Metacharacter brittleness.](#metacharacter_brittleness)*
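
Going by this single example, the rewrite lowercases the anchor and replaces `+` with `_`. A sketch under that assumption (the real rule for generating header anchors may differ, and `convert_anchor_links` is a hypothetical name):

```julia
# Hypothetical helper: rewrite in-page link targets such as
# "(#Metacharacter+Brittleness)" to "(#metacharacter_brittleness)".
# The rule (lowercase, "+" becomes "_") is inferred from one example only.
function convert_anchor_links(md::AbstractString)
    # match only the "](#anchor)" part so the link text keeps its case
    return replace(md, r"\]\(#[^)\s]+\)" => m -> lowercase(replace(m, "+" => "_")))
end

println(convert_anchor_links("*[Metacharacter brittleness.](#Metacharacter+Brittleness)*"))
```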

Other notes

All these notes will be added to the docs "soon™"

So I'll close this here, and @cormullion, if you're still up for some more testing, I'll be glad to hear from you 🙂. Maybe open a new issue, as I believe all issues here have been addressed (🙏) thanks a lot!

RoyiAvital commented 4 years ago

Could you add support for MathJax in addition to KaTeX?

KaTeX is really limited.

tlienart commented 4 years ago

Can you open another issue about this? Basically, Franklin will not directly support MathJax soon, but you can already use it yourself by adding the library in _layout.html/head.html (where KaTeX currently is) and then making sure that you escape the places where you write maths so that they're not picked up by Franklin's parser (it might otherwise work, but I've not tested this).

So basically having

Some markdown
~~~
$$x = {-b \pm \sqrt{b^2-4ac} \over 2a}.$$
~~~ 
Then something else and ~~~\(a \ne 0\)~~~

assuming you have <script type="text/javascript" src="/MathJax/MathJax.js"></script> in your head.html should just work (please kindly report).