TuringLang / docs

Documentation and tutorials for the Turing language
https://turinglang.org/docs/
MIT License
231 stars 99 forks source link

Tutorials are largely outdated #86

Closed torfjelde closed 3 years ago

torfjelde commented 4 years ago

A lot of the tutorials are unfortunately outdated due to recent improvements in MCMCChains, e.g. https://github.com/TuringLang/Turing.jl/issues/1411.

I'm opening an issue here so as to bring attention to the, well, issue.

devmotion commented 4 years ago

I think maybe it would help to 1) add separate environments for different tutorials (so people can reproduce it even though it is based on an older version of some package) and 2) use CI tests and builds for the tutorials.

torfjelde commented 4 years ago

100%. But for (2), we likely have to move away from Jupyter notebooks, right? Which, honestly, would be it's own benefit.

torfjelde commented 4 years ago

And regarding (1), I would say that ideally we'll always have the tutorials work with the most recent release of Turing, since people will look to this for inspiration when writing their own code.

Maybe we should run Pkg.status() at the beginning or end of every notebook too, so that people can view the dependencies.

devmotion commented 4 years ago

Maybe we should run Pkg.status() at the beginning or end of every notebook too, so that people can view the dependencies.

That's how it is done in SciML: https://github.com/SciML/SciMLTutorials.jl An appendix is added automatically to every tutorial which shows the packages and Julia version that was used to run it

torfjelde commented 4 years ago

That's how it is done in SciML

Nice :+1: And they use jmd as the "base" file?

devmotion commented 4 years ago

Yes. And then notebook, HTML, and PDF files are generated with Weave.

torfjelde commented 4 years ago

Super-dope. Would it be a good idea to just try and duplicate their approach?

devmotion commented 4 years ago

Yes, I think that would be great. In particular also thinking about https://github.com/SciML/RebuildAction which we use over at SciML to rebuild tutorials automatically (together with CompatHelper for individual environments this helps to figure out any possible issues).

torfjelde commented 4 years ago

Yeah I just noticed that; awesome stuff! Cool, well I'm very down for doing that.

cpfiffer commented 4 years ago

All of this is a great idea. I hate the way the tutorials currently work -- they're difficult to write in, difficult to automate, and often out-of-date. Switching to a more general jmd approach is something I'm all for.

trappmartin commented 4 years ago

Super cool idea.

torfjelde commented 4 years ago

I've now updated all the tutorials to work with the new system (which is in a separate branch jmd-staging, which we'll merge into master when everything is ready for a full update).

Do have a look:) We also need an updated script to publish the tutorials. One approach could be to just generate the markdown from the jmd file and then use the current Jekyll setup. Maybe that's a good place to start?

devmotion commented 4 years ago

One approach could be to just generate the markdown from the jmd file and then use the current Jekyll setup. Maybe that's a good place to start?

Yes, I think that's what we should do (at least initially). IMO we should not update too many things in one step :slightly_smiling_face:

torfjelde commented 4 years ago

@cpfiffer I might have some questions for you about the documentation setup later, because right now I'm trying to run it locally but using a different branch of TuringTutorials, and I swear I'm loosing hair as I type. It for some reason refuses to render the the pages, despite the markdown files being available in _tutorials/ on the top-level (I messed around with the copy_tutorials method to do so). It's on the jmd-staging branch if you want to have a go yourself. You generate markdown just by doing

julia --project -e "using Pkg; Pkg.instantiate(); using TuringTutorials; TuringTutorials.weave_folder(\"introduction\", (:github, ))"

while in the project folder.

cpfiffer commented 4 years ago

Do the rendered markdown files have YAML headers? Because otherwise they won't render. The old tutorials code would handle this automatically:

https://github.com/TuringLang/TuringTutorials/blob/38257ab7036575c63113a5b5556c8cd5fd20cb31/weave-examples.jl#L27

torfjelde commented 4 years ago

Aaah, that might be it

torfjelde commented 4 years ago

Btw, a small update: I can get the tutorials to render but the formatting is completely broken. Honestly don't understand why, as the output is standard markdown.

devmotion commented 4 years ago

Do you have a reproducible example?

torfjelde commented 4 years ago

Well, it's quite annoying to replicate but here it is:

  1. ] dev TuringTutorials and checkout tor/jmd-weaving
  2. Go to ~/.julia/dev/TuringTutorials and run julia --project -e "using TuringTutorials; TuringTutorials.weave_folder("introduction", (:github, ))" for testing
  3. Change the md_path in copy_tutorial in Turing/docs/make-utils.jl to HOME/.julia/dev/TuringTutorials (you have to replace HOME manually)
  4. julia --project make.jl
  5. bundle exec jekyll serve
devmotion commented 4 years ago

Should it be tor/jmd-files? I can't find tor/jmd-weaving

torfjelde commented 4 years ago

Nah, tor/jmd-weaving should be there. Alternatively, you could try jmd-staging which should also work.

EDIT: I think you should use jmd-staging.

devmotion commented 4 years ago

I finally managed to obtain a local preview of the docs, and formatting seems to be fine? It took me a while to figure out that the new setup (or maybe just my weird way of changing make-utils.jl) leads to different paths, and I had to go to http://localhost:4000/dev/tutorials/01-introduction/ instead of just following the links to the first tutorial. So math rendering and text formatting seems fine, it seems there's just some problem with the images and gifs (they are copied to the docs folders but instead of displaying them some text is shown on the page).

devmotion commented 4 years ago

Regular code output works fine though.

torfjelde commented 4 years ago

Really? On my end it looks super-strange. Hmm, well, that's good news!

Yeah, I think we have to change where the figures are outputed.

torfjelde commented 4 years ago

Btw, I realized I had a mistake in the step-by-step guide above. You also need to make sure that the files are "lifted" to the top-level in _tutorials, i.e. copy_tutorial needs to be

function copy_tutorial(tutorial_path)
    isdir(tutorial_path) || mkpath(tutorial_path)
    # Clone TuringTurorials
    tmp_path = tempname()
    mkdir(tmp_path)
    # clone("https://github.com/TuringLang/TuringTutorials", tmp_path)

    # Move to markdown folder.
    md_path = joinpath("/home/tor/.julia/dev/TuringTutorials/", "markdown")

    # Copy the .md versions of all examples.
    try
        @debug(md_path)
        # recursively find files and copy to top-level in `tutorial_path`
        for (curr_dir, dirs, files) in walkdir(md_path)
            for file in files
                full_path = joinpath(md_path, curr_dir, file)
                target_path = joinpath(tutorial_path, file)
                println("Copying $full_path to $target_path")
                cp(full_path, target_path, force=true)
                if endswith(target_path, ".md")
                    # remove_yaml(target_path, "permalink")
                    fix_header_1(target_path)
                    fix_image_path(target_path)
                end
            end
        end
        index = joinpath(@__DIR__, "src/tutorials/index.md")
        cp(index, tutorial_path * "/index.md", force=true)
    catch e
        rethrow(e)
    finally
        rm(tmp_path, recursive=true)
    end
end

But now it works nicely for me to!

A couple of issues:

  1. Stuff like {julia; eval = false} at the beginning of code-blocks doesn't seem to be supported by Jekyll (it's apparently supported by Github though).
  2. Figures aren't referenced correctly (as @devmotion also mentioned)
  3. Naming convention is different. Personally I quite like the convention from SciMl, where you have folders representing different categories. But at the same time, just to get this done asap, it might be preferable to just change the weave-process to use a flattened structure.
torfjelde commented 4 years ago

"Fixed" (2) and (3) locally, but (1) I can't seem to be able to fix. Even tried using what's supposed to be GFM markdown parsers in Jekyll, but none of them actually handle {julia; eval = false} correctly.

torfjelde commented 4 years ago

Also there's a question of whether or not we really want to change the filenames and thus the links to the tutorials, given that there probably is a fair bit of links to the tutorials out in the wild.

devmotion commented 4 years ago

Figures aren't referenced correctly (as @devmotion also mentioned)

Also there's a question of whether or not we really want to change the filenames and thus the links to the tutorials, given that there probably is a fair bit of links to the tutorials out in the wild.

I guess probably the easiest might be to change the out_dir and set the fig_path arguments in the weave command.

torfjelde commented 4 years ago

I guess probably the easiest might be to change the out_dir and set the fig_path arguments in the weave co

Actually only need to change out_dir, and yeah, that's exactly what I did:)

dataning commented 4 years ago

Hi, I am just following the step-by-step in Bayes Logistic tutorial on the Turing.jl's website. But I run into an error in the prediction step. Does this have anything to do with the outdated tutorial or something I miss out in the set up? I'm using Julia 1.5.2 and the latest Turing.jl.

julia> predictions = prediction(test, chain, threshold)
ERROR: type AxisArray has no field value
Stacktrace:
 [1] getproperty(x::AxisArrays.AxisArray{Float64, 2, Matrix{Float64}, Tuple{AxisArrays.Axis{:iter, StepRange{Int64, Int64}}, AxisArrays.Axis{:chain, UnitRange{Int64}}}}, f::Symbol)
   @ Base ./Base.jl:33
 [2] prediction(x::Matrix{Float64}, chain::Chains{Float64, AxisArrays.AxisArray{Float64, 3, Array{Float64, 3}, Tuple{AxisArrays.Axis{:iter, StepRange{Int64, Int64}}, AxisArrays.Axis{:var, Vector{Symbol}}, AxisArrays.Axis{:chain, UnitRange{Int64}}}}, Missing, NamedTuple{(:parameters, :internals), Tuple{Vector{Symbol}, Vector{Symbol}}}, NamedTuple{(), Tuple{}}}, threshold::Float64)
   @ Main ./REPL[38]:3
 [3] top-level scope
   @ REPL[40]:1
devmotion commented 4 years ago

@torfjelde Any updates on this issue?

torfjelde commented 4 years ago

Sorry for the delay; submission deadline a couple of days ago.

Naming convention is different. Personally I quite like the convention from SciMl, where you have folders representing different categories. But at the same time, just to get this done asap, it might be preferable to just change the weave-process to use a flattened structure.

This should be the only issue to resolve I think. It's just a question of whether we want a flat structure, and revert the names to the old one, or if we also revamp the project structure in the process.

devmotion commented 4 years ago

IMO it would be nice to move away from the flat structure but we don't have to do this right away if it is more difficult. However, as far as I understand, it would be even simpler to not have a flat structure, and in this case I would suggest to change it as part of the updates.

As mentioned earlier, probably different webpages/Google/twitter posts/etc. still link to the tutorials using the flat structure, so it might be good to add manual redirects from the old locations for a while.

torfjelde commented 4 years ago

However, as far as I understand, it would be even simpler to not have a flat structure, and in this case I would suggest to change it as part of the updates.

Correct + we definitively need non-flat structure for the jmd files so that we can have a Project.toml file for the different tutorials. Right now we just "flatten" the structure in copy_tutorials, which is fine IMO. But then the "numbering" of the names doesn't make much sence, e.g. linear-regression/01.jl and hidden-markov-models/01.jl will both be moved to the same file, which of course can cause issues (stupid example, but you get the idea).

Maybe we should just have a "flat" structure in the sense that in TuringTutorials.jl we have tutorials/01-introduction/, tutorials/02-linear-regression/ and then we use the directory-names as the end of the URL? This would also allow further fine-tuning of dependencies.

As mentioned earlier, probably different webpages/Google/twitter posts/etc. still link to the tutorials using the flat structure, so it might be good to add manual redirects from the old locations for a while.

Great idea :+1: We should do this regardless, as I've done some renaming (which I think is justified, e.g. moving to 01 rather than 1 in the beginning of the tutorial name).

torfjelde commented 4 years ago

Stuff like {julia; eval = false} at the beginning of code-blocks doesn't seem to be supported by Jekyll (it's apparently supported by Github though).

Also, this is an issue. Ideally we'd have the same rendering as Github, but in the meantime I guess we just try to use regex to remove the braces from those?

cpfiffer commented 4 years ago

For the braces, can't we just have Weave convert .jmd => .md instead? I thought this would be a requirement to get the plots and stuff generated, since we need markdown files anyway for the website.

torfjelde commented 4 years ago

For the braces, can't we just have Weave convert

Seems like it's considered valid markdown because this is all talking about the markdown generated using Weave.jl :confused:

cpfiffer commented 4 years ago

Oh, my bad. I misunderstood.

devmotion commented 3 years ago

Gentle bump - any progress on this @torfjelde? Anything that we could help with?

torfjelde commented 3 years ago

Sorry, haven't made any progress on this, but the remaining bit is the above, i.e. I'm having trouble with the rendering of the output generated by Weave.jl. Also not particularly familiar with Jekyll, so I was trying out different rendering engines to see why Github manages to render it correctly while our engine can't. And I got super-confused because even the Github Flavoured Kramdown (https://kramdown.gettalong.org/parser/gfm.html) wouldn't even do it correctly :confused:

So really, any help with figuring out how to do this properly would be appreciated. I'll also have another go.

cpfiffer commented 3 years ago

I'm not sure exactly how weave is being called, but historically I set weave(...; doctype="github") which handled all the rendering crap. I ran into the same issue and it sucked.

torfjelde commented 3 years ago

This is using weave(...; doctype="github"): https://github.com/TuringLang/TuringTutorials/blob/024be8669e251b3590a0eaffb2a0a1068b20d4c5/src/TuringTutorials.jl#L48-L54

So this is where it gets weird. It seems that Github supports braces at the beginning of code-blocks, e.g. {julia} rather than julia, but Jekyll doesn't (even when using GFM parser above).

Does anyone know how e.g. SciML deals with this? Or do they just render the actual HTML directly from jmd rather than using an intermediate renderer like Jekyll?

kescobo commented 3 years ago

Does anyone know how e.g. SciML deals with this? Or do they just render the actual HTML directly from jmd rather than using an intermediate renderer like Jekyll?

Looking here, it seems that all of the outputs are generated (regular markdown, html, ipynb) in the main repo, and the links from the README are to plain HTML (one example) without any styling. I'm not sure if the notebooks live somewhere else too, but I haven't seen a place where they have the tutorials integrated into the docs with consistent styling the way that Turing currently does it.

Have you tried manually doing a regex find/replace (^```\{julia\} => ```julia) in the markdown to see if that renders?

torfjelde commented 3 years ago

and the links from the README are to plain HTML (one example) without any styling

Yeah that's what I gathered too.

Have you tried manually doing a regex find/replace (^\{julia\} =>julia) in the markdown to see if that renders?

Nah, and that's of course a possible solution. I just felt like there ought to be a better way :confused: The braces could also include more stuff than just julia, as it's used by jmd to indicate whether or not to evaluate etc. But it should still be possible to write a simple regex to find+replace.

kescobo commented 3 years ago

Nah, and that's of course a possible solution. I just felt like there ought to be a better way

Totally agree, was just thinking it might be good to check whether that's the only thing causing issues.

I'm a bit confused though - it looks like weave can output markdown without braces (eg. see here). Is it just dependent on what the input jmd style is?

The braces could also include more stuff than just julia

And Weave can handle that without braces

Or are you saying that Jekyll needs the braces and you can't find a way to render without them?

But it should still be possible to write a simple regex to find+replace.

If this is something you'd like help with, I kinda love regex puzzles :-)

torfjelde commented 3 years ago

I'm a bit confused though - it looks like weave can output markdown without braces (eg. see here)

Okay, so that's a bit strange. Because when I used weave(...; doctype="github"), lines such as https://github.com/JunoLab/Weave.jl/blob/c9fc740d72dbbd2f0b950f98efedc5ab71f17e3e/examples/FIR_design.jmd#L25-L26 did indeed show up in the generated document. It might be because I added braces {...} around the setup, but that should be supported: http://weavejl.mpastell.com/stable/chunk_options/#Syntax.

Hmm, I'll give it another go and see if it works properly now.

If this is something you'd like help with, I kinda love regex puzzles :-)

Haha, I'll keep that in mind!

devmotion commented 3 years ago

Just some random thing I discovered (but I guess I should play around with it first to get a better feeling for the problem): have you tried to use redcarpet instead of kramdown? Maybe that can handle GFM in a better way, see e.g. https://stackoverflow.com/questions/13464590/github-flavored-markdown-and-pygments-highlighting-in-jekyll/13614020#13614020.

Edit: Apparently it's not supported anymore (https://github.com/jekyll/jekyll/issues/7838).

devmotion commented 3 years ago

I guess this looks better: https://github.com/github/jekyll-commonmark-ghpages or https://github.com/jekyll/jekyll-commonmark

And I found this issue: https://github.com/gettalong/kramdown/issues/503 In the end it was resolved/circumvented with a regex :smile:

devmotion commented 3 years ago

@torfjelde I switched to CommonMark and I couldn't observe any problems with the code blocks when following your steps in https://github.com/TuringLang/TuringTutorials/issues/86#issuecomment-705063442. Apart from the figures for which you said it is sufficient to specify out_dir, http://127.0.0.1:4000/dev/tutorials/01-introduction/ looked fine (somehow there's an import problem when displaying the TuringTutorials information at the bottom of the page, but maybe that's a problem with my weird local setup).

So the only changes that I applied were: Adding

group :jekyll_plugins do
  gem 'jekyll-commonmark'
end

to Turing/docs/Gemfile and replacing markdown: kramdown with markdown: CommonMark in Turing/docs/_config.yml

torfjelde commented 3 years ago

Ah, super neat! I'll give that a go too!