JunoLab / Weave.jl

Scientific reports/literate programming for Julia
http://weavejl.mpastell.com
MIT License
824 stars 95 forks source link

Reworking format.jl #350

Closed JonasIsensee closed 4 years ago

JonasIsensee commented 4 years ago

Hi, addressing #345 This PR implements a super simple way to make exports for tex and texminted work. I'm not saying that this is the best way to do it but I just started this PR as a proof of concept.

It may be worth reorganizing the types around the docformats to account for the similarities (such as shown here).

aviatesk commented 4 years ago

nice, thanks for your try. so can this PR produce "valid" tex files for tex and texminted formats, as is ?

If so, I'm totally okay to merge this, while it would be super great if you could tell any problem you notice.

It may be worth reorganizing the types around the docformats to account for the similarities (such as shown here).

sounds good, might be good to start in another PR, tho ;)

JonasIsensee commented 4 years ago

nice, thanks for your try. so can this PR produce "valid" tex files for tex and texminted formats, as is ?

If so, I'm totally okay to merge this, while it would be super great if you could tell any problem you notice.

Yeah, in my initial tests this did seem to produce a valid .tex file. But I'll check again to make sure.

EDIT: actually no. texminted needs, for obvious reasons, the latex package minted which would have to be added in the template .tpl file. However currently there is no good way to include a different template file for texminted by default..

Also, given that the tests now fail it seems that the old behaviour was intended, though it's not clear what for.

aviatesk commented 4 years ago

great,

However currently there is no good way to include a different template file for texminted by default..

so the loading of minted is the only modifcation that should be added to our existing template ? If so I have an idea to reuse the existing template in a way we add minted only when we weave to texminted format.

Also, given that the tests now fail it seems that the old behaviour was intended, though it's not clear what for.

yeah we have tons of old and very fragile (and annoying tbh) end2end tests, and I guess this PR can fail some of them, but don't care it. We can manually update them later (or even remove them). see #326 for the context if you want.

aviatesk commented 4 years ago

fwiw, rebasing against master will make it easier to run tests locally (skip Plots.jl and Gadfly.jl integration tests, which would take long time to finish otherwise)

JonasIsensee commented 4 years ago

so the loading of minted is the only modifcation that should be added to our existing template ? If so I have an idea to reuse the existing template in a way we add minted only when we weave to texminted format.

Yeah, I also thought about that. Right now the md2pdf.tpl loads the listings package. One could either copy the file and edit it or alternatively find a way to dynamically add the corresponding line. On the other hand this does not scale well, if we end up needing to edit more lines...

Maybe it would make sense to make each .formatdict for the various output formats contain a path to the default template.

get_template(::Nothing, tex::Bool = false) =
    Mustache.template_from_file(normpath(TEMPLATE_DIR, tex ? "md2pdf.tpl" : "md2html.tpl"))

that would allow us to remove this hardly extendable piece of code.

aviatesk commented 4 years ago

alternatively find a way to dynamically add the corresponding line.

yep, we can do that, see this README section in Mustache.jl, a template engine we use.

I pushed some commits, and now texminted::TexMinted generates .tex files which loads minted, ... but I found they're not really valid tex files. In other word, they can't be compiled, since now format_hoge (e.g. format_chunk) produces some tex string, but all of them assume that they use listing package. So the next thing we need to do is to overload those functions with TexMinted format to produce tex strings, which can be compiled when joined.

aviatesk commented 4 years ago

okay, I updated this PR to #355 . Sorry if the previous rebasing makes you unable to push/pull this PR. Deleting your local branch and pulling again from your repository will make it work. I will make sure I merge master rather than rebasing in the future ;)

JonasIsensee commented 4 years ago

No worries.

So I'm restructuring the files right now and moving code back and forth.

I really don't like the fact that we have these weird types that are all the same because they just contain a dictionary. This seems like an unnecessary bit of indirection.

Since no part of Weave.jl is performance critical (and since the current approach isn't exactly type stable either) I propose that we do away with the dictionary in between and define untyped structs with many fields leveraging Base.@kwdef for instantiation.

Somewhere (probably at the very beginning) we would then have to @assert that the types have all the fields that are needed but then everywhere else in the code it would make lookups simpler. docformat.formatdict[:property]docformat.property

EDIT: all these asserts are only necessary for user passed types. But this is not the intended usage anyway, since then she would need to implement all the methods anyway? The more likely/intended case is that users pass a modified instance of one of the implemented types. Any then Base.@kwdef does asserts along with defaults for us.

@aviatesk if you disagree, please say so. If not, I will start working on that tomorrow.

JonasIsensee commented 4 years ago

My thoughts on the following lines of code. (at Weave.jl line 210 ish) This is really un-julian style.

doctype = doc.doctype
    if doctype == "pandoc2html"
        mdname = outname
        outname = get_outname(out_path, doc, ext = "html")
        pandoc2html(rendered, doc, highlight_theme, outname, pandoc_options)
        rm(mdname)
    elseif doctype == "pandoc2pdf"
        mdname = outname
        outname = get_outname(out_path, doc, ext = "pdf")
        pandoc2pdf(rendered, doc, outname, pandoc_options)
        rm(mdname)
    elseif doctype == "md2pdf"
        run_latex(doc, outname, latex_cmd)
        outname = get_outname(out_path, doc, ext = "pdf")
    end

I think we could instead implement a postprocessing function that generally does nothing but runs the external commands for the formats for which something else is defined.

post_processing!(::WeaveFormat) = nothing
post_processing!(::Pandoc2HTML) = stuff
JonasIsensee commented 4 years ago

Oh, I found a proper issue that will make working with minted and tex harder..

https://github.com/JuliaDocs/Highlights.jl/blob/fc1d9494f06a3ec9db923cda7c9369b63cc96e44/src/Format.jl#L121

Highlights.jl works with lstlisting and only with that. If we want to make texminted available at all we will have to disable Highlight themes with that.

JonasIsensee commented 4 years ago

ok, so I'm finally starting to understand what is happening here: In md2tex:

This means that if we want to make texminted and tex output valid tex files we have to skip the highlight part. That is definitely a downside but without implementing new highlighting themes for Highlights.jl this seems to be the only option.

JonasIsensee commented 4 years ago

By now a lot of files have changed. Here's a quick summary:

JonasIsensee commented 4 years ago

Things that remain to be done:

Things like https://github.com/mauro3/Unpack.jl could also shorten the code in some places.

aviatesk commented 4 years ago

Thanks for your work on this, again. Most of your suggestions sound so nice.

Before discussing details, I would like to ask if you would mind splitting this PR into separate PRs, one for general refactoring format/rendering logic, one for actually improving Tex output format ? That way, we can see if your refactoring keeps backward compats or breaks something, etc. Having said that, we don't necessary keep backward compats, and I would really want to your refactoring to get in. It will just make it easier to review/merge easily.

aviatesk commented 4 years ago

Here are comments on refactoring parts:

EDIT: all these asserts are only necessary for user passed types. But this is not the intended usage anyway, since then she would need to implement all the methods anyway? The more likely/intended case is that users pass a modified instance of one of the implemented types. Any then Base.@kwdef does asserts along with defaults for us.

I like it. Actually I had the exact same idea before (maybe a week ago) but at that time I withdraw it for some reason. I forgot the detail already but I guess at that time we did something like merge!(formatdict, something) in somewhere, but now I can't find it anywhere and I also don't think such a merging and modifying instantiated formats are necessary, so let's go that way.

I think we could instead implement a postprocessing function that generally does nothing but runs the external commands for the formats for which something else is defined.

sounds nice.

Implement better argument parsing to properly support the new structure

yeah, we can do smarter processing of rendering-related arguments/options according to output formats.

make preserving header a field of the format instead of const HEADER_PRESERVE_DOCTYPES = ("github", "hugo")

yeah, sounds good.

JonasIsensee commented 4 years ago

hm, that is a very good point. I'm not super sure on how to best split this stuff into different PRs though.. I'll give it a try.

aviatesk commented 4 years ago

It's really okay to include some tex-related stuff into the refactoring PR; like type definition, etc, if it's difficult to separate from this. The only purpose of splitting is to try to make it easier to see the diffs, so please take it easy.

JonasIsensee commented 4 years ago

ok, it turns out that just about all of my code changes concern the refactoring. The last commit that I pushed just now reverts all explicit improvements on the tex pipeline to the original (albeit buggy) behaviour.

I will happily resubmit those changes as a new PR once this one is merged.

aviatesk commented 4 years ago

great

I will happily resubmit those changes as a new PR once this one is merged.

if you want you can create another PR on top of this branch. If you set base branch to this PR, the diff will be taken from here.

JonasIsensee commented 4 years ago

if you want you can create another PR on top of this branch. If you set base branch to this PR, the diff will be taken from here.

Ah, ok. I've never done that before. I'll give it a try

JonasIsensee commented 4 years ago

So I created a new PR but since the branch of this PR lives in my own fork of Weave, the new PR lives in my fork entirely... This is not what we wanted, no?

https://github.com/JonasIsensee/Weave.jl/pull/1

aviatesk commented 4 years ago

So I created a new PR but since the branch of this PR lives in my own fork of Weave, the new PR lives in my fork entirely... This is not what we wanted, no?

ah, sorry, that is only possible if you can push a brach into this repository, so let's do tex stuff in another PR after this PR gets merged into master.

aviatesk commented 4 years ago

I pushed two commits into this PR since I can't make another PR merging into this PR: each does:

So can you kindly review on this ?

aviatesk commented 4 years ago

And, imho, this kind of refactoring PR should be merged soon as possible (otherwise we may start to feel really dull on this kind of stuff), so I think I will move to quickly merge this if you approve and we pass tests.

JonasIsensee commented 4 years ago

Ah, you started outsourcing the argument handling already. Nice!

I like your changes. I think we can merge.

aviatesk commented 4 years ago

@JonasIsensee okay, I will move to merge this once CI passes. We've dropped LTS compatibility support, but who complains.

Looking forward to your another tex improvements PR on top of this ! (well, take your time, as always ;))