Improved Support for Literate Programming

jfdm commented 9 years ago

On the Idris Wiki I have detailed several ideas of how we can improve support for literate programming within Idris. See:

https://github.com/idris-lang/Idris-dev/wiki/Egg-%236%3A-Improved-Support-For-Literate-Programming

This ticket just surves as a place where we can discuss the RFC, and people can propose changes,

daira commented 9 years ago

I'd argue to do Phase 5 (use the documentation system's filetypes) immediately. It provides much better compatibility with the doc system. If you want to indicate that, say, a markdown file is literate Idris, you can always use, say, "foo.idr.markdown" as a convention, but the ".idr" does not need to be required.

david-christiansen commented 9 years ago

Rather than supporting more formats, it seems to me that we can support almost arbitrary formats by allowing a quick specification of them.

As far as I know, most literate formats fall into two classes: those that start each line with a marker (eg Bird style) and those that fence off code blocks with markers (eg LaTeX-style). The former could be supported by having a parameter that specifies (either as strings or as regexps) what code lines can begin with, and the latter could be supported by having a parameter that specifies start/end regexps, where entire lines must match.

For instance, we could have something like the following in a file called "latex.idrisliterate":

[meta]
name = LaTeX

[codeblock]
blockstart = ^[ \t]*\\begin\{code\}[ \t]*$
blockend = ^[ \t]*\\end\{code\}[ \t]*$

[codeblock]
blockstart = ^[ \t]*\\begin\{spec\}[ \t]*$
blockend = ^[ \t]*\\end\{spec\}[ \t]*$

and then Idris could take, as you suggest, a --literate=latex command-line option. Though, even better, I'd prefer that it examine the first few lines of the file for a literate style spec, kind of like Emacs's -*- comments. It would be flexible, so it could work with the native comments of the literate file, like:

% -*- idris-literate: latex -*-

Then, Idris could search the include path for the literate style specification, and if we use some common, easy file format, then development tools like idris-vim or idris-mode should be able to derive highlighting support for literate files from the format as well.

Ideally, this will result in an unlit library that's generally applicable, rather than a bunch of extra code inside Idris proper.

david-christiansen commented 9 years ago

Alternatively, we could take the easy way out and use @pepijnkokke's unlit library: https://github.com/pepijnkokke/unlit .

He's done a lot of the hard work! I've spoken to him about this as well, and he seemed open to helping out if it turns difficult.

wenkokke commented 9 years ago

@david-christiansen: I'd definitely be willing to help out, at least by giving some pointers on how to use the library. Integrating it shouldn't be too hard, since it's just a pre-processor for Text or String. It may take some minor work if you use ByteString internally.

david-christiansen commented 9 years ago

@pepijnkokke Thanks for the offer of pointers! Using your library might get us 80% of the benefit of @jfdm's proposal with 10% of the work.

jfdm commented 9 years ago

@david-christiansen I agree that hard-coding support for literate styles is a bad idea, and I am open for adding in a file specification format.

I actual thought about adding it to the specification, as it is a natural improvement on the original idea. I think my reasoning behind only specifying a hardwired solution, was to get the code constructs in first, and then to possibly think about generalisation and specification for an file format.

If you want you can extend the proposal with your suggestions, if not i will at some point.

@pepijnkokke From my initial look at you unlit library, I agree with @david-christiansen that it looks like it is just what we need.

burdges commented 9 years ago

As an aside, there are situations where one wants to literate program in multiple languages, but built in languages tools rarely address this, especially if the main language doesn't require tangle, like Idris, Haskell, etc. In that case, you might want different types of code blocks and to run some through another tool. Afaik the easiest way to do this is with perl scripts like this cat_latex_env one :

#!/usr/bin/perl
use strict;
use warnings;

sub usage { die "Usage: cat_latex_env enviroment_name [filename]\n"; }
usage if ($#ARGV < 0);

my $env = shift;
my $begin = quotemeta "\\begin{$env}";
my $end = quotemeta "\\end{$env}";
while (<>) {
        if (/$begin/../$end/) {
                next if /$begin/ || /$end/;
                print;
        }
}

I believe this Makefile rule or similar should do :

%.tex.idr : %.tex 
    ./cat_latex_env idris $(input) > $(output)
%.tex.w : %.tex 
    ./cat_latex_env C $(input) > $(output)

And obviously you could use this hack to write literate Idris code now. ;)

cbiffle commented 8 years ago

One aspect of literate programming support that I don't see in many tools is support for presenting code out of its required order.

Haskell gets a bit of a pass on this at the declaration level, since the order of declarations within a module is not significant. Since declaration order is important in Idris this is more of an issue.

However, even with literate Haskell, one cannot defer the implementation of a function until later in the document. Knuth's original literate programming systems supported this through the use of macros, e.g.

myFunction = if condition then <<hard algorithm>> else 0

-- time passes

<<hard algorithm>> = thing I can explain now

(Aside: this feels very similar to the use of metavariables with detached proofs in Idris.)

Anyway. I wanted to raise this since I haven't seen it mentioned in the discussion of literate Idris. Literate programming is more than intercalating a linear text with a linear program.

idris-lang / Idris-dev

Improved Support for Literate Programming #1736