Rapporter / pander

An R Pandoc Writer: Convert arbitrary R objects into markdown
http://rapporter.github.io/pander/
Open Software License 3.0
294 stars 65 forks source link

fix #248 return character vector for Jupyter #251

Open daroczig opened 8 years ago

codecov-io commented 8 years ago

Current coverage is 77.00%

Merging #251 into master will decrease coverage by -0.10% as of fbf5359

Powered by Codecov. Updated on successful CI builds.

jankatins commented 8 years ago

That should work(currently without a Laptop). Will you also add a ’repr_markdown.pander_output(...)’?

daroczig commented 8 years ago

@JanSchulz, not sure, but my intuition suggests that repr_markdown.pander_output might be better added to and maintained in your repr package -- you know what you want to do with the pander_output.

jankatins commented 8 years ago

@daroczig It's actually comparable to the knitr return: To make that more clear, your code could now become:

pander(x){
   if (isTRUE(getOption('knitr.in.progress')) || isTRUE(getOption('jupyter.in_kernel')) {
      # Not sure, if there is a reason for the `sink` version....
      stdout = capture.output(UseMethod('pander', x))
      return return(structure(paste(stdout, collapse="\n"), class = 'pander_output'))
   }
   UseMethod('pander', x)
}
knit_print.pander_output(x){
    if (isTRUE(panderOptions('knitr.auto.asis')){
         return(knitr::asis_output(as.character(x)))
    } else {
         # will call the print.pander_output below
         print(x)
    }
}
repr_markdown.pander_output(x){
    return(as.character(x))
}
print.pander_output(x){
   cat(as.character(x))
}

It would still ct if not in knitr or the kernel but now the automatic dispatch would do the right thing for the output.

daroczig commented 8 years ago

@JanSchulz as said previously, I refactor might happen on this and pander might (invisibly) return an object with its own class, but it won't happen until around July. Until then, I would like to avoid the burden of maintaining foreign code here -- knitr has the asis_output API, so I do no need to know about the knit_print method. Can we please keep that simple design with repr as well and get back to the original feature request?

pander now returns a string with your requested pander_output class inside of Jupyter. Do you need anything else besides that? If you add repr_markdown.pander_output to your package, I can refactor pander at any time without breaking anything in repr or any other packages.

flying-sheep commented 8 years ago

yes, that’s great! only a small question: why “pander_output” instead of plain “pander”? do you reserve the generic name for a more fitting class or…?

also, actually we don’t have the mission to provide repr_*.<yourclass> for every 3rd party class.

it is perfectly fine to simply put repr_markdown.pander <- unclass in the pander package. (or ….pander_output <- …, if you stay with that name)

jankatins commented 8 years ago

@daroczig Sorry, I didn't want to annoy you. I got these knitr things from the documentation at https://cran.r-project.org/web/packages/knitr/vignettes/knit_print.html#for-package-authors in the "For package authors" section.

If you add the "return an object", like @flying-sheep I would very much like you to include the small repr_markdown.pander_output function (or however you want to name the class) in your package because a) we can't build repr methods for everything in our packages (same as knitr does not define such function for everything but asks for package authors to include them in their packages -> that's what the S3 generic function dispatch is actually for: knitr/repr defines the API and you are using it to extend it for your data) and b) of you ever change your mind later, we would have to coordinate the move.

Adding these method in your package, you don't need to import anything, you just have to export these functions.

I would also like to ask you to include a print.pander_output() method (defined as above) so that print(pander(...)) works out of the box in the kernel.

daroczig commented 8 years ago

@JanSchulz no offense taken :) I just won't have much time in the next few months to do major updates on this package.

Thanks for the link on the knitr vignette, that's a good resource. But as using the asis_output fn resulting in an object with the knitr_asis class, there's no need to define any new method for knitr_print, as knitr already knows how to handle that.

So I thought a similar solution should work within repr as well, as you probably already had a method for printing text, but I've just realized that character is prepared for vectors, where you provide the HTML/tex/md version of lists or similar -- that I don't really understand.

Sorry for the silly question, but before figuring out the optimal next steps in short & long terms, why are you writing and maintaining the code for transforming R objects into HTML, tex or markdown? There's xtable and a bunch of other packages out there doing this and eg pander for markdown. So instead of having https://github.com/IRkernel/repr/blob/master/R/repr_matrix_df.r, you could simply call xtable for HTML or tex, and eg pander in a markdown notebook. Eg the pander way of writing reproducible reports is that you never call pander in your Rmd, as pander will be automatically applied on all the R objects.

Anyway, probably you have good motives to rewrite those stuff from scratch -- back to my original question that ended up in the above wall of text (sorry for that): can you please describe me the need of print.pander_output inside of Jupyter? Or even better, can you please add a minimal example on the different modes pander might be used inside of Jupyter and what's your required result? That would be extremely helpful.

jankatins commented 8 years ago

I think the main reason for this PR is that users using pander(model1) will get the same output as in knitr (e.g. the formatted one, nor plain MD source).

But as using the asis_output fn resulting in an object with the knitr_asis class, there's no need to define any new method for knitr_print, as knitr already knows how to handle that.

The problem here is that in the knitr case, you know about the knitr way (e.g, the structure), but knitr does not need to know about the pander way. If we have to implement a repr_mardown.pander_output, we need to know the pander way (by implementing a repr_markdown.pander_output). In my understanding, there is no difference in implementing a structure according to the knitr API or a knit_print method according to the knitr API: in both cases pander uses the API of knitr.

So I thought a similar solution should work within repr as well, as you probably already had a method for printing text, but I've just realized that character is prepared for vectors, where you provide the HTML/tex/md version of lists or similar -- that I don't really understand.

Here it's more a problem that there could be other character vectors (from other methods or simple a "String"), which are just that: strings which should be handled as "normal" strings and not as markdown source code. To distinguish, we need to the structure with a special name.

The way we currently handle "output" goes like this: For all output objects (x) we call repr_markdown(x) (and some more repr_xxx methods -> see below) on them, which in turn calls repr_markdown.class_name(x). knit_print works the same with the exception that it also defines knit_print.knit_asis() which simply returns the input. So by returning a knit_asis structure you get knit_printed that way. We could also define a markdown structure and you could then return that but up to now we felt that just implementing one way was easier.

print.pander_output is needed so that a user who wants to show the markdown code itself can run print(pander(...)) and get the markdown code printed.

why are you writing and maintaining the code for transforming R objects into HTML, tex or markdown?

Thats a good question and I actually think it makes sense to use pander/xtable for things like data frames and matrix: https://github.com/IRkernel/repr/issues/35

Eg the pander way of writing reproducible reports is that you never call pander in your Rmd, as pander will be automatically applied on all the R objects.

I'm not sure if the users wants that: having a table nicely styled is something differently than having a table for summary output. E.g. currently it's as far as I undertsand pander a explicit step to get a table in knitpy and not the plain text outputfor a summary (e.g. summary(lm(...)) produces plain text, pander(summary(lm)(...))) the table.

If we owuld want that, we could use it like:

out <- repr_markdown(input)
if (repr_markdown(out) == NULL){
   out = pander(input)
}

This would only work if pander in that mode would return NULL for unsupported types (you currently fall back to treating it as a list).

We could also use your definitions, like

repr_markdown.vector <- pander.vector
repr_markdown.summary.lm <- pander.summary.lm

But this makes IMO only sense for the main types (up to matrix/data.frame). -> https://github.com/IRkernel/repr/issues/35

To give you a bit of background how repr is used: the jupyter R notebook (or actually every jupyter frontend) wants output in multiple formats: plain text (print(x) and things like markdown and html, so we send a bundle of mimetypes (print(x) becomes text/plain and the output of repr_markdown(x) becomes text/markdown and so on) to the frontend. The bundle contains all mimetypes for which we got a non-NULL return. The frontend then decides which version of the output is used to actually display the output ( e.g. the Notebook uses html > markdown > plain text", the console uses only plain text, or knitpy (a knitr clone), which usesmarkdown > html > plain text`).

It would actually be nice to merge knitrs and our approach, but currently we can't:

So the idea is it to unify these two approaches by making repr useable in knitr and ask for repr to be included in the knit_print.default() method.