Rapporter / pander

An R Pandoc Writer: Convert arbitrary R objects into markdown
http://rapporter.github.io/pander/
Open Software License 3.0

Support for class Hmisc::summaryM or Hmisc::summary.formula.reverse #235

Closed tormodb closed 8 years ago

tormodb commented 8 years ago

The summaryM function in the Hmisc package by Frank Harrell is a great way to produce summary tables of continuous and categorical variables by group. The objects created are of class summaryM (or summary.formula.reverse). It would be great if pander could support this class in a future update.

daroczig commented 8 years ago

Hey @RomanTsegelskyi, you already have some experience with Hmisc objects -- can you please have a look at this suggestion when you have a chance?

romantseg commented 8 years ago

@daroczig, sure, I will try to look into it this weekend if I have time

romantseg commented 8 years ago

@tormodb I had a look at Hmisc::summaryM and I have a couple of questions:

Also, it seems that the default print method already produces something similar to the grid style in pander. While I understand that this is not full support, I am curious to know which primary features you are looking to get out of pander support for Hmisc::summaryM.

In any case, I will probably implement it this or next week

Descriptive Statistics  (N=500)

+------------------------------+----+----------------------+----------------------+------------------------------+
|                              |N   |Drug                  |Placebo               |  Test                        |
|                              |    |(N=260)               |(N=240)               |Statistic                     |
+------------------------------+----+----------------------+----------------------+------------------------------+
|age                           | 500|        46.6/49.9/53.0|        46.7/50.1/53.3|   F=0.58 d.f.=1,498 P=0.445  |
+------------------------------+----+----------------------+----------------------+------------------------------+
|sex : m                       | 500|          0.44 (114)  |          0.47 (113)  |Chi-square=0.53 d.f.=1 P=0.468|
+------------------------------+----+----------------------+----------------------+------------------------------+
|Systolic BP [mmHg]            | 499|          112/120/127 |          112/120/128 |   F=0.06 d.f.=1,497 P=0.81   |
+------------------------------+----+----------------------+----------------------+------------------------------+
|Primary Symptoms : Muscle Ache|2500|          0.49 (128)  |          0.52 (126)  |Chi-square=0.53 d.f.=1 P=0.465|
+------------------------------+----+----------------------+----------------------+------------------------------+
|    Stomach Ache              |    |          0.50 (130)  |          0.49 (117)  |Chi-square=0.08 d.f.=1 P=0.780|
+------------------------------+----+----------------------+----------------------+------------------------------+
|    Headache                  |    |          0.51 (133)  |          0.51 (122)  |Chi-square=0.01 d.f.=1 P=0.943|
+------------------------------+----+----------------------+----------------------+------------------------------+
|    Hangnail                  |    |          0.44 (115)  |          0.52 (125)  |Chi-square=3.08 d.f.=1 P=0.079|
+------------------------------+----+----------------------+----------------------+------------------------------+
|    Depressed                 |    |          0.48 (125)  |          0.49 (118)  |Chi-square=0.06 d.f.=1 P=0.808|
+------------------------------+----+----------------------+----------------------+------------------------------+

tormodb commented 8 years ago

@RomanTsegelskyi Great that you had a chance to look at it.

I am not sure what you mean by exactly the same, but for example for a classic table of descriptives in a scientific manuscript, the default format is just right. There are many options for summary.table and summaryM, but many of these relate to LaTeX output. Just having the output you produced in the example, but using markdown table format (understandable by knitr) would be great.

The primary purpose would be to be able to make it into something convertible to html-output (i.e. markdown formatting instead of grid), so it would be possible to use it for example with knitr in html, or doc-formats or as html slideshows.

daroczig commented 8 years ago

@tormodb FTR the above grid-like table is already a ~valid markdown table, which can be transformed to HTML by pandoc (which is used in RStudio and rmarkdown) -- rendering something like:

[Image: hmisc-grid]
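As a rough illustration of the point above, the sketch below writes a tiny grid-style table to a temporary file and asks pandoc to convert it to HTML. The table content is invented for the example, and the conversion step only runs if a `pandoc` executable is found on the PATH (an assumption about the local setup, not something stated in the thread).

```r
# A minimal, made-up grid table (no header separator, so pandoc treats it as headerless)
grid <- c(
  "+-----+----+",
  "|     |N   |",
  "+-----+----+",
  "|age  |500 |",
  "+-----+----+"
)

# Write the table to a temporary markdown file
md <- tempfile(fileext = ".md")
writeLines(grid, md)

# Convert with pandoc only if it is installed; otherwise leave `html` empty
html <- character(0)
if (nzchar(Sys.which("pandoc"))) {
  html <- system2("pandoc",
                  c("--from", "markdown", "--to", "html", shQuote(md)),
                  stdout = TRUE)
  cat(html, sep = "\n")
}
```

RStudio and rmarkdown run pandoc for you, so this manual call is only meant to show that the grid text itself is what pandoc parses.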

tormodb commented 8 years ago

Really? That is great news, but I have not been able to get that to work. When I try to pass these tables to pander I get a "class not supported" (or similar) error. How would you pass a summaryM or summary.formula.reverse object to pander and get that into an object suitable for knitr?

(Sent by email in reply to the notification of Wed, Dec 16, 2015 at 10:50 AM -0800 from Gergely Daróczi.)

daroczig commented 8 years ago

@tormodb if you are happy with the above results, then there is no need to pass the object to pander, as the default print function returns that grid-like table. So I think printing should be enough (with results set to asis in the knitr chunk).
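A minimal sketch of that approach, meant to be placed in a knitr chunk whose `results` option is set to `'asis'`. The data frame `d` and its columns are invented for this example, the code only runs if Hmisc is installed, and the exact layout of the printed table may differ between Hmisc versions.

```r
# Toy data, invented for this sketch (not taken from the thread)
set.seed(42)
d <- data.frame(
  age       = rnorm(200, mean = 50, sd = 5),
  sex       = factor(sample(c("m", "f"), 200, replace = TRUE)),
  treatment = factor(sample(c("Drug", "Placebo"), 200, replace = TRUE))
)

if (requireNamespace("Hmisc", quietly = TRUE)) {
  # Summarize age and sex by treatment group
  s <- Hmisc::summaryM(age + sex ~ treatment, data = d)
  # With results='asis', knitr passes the printed table through unchanged,
  # so pandoc gets to parse it as a table instead of verbatim output
  print(s)
}
```

Without `results='asis'`, knitr wraps the printed text in a code block, and the table would be shown verbatim rather than rendered.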

tormodb commented 8 years ago

Doh! You are of course right. I didn't realize that pandoc could actually do this directly, and thought that I had to pass this through pander or kable to get markdown output. There is only one cosmetic challenge: the percentages are put into a "block" with grey shading. I don't know if this is a CSS issue or where this formatting comes from (see attachment).

[Image: shades]

and it is... I was applying theme:cosmo, and that is where the shading comes from. When I run it through knitr theme-less, this works well.

Sorry about this, and thank you very much for helping me out!!

daroczig commented 8 years ago

I think the shades come from the indentation in those cells -- markdown applies its 4-space rule here, so it treats the numbers as code in the grid table. @RomanTsegelskyi might have a workaround for this in the forthcoming weeks.
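To make the indentation issue concrete, here is a rough post-processing sketch in base R that left-aligns the content of every cell in one row of a grid table while keeping the column widths intact. The helper name `left_align_cells` is hypothetical and is not part of pander or Hmisc. It also removes indentation that is intentional (for example the indented sub-rows in the table above), so it is an illustration of the problem rather than a recommended fix.

```r
# Hypothetical helper: strip leading spaces inside each cell of a grid-table row,
# padding on the right so that every cell keeps its original width
left_align_cells <- function(line) {
  # Border lines start with "+" and are returned unchanged
  if (!startsWith(line, "|")) return(line)
  # Split on "|" and drop the empty piece before the first "|"
  cells <- strsplit(line, "|", fixed = TRUE)[[1]][-1]
  aligned <- vapply(cells, function(cell) {
    txt <- trimws(cell)
    paste0(txt, strrep(" ", nchar(cell) - nchar(txt)))
  }, character(1), USE.NAMES = FALSE)
  paste0("|", paste(aligned, collapse = "|"), "|")
}

# Example: the percentage cell no longer starts with four or more spaces
row <- "|sex : m   | 500|          0.44 (114)  |"
cat(left_align_cells(row), "\n")
```

Applied to captured print output, this would be used as `vapply(lines, left_align_cells, character(1))`.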

romantseg commented 8 years ago

Sorry for postponing it for so long because of holidays.

So, I finally looked into implementing it as a pander.summaryM method, and I am hesitant to do that, since it would almost identically duplicate Hmisc::print.summaryM. We had a similar discussion about #207, but for those methods there was no markdown rendering, while for summaryM there is something close.

Looking at the code more closely, the line that essentially needs to be deleted (or moved under some option) is https://github.com/harrelfe/Hmisc/blob/master/R/summaryM.s#L535, so I think it might be better to log a feature request for Hmisc.

@daroczig @tormodb what do you think? If we still decide to implement pander.summaryM, I have the needed code.

daroczig commented 8 years ago

I totally agree: maintaining a copy of the Hmisc printing code in pander might be a real pain in the long run (we already discussed this in #207 and already had some issues with this approach), and post-processing the markdown table would be similarly painful, so if it's something that can be handled in Hmisc, then let's keep it simple on our side.

@tormodb what do you think? As @RomanTsegelskyi suggested, creating a ticket in Hmisc to optionally not add leading spaces in the cells might be the easiest solution.

tormodb commented 8 years ago

This is probably a good solution, I am guessing the leading spaces in Hmisc may be related to the latex formatting option. It would be good to have this in pander, but I appreciate the complications related to maintaining that in future updates of Hmisc and other packages. Thanks for being so responsive and helpful though, and for a great package!

daroczig commented 8 years ago

Based on the above discussion, I am closing this, as hopefully Hmisc will be able to provide the required output. Please reopen if that turns out not to be the case and we can try to come up with an alternative solution after all.