harrelfe / Hmisc

Harrell Miscellaneous

Feature: Add latex method for formula (and classes used by formula) #124

Open billdenney opened 4 years ago

billdenney commented 4 years ago

As discussed by email, I'm preparing a PR to add a latex.formula() S3 method. Because formulae contain objects of many other classes, it will also include methods for the name, call, (, character, numeric, and logical classes. I have several questions that I thought would be easiest to handle by discussion in an issue prior to code review in the PR.

latex_character Method

For my use case, I need to be able to have the output as a character string (or character vector) rather than in a file. That said, I see the utility of file output. Would you consider a PR that adds a new generic latex_character() that does the same thing as latex() but returns a character string instead of writing a file? If that would be OK, I would then write a latex_character.default() that captures the file output with a textConnection(), very similar to the following (though some revisions would likely be required in other functions to support text connections):

``` r
library(Hmisc)
text_value <- character(0)
con <- textConnection(object="text_value", open="w")
latex(data.frame(A=1), file=con)
#> Error in readLines(fi, n = -1): cannot read from this connection
close(con)
text_value
#>  [1] "%latex.default(data.frame(A = 1), file = con)%"                        
#>  [2] "\\begin{table}[!tbp]"                                                  
#>  [3] "\\begin{center}"                                                       
#>  [4] "\\begin{tabular}{lr}"                                                  
#>  [5] "\\hline\\hline"                                                        
#>  [6] "\\multicolumn{1}{l}{data.frame}&\\multicolumn{1}{c}{A}\\tabularnewline"
#>  [7] "\\hline"                                                               
#>  [8] "1&$1$\\tabularnewline"                                                 
#>  [9] "\\hline"                                                               
#> [10] "\\end{tabular}\\end{center}"                                           
#> [11] "\\end{table}"
```

Created on 2020-02-18 by the reprex package (v0.3.0)
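
A minimal base-R sketch of the capture pattern described above, for illustration only. `write_lines_to()` and `capture_to_character()` are hypothetical names, not Hmisc functions, and the writer here is a stand-in that only writes to the connection. A function that also tries to read back from `file` (as the `readLines()` error above suggests latex() does) would still need changes before this pattern works with it.

```r
# Hypothetical stand-in for a function that writes LaTeX lines to `file`.
# It is NOT Hmisc::latex(); it only writes and never reads back.
write_lines_to <- function(lines, file) {
  writeLines(lines, con = file)
}

# Hypothetical helper: run `writer` against an anonymous, write-only text
# connection and return what it wrote as a character vector.
capture_to_character <- function(writer, ...) {
  con <- textConnection(NULL, open = "w")
  on.exit(close(con))
  writer(..., file = con)
  textConnectionValue(con)
}

out <- capture_to_character(
  write_lines_to,
  c("\\begin{center}", "x", "\\end{center}")
)
print(out)
```

Passing `NULL` as the connection's object avoids creating a named variable such as `text_value` in the calling environment; the captured lines are retrieved with `textConnectionValue()` before the connection is closed.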

roxygen Documentation

When I write my packages, I prefer to simplify documentation writing using roxygen2. I know that some people have strongly held opinions on roxygen2. Are you OK if I use roxygen2 to write the documentation?

Tests

I have developed tests to confirm that formula to LaTeX generation is accurate (using testthat), but I don't see testing infrastructure in Hmisc. How would you like these tests to be included in Hmisc?

rmheiberger commented 4 years ago

Thanks for doing latex.formula through the Hmisc package.

  1. I don't understand why you want character string output rather than a file. Isn't that redundant with the feature that is already there for working directly with Sweave and knitr? Search for "Sweave" in ?latex. I have never used that feature. I believe it is Frank's primary usage.

  2. CRAN doesn't accept roxygen2. I have no objection to you writing roxygen2 but the version that gets to CRAN should be the generated .R and .Rd files.

  3. The basic design of latex() is for matrices of numeric or character data. I don't understand what you have in mind as a numeric subset of latex.formula.

billdenney commented 4 years ago

Thanks for the detailed reply. My thoughts are:

  1. I use character string output in more places than .tex files. I occasionally use the character string output in .html files (with MathJax, https://www.mathjax.org/). When I generate my knitr files, I prefer minimizing extra files (personal preference there) and have the character string show up with knitr::asis_output().
  2. I definitely compile the roxygen2 to .Rd. I had a strong preference from the author of another package not to have roxygen2 even in the source, and I wanted to be sure that it was OK here.
  3. I'm not sure that I understand the third comment. You can see my current code here: https://github.com/billdenney/bsd.report/blob/master/R/knit_print.formula.R (I'll be converting that to fit in Hmisc). In formulae, numbers would be represented differently than they would in matrices, so it will need a different method. While I've not worked through this fully based on the structure of Hmisc, I will likely create a new generic latex_formula() which will have to handle different classes within a formula context. As an example, the way that a numeric is handled would be different in a formula context than in a matrix context (in formula context, just the number as a character scalar would be returned).
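
The idea in point 3 (a separate generic that dispatches on the class of each piece of a formula) could be sketched roughly as follows. This is an illustrative assumption, not the code from bsd.report or Hmisc: the method bodies, the set of supported operators, and the exact LaTeX emitted are all made up for the example.

```r
# Hypothetical generic named after the one proposed in this thread.
latex_formula <- function(x, ...) UseMethod("latex_formula")

# Anything without a method is rejected explicitly.
latex_formula.default <- function(x, ...) {
  stop("Cannot handle class(es): ", paste(class(x), collapse = ", "))
}

# In a formula context a number is just its text, and a name is its symbol.
latex_formula.numeric <- function(x, ...) format(x)
latex_formula.name <- function(x, ...) as.character(x)

# Parenthesised sub-expressions have class "(".
`latex_formula.(` <- function(x, ...) {
  paste0("\\left(", latex_formula(x[[2]]), "\\right)")
}

# Calls: translate the arguments recursively, then combine by operator.
latex_formula.call <- function(x, ...) {
  op <- as.character(x[[1]])
  args <- vapply(as.list(x)[-1], latex_formula, character(1))
  if (length(args) == 2 && op == "^") {
    paste0("{", args[1], "}^{", args[2], "}")
  } else if (length(args) == 2 && op == "/") {
    paste0("\\frac{", args[1], "}{", args[2], "}")
  } else if (length(args) == 2 && op %in% c("+", "-")) {
    paste0(args[1], " ", op, " ", args[2])
  } else if (length(args) == 1 && op == "-") {
    paste0("-", args[1])
  } else {
    # Any other call is rendered as an upright function name.
    paste0("\\mathrm{", op, "}\\left(", paste(args, collapse = ", "), "\\right)")
  }
}

# A formula is `~` applied to one or two sides.
latex_formula.formula <- function(x, ...) {
  rhs <- latex_formula(x[[length(x)]])
  lhs <- if (length(x) == 3) paste0(latex_formula(x[[2]]), " ") else ""
  paste0("$", lhs, "\\sim ", rhs, "$")
}

cat(latex_formula(y ~ x^2 + sin(x)), "\n")
```

Because each class gets its own method, the numeric method here can return the bare number without affecting how latex() formats numbers inside a matrix or data frame.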

rmheiberger commented 4 years ago

I ran your code

    knit_print.formula(y ~ x^2 + sin(x))
    [1] "$y \sim {{x}^{2}}+{sin\left(x\right)}$"
    attr(,"class")
    [1] "knit_asis"
    attr(,"knit_cacheable")
    [1] TRUE
    knit_print.formula(y ~ (x^2 + sin(x))/ e^(-x^2))
    [1] "$y \sim \frac{\left( {{x}^{2}}+{sin\left(x\right)} \right)}{{e}^{\left( -{x}^{2} \right)}}$"
    attr(,"class")
    [1] "knit_asis"
    attr(,"knit_cacheable")
    [1] TRUE
    knit_print.formula(y ~ (x^2 + sin(x))/ e^{-x^2})
    Error in knit_print_helper_formula.default(x[[3]], ...) : Cannot handle class(es): {
    knit_print.formula(y ~ (x^2 + sin(x))/ e^-x^2)
    [1] "$y \sim \frac{\left( {{x}^{2}}+{sin\left(x\right)} \right)}{{e}^{-{x}^{2}}}$"
    attr(,"class")
    [1] "knit_asis"
    attr(,"knit_cacheable")
    [1] TRUE

Why are you naming this function with "knit_"? This example is pure latex and unrelated to knit as far as I can see.

billdenney commented 4 years ago

I will convert it to being named latex_formula() instead of knit_print() for Hmisc. I apparently also need to add a French brace method.

The value of knit_print() is that you can use it automatically in place in a knitr document. The conversion will happen behind the scenes. (And, that relates to its use in html as well as LaTeX.)

rmheiberger commented 4 years ago

latex() already has the feature that you can use it automatically in place in a knitr document. As I said before, that is how Frank uses it. See the 'file' argument in ?latex. Again, I haven't used that feature, so I can't be more precise.

How does what you do differ from 'latexTranslate' and 'htmlTranslate' (again, see ?latex)?

billdenney commented 4 years ago

latex(..., file="") will do what I was thinking of. I'll try that out.
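
As a rough illustration of why console output is enough to get a character vector: anything printed to the console can be collected with capture.output(). The sketch below uses a hypothetical stand-in, `emit_tabular()`, rather than Hmisc itself, and assumes only the convention mentioned in this thread that file="" means "write to the console".

```r
# Hypothetical stand-in for a latex()-style function (NOT Hmisc::latex):
# with file = "" it prints to the console, otherwise it writes to `file`.
emit_tabular <- function(x, file = "") {
  cat(
    "\\begin{tabular}{r}",
    paste0(x, "\\tabularnewline"),
    "\\end{tabular}",
    sep = "\n", file = file
  )
  invisible(NULL)
}

# Collect the console output as a character vector, one element per line.
tex <- capture.output(emit_tabular(1:2, file = ""))

# Collapse to a single string if a scalar is needed (e.g. for MathJax/HTML).
tex_string <- paste(tex, collapse = "\n")
print(tex)

# With Hmisc installed, the analogous call would presumably look like
#   capture.output(latex(data.frame(A = 1), file = ""))
# This is an untested assumption; it may also capture other printed output.
```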

latexTranslate() does a partial translation of character strings to something latex-like. I'm looking to start from a formula instead of a character string.