Open mtmorgan opened 2 years ago
Thank you again. This is exactly where we wanted to go! I think this would be an interesting task for our future Outreachy fellow.
Thanks Martin, @mtmorgan
Here is a working filter that I was able to come up with. The language is a bit unwieldy and I'm a novice :)
    function RawInline (raw)
      local formula = raw.text:match '\\Rpackage{(.*)}'
      if raw.format == 'latex' and formula then
        return pandoc.RawInline('markdown', '`r Biocpkg(' .. formula .. ')`')
      end
      local formula = raw.text:match '\\Robject{(.*)}'
      if raw.format == 'latex' and formula then
        return pandoc.RawInline('markdown', '`' .. formula .. '`')
      end
      local formula = raw.text:match '\\Rfunction{(.*)}'
      if raw.format == 'latex' and formula then
        return pandoc.RawInline('markdown', '`' .. formula .. '`')
      end
    end
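One thing to watch in the filter above is the greedy `(.*)` capture. Here is a minimal sketch in plain Lua (no pandoc needed) comparing it with `[^}]*` and with the balanced-brace pattern `%b{}`. The sample strings and the `balanced` helper are made up for illustration. Whether pandoc always hands the filter exactly one macro per `RawInline` is an assumption worth checking; if it does, the greedy form is mostly harmless.

```lua
-- Hypothetical inputs: one macro with nested braces, and two macros in one string.
local nested = '\\Rcode{function(x) {x}}'
local double = '\\Rcode{a} and \\Rcode{b}'

-- Greedy capture: runs to the LAST closing brace in the string.
local greedy_nested = nested:match('\\Rcode{(.*)}')
local greedy_double = double:match('\\Rcode{(.*)}')

-- Negated class: stops at the FIRST closing brace.
local class_nested = nested:match('\\Rcode{([^}]*)}')

-- %b{} matches a balanced {...} group; the outer braces are stripped afterwards.
local function balanced(text, macro)
  local group = text:match('\\' .. macro .. '(%b{})')
  return group and group:sub(2, -2)
end

print(greedy_nested)              -- function(x) {x}
print(greedy_double)              -- a} and \Rcode{b
print(class_nested)               -- function(x) {x
print(balanced(nested, 'Rcode'))  -- function(x) {x}
print(balanced(double, 'Rcode'))  -- a
```

The `%b{}` form handles both nested braces and several macros in one string, at the cost of one extra `sub` call.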
It would probably be helpful to come up with a test Rnw document and corresponding expected Rmd document, with one line per LaTeX 'test' --> corresponding Rmd. I tweaked your code & my code a bit
    return {
      {
        RawInline = function (raw)
          local macro = raw.text:match '\\R{}'
          if raw.format == 'latex' and macro then
            return pandoc.RawInline('markdown', '*R*')
          end
          local macro = raw.text:match '\\R$'
          if raw.format == 'latex' and macro then
            return pandoc.RawInline('markdown', '*R*')
          end
          local formula = raw.text:match '\\Bioconductor{}'
          if raw.format == 'latex' and formula then
            return pandoc.RawInline('markdown', '*Bioconductor*')
          end
          local formula = raw.text:match '\\CRANpkg{([^}]*)}'
          if raw.format == 'latex' and formula then
            return pandoc.RawInline('markdown', '`r CRANpkg(' .. formula .. ')`')
          end
          local formula = raw.text:match '\\Biocpkg{([^}]*)}'
          if raw.format == 'latex' and formula then
            return pandoc.RawInline('markdown', '`r Biocpkg(' .. formula .. ')`')
          end
          local formula = raw.text:match '\\Githubpkg{([^}]*)}'
          if raw.format == 'latex' and formula then
            return pandoc.RawInline('markdown', '`r Githubpkg(' .. formula .. ')`')
          end
          local formula = raw.text:match '\\Rpackage{([^}]*)}'
          if raw.format == 'latex' and formula then
            return pandoc.RawInline('markdown', '`' .. formula .. '`')
          end
          local formula = raw.text:match '\\Robject{(.*)}'
          if raw.format == 'latex' and formula then
            return pandoc.RawInline('markdown', '`' .. formula .. '`')
          end
          local formula = raw.text:match '\\Rcode{(.*)}'
          if raw.format == 'latex' and formula then
            return pandoc.RawInline('markdown', '`' .. formula .. '`')
          end
          local formula = raw.text:match '\\software{(.*)}'
          if raw.format == 'latex' and formula then
            return pandoc.RawInline('markdown', '`' .. formula .. '`')
          end
          local formula = raw.text:match '\\file{(.*)}'
          if raw.format == 'latex' and formula then
            return pandoc.RawInline('markdown', '`' .. formula .. '`')
          end
          local formula = raw.text:match '\\Rfunction{(.*)}'
          if raw.format == 'latex' and formula then
            return pandoc.RawInline('markdown', '`' .. formula .. '`')
          end
        end
      }
    }
to translate
    The \R{} programming language
    \R\ is a programming language.
    The name of one programming language is simply \R.
    \Biocpkg{BiocStyle} is a \Bioconductor{} package.
    The \CRANpkg{knitr} is used to create markdown vignettes.
    Sometimes packages, like \Githubpkg{AnVILAz} are only found on Github.
    The \R{} package \Rpackage{foo} is not found in any common repository
    \software{samtools} is pretty important in Bioinformatics...
    \Robject{mtcars} is a \Rcode{data.frame}.
    \Rfunction{data.frame} is a function used to create a \Rcode{data.frame}.
    \Rfunction{data.frame()} is a function used to create a \Rcode{data.frame}.
    Sometimes inline \R{} code \Rcode{x <-
    1 + 1} can span two lines.
to get something that is mostly correct(?)
    The *R* programming language
    *R* is a programming language.
    The name of one programming language is simply *R*.
    `r Biocpkg(BiocStyle)` is a *Bioconductor* package.
    The `r CRANpkg(knitr)` is used to create markdown vignettes.
    Sometimes packages, like `r Githubpkg(AnVILAz)` are only found on
    Github.
    The *R* package `foo` is not found in any common repository
    `samtools` is pretty important in Bioinformatics\...
    `mtcars` is a `data.frame`.
    `data.frame` is a function used to create a `data.frame`.
    `data.frame()` is a function used to create a `data.frame`.
    Sometimes inline *R* code `x <-
    1 + 1` can span two lines.
As you note, probably there are much better ways of implementing the Lua code, which is highly repetitive now! Also, maybe we could start a Lua repository that might start to follow better practices (than an issue thread!) for Lua development...
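One way to cut the repetition might be a table of rules walked by a single function. This is a minimal sketch, not a tested replacement: the helper names (`literal`, `code`, `r_call`, `translate`) are made up here, and the patterns follow the filter above, with `\Rpackage` using `(.*)` like the other code-like macros. Package names are wrapped in double quotes on the assumption that `Biocpkg()`, `CRANpkg()` and `Githubpkg()` expect a character string, which would also address the unquoted `r Biocpkg(BiocStyle)` in the output above. `translate` is plain Lua, so it can be exercised without pandoc.

```lua
-- Builders: each returns a function that turns a captured argument into markdown.
local function literal(text)
  return function() return text end
end
local function code(arg)
  return '`' .. arg .. '`'
end
local function r_call(fun)
  -- Assumption: the BiocStyle helpers take a quoted package name.
  return function(arg) return '`r ' .. fun .. '("' .. arg .. '")`' end
end

-- Each rule is { Lua pattern, builder }; the first matching rule wins.
local rules = {
  { '\\R{}',                 literal('*R*') },
  { '\\R$',                  literal('*R*') },
  { '\\Bioconductor{}',      literal('*Bioconductor*') },
  { '\\CRANpkg{([^}]*)}',    r_call('CRANpkg') },
  { '\\Biocpkg{([^}]*)}',    r_call('Biocpkg') },
  { '\\Githubpkg{([^}]*)}',  r_call('Githubpkg') },
}
for _, macro in ipairs({ 'Rpackage', 'Robject', 'Rcode', 'software', 'file', 'Rfunction' }) do
  rules[#rules + 1] = { '\\' .. macro .. '{(.*)}', code }
end

-- Returns the markdown for a raw LaTeX snippet, or nil if no rule applies.
local function translate(text)
  for _, rule in ipairs(rules) do
    local captured = text:match(rule[1])
    if captured then
      return rule[2](captured)
    end
  end
  return nil
end

-- Only register the filter when running inside pandoc.
if pandoc then
  return {
    {
      RawInline = function (raw)
        if raw.format ~= 'latex' then return nil end
        local markdown = translate(raw.text)
        if markdown then
          return pandoc.RawInline('markdown', markdown)
        end
      end
    }
  }
end
```

Adding a macro then means adding one line to `rules`, and the same table could drive a test file.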
@mcarlsn @villafup @BerylKanali It might be that you've noticed things that we repeatedly have to edit manually to get them into the right format. It might be good to start documenting those here, so that we can make sure those cases are included. I agree with @mtmorgan that it would be nice to come up with a test .Rnw. Maybe @BerylKanali can help with this given some guidance?
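A test file of that kind could be paired with a small table-driven check, one entry per LaTeX input and expected Rmd output. The sketch below is plain Lua and only illustrates the shape: `translate` is a stand-in covering two macros (the real filter logic would be plugged in instead), `run_cases` is a made-up helper name, and the quoted package name in the expected output is an assumption about what the final Rmd should contain.

```lua
-- Stand-in translator for two macros only; replace with the real filter logic.
local function translate(text)
  local pkg = text:match('\\Biocpkg{([^}]*)}')
  if pkg then return '`r Biocpkg("' .. pkg .. '")`' end
  if text:match('\\R{}') then return '*R*' end
  return nil
end

-- One entry per LaTeX 'test' and the Rmd it should become.
local cases = {
  { input = '\\R{}',                 expected = '*R*' },
  { input = '\\Biocpkg{BiocStyle}',  expected = '`r Biocpkg("BiocStyle")`' },
  { input = '\\Rcode{data.frame}',   expected = '`data.frame`' }, -- not covered by the stand-in
}

-- Runs every case through fn and collects a message for each mismatch.
local function run_cases(fn, tests)
  local failures = {}
  for i, case in ipairs(tests) do
    local got = fn(case.input)
    if got ~= case.expected then
      failures[#failures + 1] = string.format(
        'case %d: %s -> %s (expected %s)',
        i, case.input, tostring(got), case.expected)
    end
  end
  return failures
end

local failures = run_cases(translate, cases)
for _, message in ipairs(failures) do
  print(message)
end
print(string.format('%d of %d cases failed', #failures, #cases))
```

With the stand-in translator the third case fails on purpose, which shows how a macro that still needs manual editing would surface.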
@jwokaty perhaps it makes sense to create a `lua` branch and add an `inst/lua` directory with progress so far? I've iterated a bit on @LiNk-NY's work, and things look pretty promising. Definitely @BerylKanali could help with the test Rnw file!
Following on https://github.com/Bioconductor/sweave2rmd/issues/34, this StackOverflow post shows how to write a Lua filter; a set of these might be developed for the BiocStyle macros as a kind of 'meta' resource for this project.
Such a filter would replace the Rnw macro `\R{}` with the markdown `_R_` and, if saved in a file `BiocStyle-Rnw-to-Rmd.lua`, would be used as a pandoc Lua filter.
The next macros to tackle are likely `\CRANpkg{<package name>}` and `\Biocpkg{<package name>}`, which translate to the markdown links `[<package name>](https://cran.r-project.org/package=<package name>)` and `[<package name>](https://bioconductor.org/packages/<package name>)`, followed by `\Rcode{<inline code>}`, translated to `` `<inline code>` ``. I think Sweave code chunks `<<...>>= ... @` could also be translated automatically.
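For the Sweave chunks, a line-based pass might be enough as a first step. The sketch below is plain Lua working on a list of lines rather than a pandoc filter, and `convert_chunks` is a hypothetical name. It assumes chunk headers look like `<<options>>=` on their own line and that a lone `@` closes the chunk. It copies the options through unchanged, so any option whose spelling differs between Sweave and knitr would still need manual attention.

```lua
-- Three backticks, built with string.rep so the literal does not appear in the source.
local FENCE = string.rep('`', 3)

-- Rewrites Sweave chunk delimiters in a list of lines to Rmd fenced chunks.
local function convert_chunks(lines)
  local out, in_chunk = {}, false
  for _, line in ipairs(lines) do
    local options = line:match('^<<(.*)>>=%s*$')
    if options and not in_chunk then
      -- Chunk header: <<label, echo=FALSE>>= becomes an opening {r ...} fence.
      in_chunk = true
      if options == '' then
        out[#out + 1] = FENCE .. '{r}'
      else
        out[#out + 1] = FENCE .. '{r ' .. options .. '}'
      end
    elseif in_chunk and line:match('^@%s*$') then
      -- A lone @ ends the chunk.
      in_chunk = false
      out[#out + 1] = FENCE
    else
      out[#out + 1] = line
    end
  end
  return out
end

local converted = convert_chunks({
  'Some text.',
  '<<plot, echo=FALSE>>=',
  'plot(1)',
  '@',
  'More text.',
})
print(table.concat(converted, '\n'))
```

Running this ahead of the macro filter would keep the two concerns separate, but doing the chunk conversion inside pandoc is equally plausible.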