jkuczm / MathematicaCellsToTeX

Convert Mathematica cells to TeX, retaining formatting.

Problem generating TeX for Mathematica Cells that contain internal newlines/returns #4

Closed jschrier closed 8 years ago

jschrier commented 8 years ago

[[ This is a great package—thanks for developing it; it is making a book project of mine much more tractable/better than it would have been ]]

I'm encountering a CellToTeX problem that occurs when I try to generate TeX for cells containing expressions in which a function call has internal newlines. Mathematica ignores these newlines, and I have included them to make the code more readable, but CellToTeX seems to throw an exception.

Let me show you an example:

psi1DPIB[n_, x_, L_] := Sqrt[2/L]*Sin[n*Pi*x/L];
g1 = Plot[
  {psi1DPIB[1, x, L], psi1DPIB[2, x, L]},
  {x, 0, L}, AxesLabel -> {"x", "\[Psi]"}
  ]

This (note the newlines between the arguments inside the Plot[] function) throws the following exception:

CellsToTeXException::invalid: Following elements of type Boxes are invalid: RowBox[{g1,=,RowBox[{Plot,[,
,RowBox[{RowBox[{<<3>>}],,,
,RowBox[{<<3>>}],,,RowBox[{<<3>>}]}],
,]}]}]. Exception occurred in toInputFormProcessor[{<<1>>}].

and fails to render the text block. On the other hand, after removing the internal newlines from inside the Plot[] function,

psi1DPIB[n_, x_, L_] := Sqrt[2/L]*Sin[n*Pi*x/L];
g1 = Plot[{psi1DPIB[1, x, L], psi1DPIB[2, x, L]}, {x, 0, L}, AxesLabel -> {"x", "\[Psi]"} ]

everything works fine.

Is this an easy fix? I am running this on Mathematica 10.3.0.0

jkuczm commented 8 years ago

Thanks for reporting this issue.

It's caused by the behavior of MakeExpression, which chokes when there's whitespace next to a square bracket that encloses more than one argument. I think it's a bug in MakeExpression, and I'll report it to WRI.

In the meantime, we can try to find a workaround for your specific case. As can be seen in the error message, this problem occurs in toInputFormProcessor. Could you give more details on how you're using it? Did you add it manually to the "Processor" option, or are you using "Style" -> "Code" (which by default uses this processor)?

If you're deliberately converting to InputForm, be aware that even the built-in conversion, done by Cell > Convert To > InputForm, removes non-semantic whitespace characters, so manual formatting disappears.

I'm thinking about a complete rewrite of toInputFormProcessor, so that it preserves manual formatting. It requires writing appropriate box conversions from scratch, so it will take some time.

jschrier commented 8 years ago

Alas, GitHub won't let me attach the Mathematica notebook directly.

To answer your question about usage: I just copy-pasted the "Example of Customization" example from your stackexchange post to generate it. (copy-pasted below for completeness). It seemed like this was the usage scenario I wanted (represent cells as LaTeX whenever possible, and use PDFs for the rest).

To answer the question about workarounds: Could it be possible to have this output problematic cells as PDF instead of throwing an exception on them? For my purposes, even outputting everything as a PDF would probably be fine...

nbObj = SelectedNotebook[]
SetDirectory[NotebookDirectory[nbObj]];
(*Add CellsToTeX`Configuration` to $ContextPath to get easy access to \
all "processors".*)

PrependTo[$ContextPath, "CellsToTeX`Configuration`"];
SetOptions[CellToTeX, "CurrentCellIndex" -> Automatic];
ExportString[
 NotebookGet[
   nbObj] /. {cell : Cell[_, "Input" | "Code", ___] :> 
    Cell[CellToTeX[cell, "Style" -> "Code"], "Final"], 
   cell : Cell[_, __] :> 
    Cell[CellToTeX[cell, 
      "Processor" -> 
       Composition[trackCellIndexProcessor, mmaCellGraphicsProcessor, 
        exportProcessor, cellLabelProcessor, 
        extractCellOptionsProcessor]], "Final"]}, "TeX", 
 "FullDocument" -> False, "ConversionRules" -> {"Final" -> Identity}]

jkuczm commented 8 years ago

If you don't need to convert input cells to InputForm and you can accept StandardForm, then use:

SetOptions[CellToTeX, "CurrentCellIndex" -> Automatic];
ExportString[
    NotebookGet[nbObj] /. {
        cell : Cell[_, "Input" | "Code", ___] :> 
            Cell[CellToTeX[cell, "Style" -> "Input"], "Final"], 
        cell : Cell[_, __] :> 
            Cell[CellToTeX[cell, "Processor" -> Composition[
                trackCellIndexProcessor, mmaCellGraphicsProcessor, exportProcessor,
                cellLabelProcessor, extractCellOptionsProcessor
            ]], "Final"]
        },
        "TeX", "FullDocument" -> False, "ConversionRules" -> {"Final" -> Identity}
]

Above I've just changed "Style" -> "Code" to "Style" -> "Input".

If you want to convert everything to PDF, simply remove the rule with the special handling of "Input" and "Code" cells:

SetOptions[CellToTeX, "CurrentCellIndex" -> Automatic];
ExportString[
    NotebookGet[nbObj] /. 
        cell : Cell[_, __] :> 
            Cell[CellToTeX[cell, "Processor" -> Composition[
                trackCellIndexProcessor, mmaCellGraphicsProcessor, exportProcessor,
                cellLabelProcessor, extractCellOptionsProcessor
            ]], "Final"]
        ,
        "TeX", "FullDocument" -> False, "ConversionRules" -> {"Final" -> Identity}
]

If you know exactly which cells are problematic you could add a cell tag to them, using Cell > Cell Tags > Add/Remove Cell Tags..., for example something like: convertToPDFTag, then explicitly exclude tagged cells from conversion to TeX, so that they will be converted to PDF:

SetOptions[CellToTeX, "CurrentCellIndex" -> Automatic];
ExportString[
    NotebookGet[nbObj] /. {
        cell : Cell[_, "Input" | "Code", Except[CellTags -> "convertToPDFTag"] ...] :> 
            Cell[CellToTeX[cell, "Style" -> "Code"], "Final"], 
        cell : Cell[_, __] :> 
            Cell[CellToTeX[cell, "Processor" -> Composition[
                trackCellIndexProcessor, mmaCellGraphicsProcessor, exportProcessor,
                cellLabelProcessor, extractCellOptionsProcessor
            ]], "Final"]
    },
    "TeX", "FullDocument" -> False, "ConversionRules" -> {"Final" -> Identity}
]

A general mechanism that catches the CellsToTeXException["Invalid", "Boxes"] exception and converts the cell to PDF can also be written, but it's a bit more complicated.
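
A minimal sketch of such a fallback follows. It assumes, as the exception name above suggests, that a failed conversion either propagates a Throw tagged with CellsToTeXException[...] or returns something other than a string. The names pdfProcessor and cellToTeXOrPDF are hypothetical and not part of the package. The processors are assumed to be reachable through $ContextPath, as in the earlier snippets.

```mathematica
(* Assumption: the package is already loaded; this line only exposes the processors. *)
PrependTo[$ContextPath, "CellsToTeX`Configuration`"];

(* Hypothetical name: the same PDF-producing processor chain used in the snippets above. *)
pdfProcessor = Composition[
    trackCellIndexProcessor, mmaCellGraphicsProcessor, exportProcessor,
    cellLabelProcessor, extractCellOptionsProcessor
];

(* Hypothetical helper: try the "Code" conversion first, fall back to PDF on failure. *)
cellToTeXOrPDF[cell_] :=
    Module[{result},
        result = Catch[
            CellToTeX[cell, "Style" -> "Code"],
            (* Assumption: exceptions are thrown with a CellsToTeXException[...] tag. *)
            _CellsToTeXException,
            $Failed &
        ];
        If[StringQ[result],
            result,
            (* Anything that is not a string is treated as a failed conversion. *)
            CellToTeX[cell, "Processor" -> pdfProcessor]
        ]
    ]
```

To use it, the first replacement rule in the snippets above would call cellToTeXOrPDF[cell] instead of CellToTeX[cell, "Style" -> "Code"]. Whether a failed first attempt advances "CurrentCellIndex" has not been checked here, so cell labels may need verifying.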

jkuczm commented 8 years ago

After digging into the Mathematica documentation, it turned out that this behavior of MakeExpression is documented. Non-semantic elements should be removed before passing boxes to MakeExpression, so this is a bug in the CellsToTeX package.
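
To illustrate the idea (only a sketch, not the code used in the package), line-break tokens can be deleted from a box expression before it is handed to MakeExpression. The sample boxes and the name stripLineBreakBoxes are made up for this example. Only line breaks are removed, because a space token inside a RowBox can stand for multiplication and so is not always non-semantic.

```mathematica
(* Hypothetical helper: delete strings that consist solely of line-break characters. *)
stripLineBreakBoxes[boxes_] :=
    DeleteCases[
        boxes,
        s_String /; StringMatchQ[s, ("\n" | "\[IndentingNewLine]") ..],
        Infinity
    ]

(* Made-up boxes resembling a manually formatted call f[a, b] with internal newlines. *)
boxes = RowBox[{"f", "[", "\n", RowBox[{"a", ",", "\n", "b"}], "\n", "]"}];

stripped = stripLineBreakBoxes[boxes];

(* The cleaned boxes contain only semantic tokens and can be passed on. *)
MakeExpression[stripped, StandardForm]
```

A real implementation would need more care, for example with indentation tokens and with boxes other than RowBox.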

jkuczm commented 8 years ago

@jschrier I've just released version 0.1.3, which should fix this issue. Manually formatted code should no longer cause exceptions, but conversion to InputForm works like the built-in conversion, i.e. it removes all non-semantic box elements, including manual line breaks.

jschrier commented 8 years ago

@jkuczm Thanks!