joachim-gassen / sposm

An Open Science Course on Statistical Programming
MIT License

Assignment #2, due Nov 18, 9am: Function to color cells in latex tables #10

Open joachim-gassen opened 4 years ago

joachim-gassen commented 4 years ago

Your next individual assignment is a typical programming task. You have to develop a function that will take

and returns a plain text latex table with the content of the indicated cells colored with the desired colors.

Please do not use any dedicated packages that process latex code. Instead, implement your own algorithm. As always, please provide in-code links to all resources that you used when developing your code.

You do not have to implement all extensions to the latex tabular environment (e.g., the \multicolumn and \multirow commands). You decide what you want to implement and what not.

Try to develop the code so that it handles its input gracefully: It should return meaningful errors to the user and not return unusable output.

Some hints:

Deadline: Nov 18, 2019, 9am

As always: Feel free to discuss your progress in this thread.

NDelventhal commented 4 years ago

Dear Joachim,

it might be very obvious, but I have to ask. Can the input “latex table (in plain text)” be interpreted as plain text interspersed with some LaTeX commands? So could a starting point for this task be the following?

simple_table = r"""\begin{tabular}{ l c r } 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{tabular}"""

Best regards

joachim-gassen commented 4 years ago

it might be very obvious, but I have to ask. Can the input “latex table (in plain text)” be interpreted as plain text interspersed with some LaTeX commands? So could a starting point for this task be the following?

simple_table = r"""\begin{tabular}{ l c r } 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{tabular}"""

Exactly! Plain Text only means "no PDF". Thanks for making this clearer.
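If the table is written as a Python literal, as in the snippet above, a raw string is probably the safer choice: in an ordinary string literal, \b is read as a backspace character and \\ collapses to a single backslash. A minimal sketch of the difference (the variable names are illustrative only):

```python
# Ordinary literal: "\b" is read as a backspace, "\\" as one backslash.
plain = "\begin{tabular}{ l c r } 1 & 2 & 3 \\"

# Raw literal: every backslash is kept exactly as typed.
raw = r"\begin{tabular}{ l c r } 1 & 2 & 3 \\"

print(plain.startswith("\x08egin"))  # True: the leading backslash-b is gone
print(raw.startswith("\\begin"))     # True: the LaTeX command survives
print(raw.endswith("\\\\"))          # True: two backslashes end the row
```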

mpff commented 4 years ago

This might also be an obvious question, but should we color the cells by inserting the \cellcolor{} command (available through the xcolor package when it is loaded with the table option) into the specified cells?

I.e. let's say I would like to color cell (1,1) of the simple_table proposed above in red. At the moment my idea is to first extract the content of the cells into a nested array:

[['1', '2', '3'], ['4', '5', '6'], ['7', '8', '9']]

And then add to the content of cell (1,1) like so:

[['\cellcolor{red}1', '2', '3'], ['4', '5', '6'], ['7', '8', '9']]

Would that be a correct way of doing this? Can we assume that the user has the xcolor package imported in LaTeX?

joachim-gassen commented 4 years ago

This sounds very reasonable. And, yes, you can assume that the user has the xcolor package imported.
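The idea discussed above could be sketched in Python roughly as follows. This is only an illustration under simplifying assumptions: the function names parse_cells, color_cell and render are made up for this example, the column specification must not contain nested braces (so p{3cm} would not work), and no cell may contain an escaped \& or commands such as \multicolumn.

```python
import re

def parse_cells(latex):
    """Split the body of a simple tabular into a nested list of cell strings."""
    # The body is everything between \begin{tabular}{...} and \end{tabular}.
    match = re.search(
        r"\\begin\{tabular\}\{[^}]*\}(.*?)\\end\{tabular\}", latex, re.DOTALL
    )
    if match is None:
        raise ValueError("no tabular environment found")
    # Rows end with a double backslash; cells are separated by "&".
    rows = [r.strip() for r in match.group(1).split(r"\\")]
    return [[c.strip() for c in r.split("&")] for r in rows if r]

def color_cell(cells, row, col, color):
    """Prefix one cell (1-based row and column) with \\cellcolor{color}."""
    cells[row - 1][col - 1] = rf"\cellcolor{{{color}}}" + cells[row - 1][col - 1]
    return cells

def render(cells, colspec):
    """Turn the nested list back into plain-text LaTeX."""
    rows = [" & ".join(row) + r" \\" for row in cells]
    return "\\begin{tabular}{" + colspec + "}\n" + "\n".join(rows) + "\n\\end{tabular}"

simple_table = r"""\begin{tabular}{ l c r } 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{tabular}"""
cells = color_cell(parse_cells(simple_table), 1, 1, "red")
print(render(cells, "l c r"))
```

Working on a nested list keeps the coloring step trivial; the price is that anything the splitter does not understand is silently mangled, which is why input checks matter.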

eulersim commented 4 years ago

Hey everyone, working on the task in R, I have a hard time figuring out how to make R accept certain latex expressions. For example, R is not willing to accept:

tab_end <- "\end{tabular}" 
Error: '\e' is an unrecognized escape in character string starting ""\e" 

I understand that the backslash by itself has a special meaning in a string. But how can I make R accept it? I tried different things, but nothing worked.

tab_end <- "\ end{tabular}" 
tab_end
## [1] " end{tabular}"
tab_end <- '"end{tabular}'
tab_end
## [1] "\"end{tabular}" 

Has anyone had the same problem, or does anyone have a hint or solution for me?

fengzhi22 commented 4 years ago

Hi Simone, A quick solution to your question is this:

> tab_end <- "\\end{tabular}"
> cat(tab_end)
\end{tabular}

(Reference: https://stackoverflow.com/questions/27721008/how-do-i-deal-with-special-characters-like-in-my-regex)

But actually, to generate a latex table in plain text in R, you do not need to hand-type latex syntax such as \end{tabular}. In my case, I used the package xtable to translate R matrices or data frames to latex tables. You may want to take a look at the following link: https://cran.r-project.org/web/packages/xtable/vignettes/xtableGallery.pdf

Best, Fengzhi

eulersim commented 4 years ago

Hi Fengzhi, Thank you so much for your quick answer. I think it will help me!

I was also thinking about using the xtable package to translate my matrix into a latex table. But based on the task description ("Please do not use any dedicated packages that process latex code. Instead, implement your own algorithm") I thought we were not allowed to use this package. But it might be that I misunderstood it. Best, Simone

fengzhi22 commented 4 years ago

Hmm... you may have a point, Simone. @joachim-gassen can we use the xtable package, or must we create the latex table by hand-writing the latex code?

NDelventhal commented 4 years ago

I would assume that, since the input to the requested function has to be a latex table, one is free to choose how to produce that table. So if the package is used solely to produce the input and not for the coloring steps, then, as I understand the task description, using it should be fine. That said, I have not used a package in my solution; instead, I simply copied example latex tables from the URLs Joachim provided under the hints.

joachim-gassen commented 4 years ago

Good morning! What I want you to do is to write a function that reads latex code containing a table, modifies it by coloring selected cells, and returns the modified latex code. This function should not use any dedicated packages like xtable. How you create your input latex code is entirely up to you, so you can certainly use xtable to create the input to the function. Does this help?

fengzhi22 commented 4 years ago

Thank you @NDelventhal and @joachim-gassen for the clarification! Now it's clear to me.

joachim-gassen commented 4 years ago

Thank you to those of you who took a shot at the task! I will come back with feedback hopefully towards the end of the week. Also: I will be posting the next assignment soon...

joachim-gassen commented 4 years ago

I just uploaded my "solution" to the code directory. It consists of the file color_latex_table.R and some test code including test latex tables in the test directory.

My general feedback about your solutions: Everybody tried hard and some did really well! You will have noticed that this was a tough assignment. In principle, the tasks look easy:

The hard part is to parse the latex code correctly. As latex is a relatively flexible language, it is almost impossible to parse latex syntax correctly without implementing a full-blown latex parser. So, as indicated in the assignment, you needed to decide which level of flexibility you want to implement. A key issue is to handle parsing errors by stopping execution and informing the user that something went wrong. Some of you did that reasonably well.
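To make the point about stopping on parsing errors concrete, here is a hedged Python sketch of the kind of checks one might run before touching any cell. It is not taken from color_latex_table.R; the names TableParseError and check_table are hypothetical, and the sketch deliberately refuses anything it does not understand (including column specifications with nested braces) instead of guessing.

```python
import re

class TableParseError(ValueError):
    """Hypothetical error type for input this sketch refuses to handle."""

UNSUPPORTED = (r"\multicolumn", r"\multirow")

def check_table(latex, row, col):
    """Validate a tabular and a 1-based cell index; return (n_rows, n_cols)."""
    found = re.findall(
        r"\\begin\{tabular\}\{[^}]*\}(.*?)\\end\{tabular\}", latex, re.DOTALL
    )
    if len(found) != 1:
        raise TableParseError(
            f"expected exactly one tabular environment, found {len(found)}"
        )
    for command in UNSUPPORTED:
        if command in latex:
            raise TableParseError(f"{command} is not supported by this sketch")
    # Rows end with a double backslash; ignore empty leftovers.
    rows = [r for r in (part.strip() for part in found[0].split(r"\\")) if r]
    if not rows:
        raise TableParseError("the tabular environment contains no rows")
    widths = {r.count("&") + 1 for r in rows}
    if len(widths) != 1:
        raise TableParseError(
            f"rows have differing numbers of cells: {sorted(widths)}"
        )
    n_rows, n_cols = len(rows), widths.pop()
    if not (1 <= row <= n_rows and 1 <= col <= n_cols):
        raise TableParseError(
            f"cell ({row},{col}) is outside the {n_rows}x{n_cols} table"
        )
    return n_rows, n_cols
```

Failing early with a specific message is arguably more useful to the user than returning latex code that no longer compiles.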

If you are interested in comparing your solution with mine, I suggest that you start by looking at the test code and running your code on the latex table examples. There are two complex examples using multirow and multicolumn. I tried to implement this in a somewhat robust way, but this is beyond the scope of the assignment. The nasty_table.tex also contains some other things designed to break typical implementations. You can test how your code performs on the simpler examples. In principle, you should get simple.tex and real_table_simple.tex to work.

My code is far from being perfect. If you want, you can challenge it by submitting a pull request including a test (most likely including a latex table) that breaks my code. I do not promise that I will fix my code but it will help us to understand how code is being improved step by step by unit testing.
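A test of the kind described above might look like the following Python sketch. It is not one of the tests in the repository; split_naive and split_unescaped are hypothetical helpers, and the example only illustrates one typical trap: an escaped ampersand inside a cell.

```python
import re

def split_naive(row):
    """Split a table row on every ampersand, escaped or not."""
    return [c.strip() for c in row.split("&")]

def split_unescaped(row):
    """Split only on ampersands that are not preceded by a backslash."""
    return [c.strip() for c in re.split(r"(?<!\\)&", row)]

row = r"Research \& Development & 42 & 7"

# The naive splitter sees four cells because it cuts the escaped \& in two.
assert len(split_naive(row)) == 4

# The lookbehind version keeps the escaped ampersand inside its cell.
assert split_unescaped(row) == [r"Research \& Development", "42", "7"]
```

Each such failing case, once added to the test suite, documents a limitation and guards against regressions after it is fixed.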

I do not plan to provide individual feedback to each of you on your own code. Instead, I encourage you to test your code on my test cases and to post to this issue if you want to discuss a certain aspect of your or my code. Don't be shy!