pbs-assess / csasdown

:book: An R package for creating CSAS reports in PDF or Word format with R Markdown and bookdown

3 Formatting Bugs/Issues/Questions #266

Closed SOLV-Code closed 2 weeks ago

SOLV-Code commented 4 weeks ago

Describe the bug

Running into two issues with table formatting and one issue with handling long figure captions.

Table Issue 1: blank lines/ line breaks

This affects all the tables with text, in the tech report and in both the English and French Res Docs. All these reports were started a while back, and the tables did not do this then. At some point earlier this year the line breaks started showing up, but I decided to leave them because they didn't hold up the other edits or the French translation. Now there's a tech report and a Res Doc almost ready to submit for publishing, and I need to fix it.

Not every piece of text in every table is affected, and I haven't been able to figure out a pattern. See the examples in the worked example.

Table Issue 2: In French Res Doc, "continued on next page" shows up in English

See second table in the worked example. This shows up in French in the worked example, but for the same table in the Res Doc (same csasdown install) it shows up in English. Any idea where to fix that without starting a clean Res Doc from scratch and moving everything over bit by bit?

Caption Issues

For a couple of figures the French version of the caption doesn't fit on the page. Is there a good way of handling this? See Fig 1 in the worked example.

In the original Res Doc, we just revised the plot layout and/or caption text until it fit, but now the French text is much longer, and we can't really change the plot or the text at this point.

To Reproduce

Full folder with all csasdown files for a reproducible example on Google Drive

cgrandin commented 3 weeks ago

I don't see any issues with the tables. Here is a screenshot of Tables 1 and 2 from my build using your code. I don't see any "extra line breaks".

Table 1 image

Table 2 image

I also don't see the English "continued on next page" on Table 2 when I make it break across pages, although it doesn't really matter, as CSAS won't allow that anyway and will make you turn it off. I just added the argument show_continued_text to the csas_table() parameter list in a recent commit:

image

Table 3 in French looks fine as well, unless I'm missing something:

image

I know this is obvious but have you installed the latest version of csasdown?

Also if you are not using the verbose output you should try this when building to see what's happening during the build:

csasdown::render(verbose = TRUE)

cgrandin commented 3 weeks ago

If I get a caption that becomes too large, I either change the caption to make it shorter or reduce the size of the plot. For tables I let them break across multiple pages. In some cases making it a landscape page can help: See https://github.com/pbs-assess/csasdown/wiki/How-to-make-landscape-tables

SOLV-Code commented 3 weeks ago

Thank you for your quick response! Good to know that this is not happening when you run it. I will try again to reinstall the latest version. For some reason it has been timing out...

Here's what that first table looks like when I run it: Table1

Hopefully it's just a matter of reinstalling csasdown, rather than the whole tinytex etc. behind-the-scenes stuff. I'll let you know how it goes.

Re: the captions, the issue is only in the French version, because the text is much longer. I don't think I can change the text at this point, since the English prepub version has already been around for a while. I'll try shrinking the figure.

SOLV-Code commented 3 weeks ago

I updated csasdown to v0.1.7 and then it told me to update tinytex, which I did:

Error: LaTeX failed to compile SkeenaNass_Sockeye_BM_ResDoc_FRENCH.tex. See https://yihui.org/tinytex/r/#debugging for debugging tips. See SkeenaNass_Sockeye_BM_ResDoc_FRENCH.log for more info. Execution halted


I found [this issue](https://github.com/pbs-assess/csasdown/issues/257) and that fixed the French knitting problem.

Any ideas what I could try next re: the table formatting? It's not something specific to the Res Doc files, because the same thing happened with the worked example.

SOLV-Code commented 3 weeks ago

One more thing: that issue with the tables happens exactly the same way on 2 different computers.

cgrandin commented 3 weeks ago

Can you please post the intermediary .tex file so I can do a diff against the one I have produced?

Also, make sure you have updated the knitr and kableExtra packages.

SOLV-Code commented 3 weeks ago

Looking at that file, the issue seems to be an extra \\ in many of the rows. The one line in Table 1 that doesn't have a problem (for BLDP) also doesn't have the extra \\.

Any idea why the extra \\ might show up in some places in some of the tables, but not others (like Table 4)?

\midrule BLDP & Babine Lake Development Project\\
\midrule\\ CU & Conservation Unit\\
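
One way to find every affected row without reading the whole file is to scan the intermediate .tex file for a rule command that is immediately followed by a row break. This is a minimal sketch in base R; `find_stray_breaks()` is a hypothetical helper (not part of csasdown or kableExtra), and the two sample lines are the ones quoted above.

```r
# Hypothetical helper: return the indices of lines in which a rule command
# (\midrule, or \cmidrule with optional trim and column-range arguments)
# is directly followed by a "\\" row break.
find_stray_breaks <- function(tex_lines) {
  pattern <- "\\\\c?midrule(\\([a-z]+\\))?(\\{[0-9]+-[0-9]+\\})?\\\\\\\\"
  which(grepl(pattern, tex_lines, perl = TRUE))
}

# The two rows quoted above, written as R string literals.
tex_lines <- c(
  "\\midrule BLDP & Babine Lake Development Project\\\\",
  "\\midrule\\\\ CU & Conservation Unit\\\\"
)

find_stray_breaks(tex_lines)  # only the second line is flagged
```

For a real document, `tex_lines` would come from `readLines()` on the .tex file in the build folder.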

The full table 1 code is:


\begingroup\fontsize{10}{12}\selectfont \begingroup\fontsize{10}{12}\selectfont  
\begin{longtable}[t]{>{\raggedright\arraybackslash}p{10em}>{\raggedright\arraybackslash}p{24em}} \caption{\label{tab:TableAcronyms}Formes longue et abrégée des termes techniques utilisés tout au long du document.}\\ \toprule Short Form & Expansion\\
\midrule\\ \midrule \endfirsthead \multicolumn{2}{l}{\textit{... Suite de la page précédente}} \\ \hline \caption*{}\\ \toprule Short Form & Expansion\\
\midrule\\ \midrule \endhead \hline \multicolumn{2}{l}{\textit{Suite à la page suivante ...}} \\ \endfoot \bottomrule \endlastfoot AAH & Aggregate Allowable Harvest\\
\midrule\\ ADF\&G & Alaska Department of Fish and Game\\
\midrule BLDP & Babine Lake Development Project\\
\midrule\\ CU & Conservation Unit\\
\midrule\\ DFO & Fisheries and Oceans Canada, formerly Department of Fisheries and Oceans\\
\midrule\\ FSC & Food, Social, and Ceremonial Fisheries\\
\midrule\\ GSI & Genetic stock identification\\
\midrule\\ HCR & Harvest Control Rule\\
\midrule\\ LHAZ & Life History and Adaptive Zone\\
\midrule\\ NBRR and NBSRR & Northern Boundary Run Reconstruction and Northern Boundary Sockeye Run Reconstruction model\\
\midrule\\ NCCSDB & North and Central Coast Salmon Data Base\\
\midrule\\ NUSEDS & Fisheries and Oceans Canada New Salmon Escapement Database\\
\midrule\\ PR & Photosynthetic Rate\\
\midrule\\ PST & Pacific Salmon Treaty\\
\midrule\\ SR & Spawner-Recruit\\
\midrule\\ SSIRR & Skeena Sockeye In-River Run Reconstruction model\\
\midrule\\ WSP & Wild Salmon Policy\\* \end{longtable}

\endgroup{} \endgroup{}

The full Table 4 code is:


\begingroup\fontsize{9}{11}\selectfont \begingroup\fontsize{9}{11}\selectfont  
\begin{longtable}[t]{lll} \caption{\label{tab:IPPCLikelihoodTab}Échelle de probabilité du GIEC d'après Mastrandrea et al.~(2011) et le code de couleur utilisé dans le présent document.}\\ \toprule Terme & Probabilité & Couleur\\ \midrule \endfirsthead \multicolumn{3}{l}{\textit{... Suite de la page précédente}} \\ \hline \caption*{}\\ \toprule Terme & Probabilité & Couleur\\ \midrule \endhead \hline \multicolumn{3}{l}{\textit{Suite à la page suivante ...}} \\ \endfoot \bottomrule \endlastfoot Pratiquement certain & 99-100 \% & Vert foncé\\
\midrule Très probable & 90-100 \% & Vert foncé\\
\midrule Probable & 66-100 \% & Vert pâle\\
\midrule Presque aussi probable qu/'improbable & 33-66 \% & Blanc\\
\midrule Peu probable & 0-33 \% & Rose clair\\
\midrule Très improbable & 0-10 \% & Rose foncé\\
\midrule Extrêmement improbable & 0-1 \% & Rose foncé\\* \end{longtable}

\endgroup{} \endgroup{}

cgrandin commented 3 weeks ago

The TeX code looks like it came from an old version of csasdown. It has hypertargets in it and those were done away with sometime this past year. Make sure you are cloning and installing from the main branch, the most recent commit (978c35e4381c8b44f7fdf8dca340947bc31dc420)

Here is a diff screen, showing the current (left) and yours (right). image

My guess with the extra slash in the tables is that you are missing a commit which fixed that and a bunch of other commits as well.

This made me realize we need to have git tags and releases for csasdown, as the only indicator of version number is in the description file and that is there for every commit since it was added; there is no one commit where that version is the official one.

SOLV-Code commented 3 weeks ago

Will do, but I just did a re-install yesterday. I'll try to actually clear out csasdown with remove.packages() first. Are there any legacy files (dll or similar?) that I should remove manually as well before doing a clean install? Or any other packages I should fully remove?

seananderson commented 3 weeks ago

Maybe check what version of pandoc is being used? That's what used to write the hypertarget code. There should be code in other issues around checking that.

SOLV-Code commented 3 weeks ago

I'll clear out pandoc and install the latest version. Currently I'm using 2.14.

cgrandin commented 3 weeks ago

Thanks @seananderson - Come to think of it, I think it was the Pandoc change that removed those hypertarget macros. I have 3.2 installed according to the following command (from within R). I think you need at least 2.8, but just install the newest one:

rmarkdown::pandoc_version()
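
When comparing the reported version against a minimum, note that Pandoc versions compare component by component, so 2.14 is newer than 2.8 even though it looks smaller as a decimal. A small base R sketch, where `meets_minimum()` is a hypothetical helper and the 2.8 threshold is only the rough guess mentioned above:

```r
# Hypothetical helper: compare two version strings component by component.
meets_minimum <- function(found, required) {
  numeric_version(found) >= numeric_version(required)
}

meets_minimum("2.14", "2.8")   # TRUE: the second component is 14 vs 8
meets_minimum("2.7.3", "2.8")  # FALSE
```

With rmarkdown installed, `rmarkdown::pandoc_available("2.8")` performs a similar check against the Pandoc that rmarkdown actually finds.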

SOLV-Code commented 3 weeks ago

I've now upgraded to pandoc 3.5 as a stand-alone download, and upgraded RStudio to the latest version (because it uses a built-in pandoc now, doesn't it?). Also updated knitr and kableExtra and csasdown. I reinstalled all these packages from within R, because in RStudio some of the dependencies had "permission denied" (e.g., curl, dplyr). I also manually deleted these "permission denied" package folders from C:/Users/username/AppData/Local/R/win-library/4.3. That does the trick when having to update STAN-related packages for samEst.

I'll try again to delete everything, including csasdown from the win-library, and re-install everything. Can you think of any other place where old versions of dependencies might lurk on a windows machine?

If that fails, I'll just switch over to dusting off the docker container approach. That saved the day last time I had a res doc almost ready to submit and suddenly it stopped knitting.

Thank you both for all the suggestions, and more generally for creating and maintaining this package. Sorting out this kind of detail is still way better than manually copy-pasting 60 figures and 40 tables into a word document and re-pasting every time something changes. Here it is an intellectual challenge to track down the issue, the other is just mind-numbing.

SOLV-Code commented 3 weeks ago

Thought maybe the issue is with style file or something that doesn't change when I update the packages, so created a new res doc from scratch after all the updates, and copied in one of the tables, and same thing happens. So weird...

SOLV-Code commented 3 weeks ago

Switched to the Docker container. I think I followed the steps exactly to get the latest, and was able to create and render the built-in Res Doc template.

However: With the actual res doc it starts rendering fine, but then crashes on the across() function from dplyr, as follows:

Quitting from lines 6528-6567 (SkeenaNass_Sockeye_BM_ResDoc_FRENCH.Rmd) 
Error in across(c(mean, sd, p10, p25, p50, p75, p90), ~format(round(.x,  : 
  could not find function "across"
In addition: Warning message:
Missing column names filled in: 'X1' [1] 

Even running library(tidyverse) before rendering doesn't help. When I check the Docker image, it was generated over 3 years ago. I guess the included version of dplyr precedes the switch to across().
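
One way to test that guess inside the container is to compare the installed dplyr version against 1.0.0, the release in which across() was introduced. A minimal sketch; `has_min_version()` is a hypothetical helper written for this example:

```r
# Hypothetical helper: TRUE if `pkg` is installed and at least `min_version`.
has_min_version <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= min_version
}

# FALSE here would mean dplyr is missing or predates across().
has_min_version("dplyr", "1.0.0")
```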

Below are some screenshots showing the setup steps:

Screen Shot 10-26-24 at 02 38 PM

Screen Shot 10-26-24 at 02 38 PM 001

Screen Shot 10-26-24 at 02 39 PM

SOLV-Code commented 3 weeks ago

I tried one more thing: installing everything from scratch on a computer that didn't have R, RStudio, pandoc or tinytex before. The default Res Doc renders fine, but when I add in the testing table, it again introduces the extra \\ and the extra line breaks. The zip file for that second test is on Google Drive.

So it is not some kind of legacy dependency lurking somewhere, and it is not due to the way the table code is set up, because it rendered correctly when you ran it with the same files.

Could it be something about the operating system, or some encoding settings when it reads the csv, or something like that? I vaguely remember some trouble re: "UTF-8 BOM" a while back.

Here's the table code in the Rmd file:


(ref:TableAcronyms) Formes longue et abrégée des termes techniques utilisés tout au long du document. 

```{r TableAcronyms, echo = FALSE, results = "asis"}

table.in <- read_csv("data/Acronyms.csv")# %>% select

table.in[,1:2]  %>%     
   #mutate_all(function(x){x = as.character(x)}) %>%
   #mutate_all(function(x){gsub("-", "", x)}) %>% 
   #mutate_all(function(x){gsub("&", "\\\\&", x)}) %>% 
   mutate_all(function(x){gsub("%", "\\\\%", x)}) %>%
     #mutate_all(function(x){gsub("@", "\\\\@", x)}) %>%
   mutate_all(function(x){gsub("\\\\n","\n", x)}) %>%
   csas_table(format = "latex", escape = TRUE, font_size = 10,align = c("l","l","l"),
                  caption = "(ref:TableAcronyms)")  %>%
    kableExtra::row_spec(1:dim(table.in)[1]-1, hline_after = TRUE) %>%
     kableExtra::column_spec(1, width = "10em") %>%
     kableExtra::column_spec(2, width = "24em") #%>%
   #  kableExtra::column_spec(3, width = "12em")
   #kableExtra::row_spec(c(1:3,5), extra_latex_after = "\\cmidrule(l){2-3}") 
```

Here's the resulting part of the tex file from _book


\begingroup\fontsize{10}{12}\selectfont \begingroup\fontsize{10}{12}\selectfont  
\begin{longtable}[t]{>{\raggedright\arraybackslash}p{10em}>{\raggedright\arraybackslash}p{24em}} \caption{\label{tab:TableAcronyms}Formes longue et abrégée des termes techniques utilisés tout au long du document.}\\ \toprule Short Form & Expansion\\
\midrule\\ \midrule \endfirsthead \multicolumn{2}{l}{\textit{... Continued from previous page}} \\ \hline \caption*{}\\ \toprule Short Form & Expansion\\
\midrule\\ \midrule \endhead \hline \multicolumn{2}{l}{\textit{Continued on next page ...}} \\ \endfoot \bottomrule \endlastfoot AAH & Aggregate Allowable Harvest\\
\midrule\\ ADF\&G & Alaska Department of Fish and Game\\
\midrule BLDP & Babine Lake Development Project\\
\midrule\\ CU & Conservation Unit\\
\midrule\\ DFO & Fisheries and Oceans Canada, formerly Department of Fisheries and Oceans\\
\midrule\\ FSC & Food, Social, and Ceremonial Fisheries\\
\midrule\\ GSI & Genetic stock identification\\
\midrule\\ HCR & Harvest Control Rule\\
\midrule\\ LHAZ & Life History and Adaptive Zone\\
\midrule\\ NBRR and NBSRR & Northern Boundary Run Reconstruction and Northern Boundary Sockeye Run Reconstruction model\\
\midrule\\ NCCSDB & North and Central Coast Salmon Data Base\\
\midrule\\ NUSEDS & Fisheries and Oceans Canada New Salmon Escapement Database\\
\midrule\\ PR & Photosynthetic Rate\\
\midrule\\ PST & Pacific Salmon Treaty\\
\midrule\\ SR & Spawner-Recruit\\
\midrule\\ SSIRR & Skeena Sockeye In-River Run Reconstruction model\\
\midrule\\ WSP & Wild Salmon Policy\\* \end{longtable}

\endgroup{} \endgroup{}

cgrandin commented 3 weeks ago

Did you check rmarkdown::pandoc_version() after doing all of the new installations just to make sure?

If you think read_csv() and BOM is causing an issue you should check the data frame that you get from that call. You can view it in Rstudio as a nice table by using the kbl() command:

j <- read_csv("data/Acronyms.csv")
j |> kableExtra::kbl()

Mine looks like this:

image

Which is what I would expect.
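
A byte order mark can also be checked for directly, without going through the data frame, by looking at the first three bytes of the file. This is a minimal base R sketch; `has_utf8_bom()` is a hypothetical helper, and the temporary files stand in for data/Acronyms.csv:

```r
# Hypothetical helper: TRUE if the file starts with the UTF-8 BOM (EF BB BF).
has_utf8_bom <- function(path) {
  first_bytes <- readBin(path, what = "raw", n = 3L)
  identical(first_bytes, as.raw(c(0xEF, 0xBB, 0xBF)))
}

# Two small demonstration files, one written with a BOM and one without.
header <- charToRaw("Short Form,Expansion\n")
with_bom <- tempfile(fileext = ".csv")
no_bom <- tempfile(fileext = ".csv")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), header), with_bom)
writeBin(header, no_bom)

has_utf8_bom(with_bom)  # TRUE
has_utf8_bom(no_bom)    # FALSE
```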

If you add keep_md to the YAML like this:

output:
 csasdown::resdoc_pdf:
   keep_md: true
   french: true

The compiled MD file will appear in the _book folder. You can double-check the table code in there. It is converted to TeX by Pandoc so if it is ok there, and not in the TeX file, it must be Pandoc causing the issue.

Make sure to double check pandoc version in R with rmarkdown::pandoc_version()

SOLV-Code commented 3 weeks ago

Thank you for these additional suggestions. I will check the compiled MD.

Re: pandoc version: Turns out that updating everything on 2 computers, and installing everything from scratch on a third, has left me with 3 versions of pandoc: 3.5 on one, 3.1.11 on another, and 3.2 on the third. I will try to get them all up to the latest. So weird.

cgrandin commented 3 weeks ago

I just added this to my previous comment, but I'm adding it again so you don't miss it:

Make sure to double check pandoc version inside R with rmarkdown::pandoc_version()

SOLV-Code commented 3 weeks ago

Turns out that it can be a bit tricky to make RStudio use the latest version of pandoc, even if you have it installed. Do you have a good way for handling that?

I'm trying to figure out from this thread
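
As a starting point for that, it can help to print which Pandoc the R session actually resolves. The sketch below is an assumption-laden illustration rather than a recommended fix: `report_pandoc()` is a hypothetical helper, and it relies on `rmarkdown::find_pandoc()` and on the `RSTUDIO_PANDOC` environment variable, which RStudio normally sets for its bundled Pandoc.

```r
# Hypothetical helper: collect what the current R session knows about Pandoc.
report_pandoc <- function() {
  # Directory advertised by RStudio; an empty string outside RStudio.
  info <- list(rstudio_pandoc = Sys.getenv("RSTUDIO_PANDOC"))
  if (requireNamespace("rmarkdown", quietly = TRUE)) {
    # Search again rather than reusing a cached result from earlier in the session.
    found <- rmarkdown::find_pandoc(cache = FALSE)
    info$dir <- found$dir
    info$version <- as.character(found$version)
  }
  info
}

str(report_pandoc())
```

If the reported directory is not the stand-alone installation, pointing `RSTUDIO_PANDOC` at that installation before rendering is one thing to try.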

SOLV-Code commented 3 weeks ago

Update

pandoc_version() now shows 3.5 in Rstudio on all 3 of the computers I'm testing this on.

The keep_md:true option produces an md file that has the same issue with the extra \\ in some rows for some of the tables. Table code from md file included below. So I guess it's not pandoc after all?

Using @seananderson approach for debugging tex files (as per beginning of this wiki entry), I can fix the tables in the tex file and then render with tinytex::pdflatex()

For now, I'm just compiling a list of variations I can find and replace in the tex file to get these 2 documents out the door for the next round of sign-off (1 tech rep, 1 res doc in Eng and Fr). So far I have:

\cmidrule\\ -> \cmidrule
\cmidrule(l){2-5}\\ -> \cmidrule(l){2-5}
\cmidrule(l){2-6}\\ -> \cmidrule(l){2-6}

For the longer term, this is not a good solution. Any other suggestions for what I could check or update to fix these extra linebreaks that show up on 3 of my computers but not when you run the exact same inputs?
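
The interim find-and-replace list above could be scripted so it does not have to be repeated by hand after every build. This is a minimal sketch in base R, assuming literal (non-regex) matching is wanted; `strip_rule_breaks()` is a hypothetical helper, and the patterns are copied from the list above. Other variants (for example `\midrule\\`, as seen in the acronym table) could be added the same way.

```r
# Hypothetical helper: apply the literal replacements listed above to the
# lines of a .tex file. fixed = TRUE treats each pattern as a plain string.
strip_rule_breaks <- function(tex_lines) {
  replacements <- c(
    "\\cmidrule(l){2-5}\\\\" = "\\cmidrule(l){2-5}",
    "\\cmidrule(l){2-6}\\\\" = "\\cmidrule(l){2-6}",
    "\\cmidrule\\\\"         = "\\cmidrule"
  )
  for (from in names(replacements)) {
    tex_lines <- gsub(from, replacements[[from]], tex_lines, fixed = TRUE)
  }
  tex_lines
}

# Made-up sample rows for illustration only.
sample_lines <- c(
  "\\cmidrule(l){2-5}\\\\ A & B\\\\",
  "\\cmidrule\\\\ C & D\\\\"
)
writeLines(strip_rule_breaks(sample_lines))
```

In practice the lines would be read with `readLines()` from the .tex file, written back with `writeLines()`, and the file then compiled with tinytex::pdflatex() as described above.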

Table Code from md file


(ref:TableAcronyms) Formes longue et abrégée des termes techniques utilisés tout au long du document. 

\begingroup\fontsize{10}{12}\selectfont \begingroup\fontsize{10}{12}\selectfont  \begin{longtable}[t]{>{\raggedright\arraybackslash}p{10em}>{\raggedright\arraybackslash}p{24em}} \caption{(\#tab:TableAcronyms)(ref:TableAcronyms)}\\ \toprule Short Form & Expansion\\
\midrule\\ \midrule \endfirsthead \multicolumn{2}{l}{\textit{... Continued from previous page}} \\ \hline \caption*{}\\ \toprule Short Form & Expansion\\
\midrule\\ \midrule \endhead \hline \multicolumn{2}{l}{\textit{Continued on next page ...}} \\ \endfoot \bottomrule \endlastfoot AAH & Aggregate Allowable Harvest\\
\midrule\\ ADF\&G & Alaska Department of Fish and Game\\
\midrule BLDP & Babine Lake Development Project\\
\midrule\\ CU & Conservation Unit\\
\midrule\\ DFO & Fisheries and Oceans Canada, formerly Department of Fisheries and Oceans\\
\midrule\\ FSC & Food, Social, and Ceremonial Fisheries\\
\midrule\\ GSI & Genetic stock identification\\
\midrule\\ HCR & Harvest Control Rule\\
\midrule\\ LHAZ & Life History and Adaptive Zone\\
\midrule\\ NBRR and NBSRR & Northern Boundary Run Reconstruction and Northern Boundary Sockeye Run Reconstruction model\\
\midrule\\ NCCSDB & North and Central Coast Salmon Data Base\\
\midrule\\ NUSEDS & Fisheries and Oceans Canada New Salmon Escapement Database\\
\midrule\\ PR & Photosynthetic Rate\\
\midrule\\ PST & Pacific Salmon Treaty\\
\midrule\\ SR & Spawner-Recruit\\
\midrule\\ SSIRR & Skeena Sockeye In-River Run Reconstruction model\\
\midrule\\ WSP & Wild Salmon Policy\\* \end{longtable} \endgroup{} \endgroup{}

cgrandin commented 3 weeks ago

Maybe try using my fork of kableExtra? I changed some things that broke latex tables but they wouldn't allow it to be pulled into the master. I'm running out of possibilities.

remotes::install_github("cgrandin/kableExtra")

cgrandin commented 3 weeks ago

Also, this table will probably work if you take all the horizontal lines out, as the \midrule\\ is the problem; i.e., comment out the row_spec line, kableExtra::row_spec(1:dim(table.in)[1]-1, hline_after = TRUE):

table.in <- read_csv("data/Acronyms.csv")# %>% select

table.in[,1:2]  %>%
  #mutate_all(function(x){x = as.character(x)}) %>%
  #mutate_all(function(x){gsub("-", "", x)}) %>%
  #mutate_all(function(x){gsub("&", "\\\\&", x)}) %>%
  mutate_all(function(x){gsub("%", "\\\\%", x)}) %>%
  #mutate_all(function(x){gsub("@", "\\\\@", x)}) %>%
  mutate_all(function(x){gsub("\\\\n","\n", x)}) %>%
  csas_table(format = "latex", escape = TRUE, font_size = 10,align = c("l","l","l"),
             caption = "(ref:TableAcronyms)")  %>%
  #kableExtra::row_spec(1:dim(table.in)[1]-1, hline_after = TRUE) %>%
  kableExtra::column_spec(1, width = "10em") %>%
  kableExtra::column_spec(2, width = "24em") #%>%

cgrandin commented 3 weeks ago

I added a patch to fix the midrule problem, so just pull csasdown and try again. I don't like these kinds of patches because we have no idea why this happened, we are just patching it after-the-fact.

I cannot test it on my machine because I don't have the issue so please let me know if it worked.

SOLV-Code commented 3 weeks ago

Hurray! All the tables come out fine! Thank you so much for your on-going support and trouble-shooting!

When I installed only the csasdown update it crashed during the knit due to some misplaced \noalign problem, but then I did:

remove.packages("kableExtra")
remotes::install_github("cgrandin/kableExtra")

and everything worked seamlessly.

Maybe add the kableExtra custom download to the install instructions in the readme?

The other thing that seems to be necessary on one of the computers I'm trying this out on is to do the package update in R rather than RStudio. I don't know why, but in RStudio some dependencies are sometimes locked, and then the csasdown install ends with a non-zero exit status. But if I switch to R, remove those packages with remove.packages(), and then run remotes::install_github("pbs-assess/csasdown", dependencies = TRUE), it installs fine. Just putting this here in case somebody searches through the issues for non-zero exit status....

The locked packages were : curl, xfun, yaml, and digest

quang-huynh commented 3 weeks ago

@cgrandin this patch seems to have broken my code. Line 304 should read \midrule\noalign{} but renders as \midrulenoalign{}

Reproducible example also attached in zip file.

image

simple.zip

SOLV-Code commented 3 weeks ago

Did you install the custom modified version of kableExtra?

Here's what made it work for me:

remove.packages("kableExtra")
remotes::install_github("cgrandin/kableExtra")

quang-huynh commented 3 weeks ago

Hmm, yes, the issue still persists for me. Fortunately, I can still render my original document by reverting csasdown to the version without the patch.

cgrandin commented 3 weeks ago

@quang-huynh I did a small fix which ensures that the midrule double-backslash replacement only occurs when followed by whitespace. This fixes your issue.

@SOLV-Code Please make sure yours still works!

Thanks
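
The patch itself is not shown in this thread. As an illustration of the idea only, a replacement that removes the double backslash after \midrule solely when whitespace follows could look like the sketch below. `fix_midrule()` is a hypothetical helper and the regular expression is an assumption, not the code committed to csasdown.

```r
# Hypothetical helper: drop the "\\" that directly follows \midrule, but only
# when the next character is whitespace. Lookarounds keep \midrule and the
# whitespace themselves out of the match.
fix_midrule <- function(tex_lines) {
  gsub("(?<=\\\\midrule)\\\\\\\\(?=\\s)", "", tex_lines, perl = TRUE)
}

tex_lines <- c(
  "\\midrule\\\\ CU & Conservation Unit\\\\",  # stray break, gets removed
  "\\midrule\\noalign{}"                       # no match, left untouched
)
writeLines(fix_midrule(tex_lines))
```

The row-ending \\ after "Conservation Unit" is kept because it is not preceded by \midrule.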

SOLV-Code commented 3 weeks ago

Updated csasdown and installed the custom kableExtra, and everything worked on the first try. Thank you!