rstudio / pagedown

Paginate the HTML Output of R Markdown with CSS for Print
https://pagedown.rbind.io

Overflowing content of table cells gets placed in the wrong column on the next page #299

Open ulyngs opened 2 years ago

ulyngs commented 2 years ago

I'm running into this problem in my R Markdown template for making academic CVs with pagedown, and I can't figure out what's causing it or how to solve it. Here's a minimal reproducible example:

Create an R Markdown file with this content:

---
output: pagedown::html_paged
---

```{r, echo=FALSE, message=FALSE}
library(dplyr)

tibble(
  id = 1:100,
  content = "here's a line<br>and here's another line"
) |> 
  knitr::kable()
```

When a page break divides up the content in second-column cells, then the remaining cell content is erroneously placed in the first column instead of in the second column:

<img width="648" alt="image" src="https://user-images.githubusercontent.com/23137032/193276551-7b4a314d-b2e9-4e3c-b5b9-cb6d1a46ddb7.png">

Any idea how to fix this?

ulyngs commented 2 years ago

Hmmm it seems it's actually not about the break tags, I get the same problem with regular content that overflows between pages:

---
output: pagedown::html_paged
---

```{r, echo=FALSE, message=FALSE}
library(dplyr)

tibble(
  id = 1:100,
  content = "here's a line and here's another line and another one and some more amazing content and yet more and more and more"
) |> 
  knitr::kable()
```

xvazquezc commented 1 year ago

Same issue here. Any word from the devs?

courtney-bryce-hilton commented 1 year ago

+1, I am having the same issue, and I can confirm that it is not specific to break tags. At least in my case, it happens whenever a row has to be split across pages. I can reproduce the above example with pagedown 0.19 (and have also tried older versions to the same effect).

courtney-bryce-hilton commented 1 year ago

For anybody needing a temporary hacky solution, I got it working for me by manually creating additional tables to ensure they never cross a page.
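
A minimal sketch of that idea, not the commenter's actual code: the helper names `split_rows()` and `print_tables()` are hypothetical, the value of 12 rows per table is a guess that has to be tuned by hand until each table fits on one page, and it assumes pagedown's `\newpage` command for forcing a page break between tables.

```r
# Hypothetical helper: split a data frame into pieces of at most
# `rows_per_table` rows each.
split_rows <- function(df, rows_per_table) {
  groups <- ceiling(seq_len(nrow(df)) / rows_per_table)
  unname(split(df, groups))
}

# Hypothetical helper: print every piece as its own table, with a forced
# page break between tables. Call it in a chunk with results='asis'.
print_tables <- function(pieces) {
  for (i in seq_along(pieces)) {
    cat("\n")
    print(knitr::kable(pieces[[i]], row.names = FALSE))
    # Assumption: pagedown turns \newpage into a page break.
    if (i < length(pieces)) cat("\n\n\\newpage\n\n")
  }
  invisible(pieces)
}

df <- data.frame(
  id = 1:100,
  content = "here's a line<br>and here's another line"
)

# 12 rows per table is a guess; adjust it until no table crosses a page.
pieces <- split_rows(df, rows_per_table = 12)

# In the R Markdown document (chunk option results='asis'):
# print_tables(pieces)
```

The drawback is the same as with any manual approach: the row count per table depends on the cell content and page size, so it needs to be rechecked whenever the data changes.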

ulyngs commented 1 year ago

The workaround I use is to manually add a page break at the end of the content in the last cell that fits on a page. I do this by linking a style sheet with a style like this:

styles.css

br.pageBreak { page-break-after: always; }

Then I insert a page break by adding <br class="pageBreak"> at the end of the content in the cell that's the last one to fit on the page.

In pagedownCV I added a convenience function to insert this, but it still has to be applied manually. I don't know of a way to automatically detect whether a table will be split across pages.
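
As a rough illustration of what such a helper could look like (this is not the pagedownCV function; the name `add_page_break()` and the row index 12 are made up for the example), assuming the `br.pageBreak` style above is linked in the document:

```r
# Hypothetical helper: append a page-breaking <br> to a piece of cell content.
# Relies on the CSS rule `br.pageBreak { page-break-after: always; }`.
add_page_break <- function(x) {
  paste0(x, '<br class="pageBreak">')
}

df <- data.frame(
  id = 1:100,
  content = "here's a line<br>and here's another line"
)

# Row 12 is an arbitrary example: pick the last row that still fits on the
# page, found by inspecting the rendered output.
last_row_on_page <- 12
df$content[last_row_on_page] <- add_page_break(df$content[last_row_on_page])

df$content[last_row_on_page]
```

The modified data frame can then be passed to `knitr::kable()` as before; the row index still has to be found by hand for each page.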

ulyngs commented 1 year ago

@cderv any chance you know some magical way we might go around fixing this? :)

cderv commented 1 year ago

No, I don't have a magical way, sorry. The first thing would be to check whether the error is still present with a newer version of paged.js.

Unfortunately, I can't spend much time on that, but I'm happy to review any PR if anyone finds out how to fix it, even a temporary workaround. We'll then merge it and do a release with pleasure.

Sorry for the inconvenience currently.

ethanhurwitz commented 1 year ago

+1, issue is still occurring.

meefen commented 1 year ago

+1. Is there a fix?

ulyngs commented 1 month ago

@cderv apologies for bumping this. Might you be able to give some pointers on what I'd need to do to check whether the issue is present with the latest version of paged.js?

Does pagedown simply include paged.js directly, such that all I need to do is to go directly to https://pagedjs.org and make a reprex with their latest source files?

Or is pagedown doing some magic with it when including, so that I'd need to somehow change the version included within pagedown?

For what it's worth, this bug is actually the single stumbling block that's holding me back from using pagedown widely in my own work --- whenever I need to include tables, I know that I unfortunately can't use pagedown because it has a high likelihood of breaking. 😢

yihui commented 1 month ago

@ulyngs Christophe is currently on vacation. My short answer is that unfortunately, upgrading paged.js may not be straightforward. Please see #252. It seems we need to at least replace our footnote hacks #21 with paged.js's native footnote support, but I don't have much expertise on this (yet). That said, you can definitely try to upgrade paged.js and see if it just works. Actually I briefly tested the latest release of paged.js last week and ran into a bug, so I'd recommend that you try its latest beta release instead (which is currently https://unpkg.com/pagedjs@0.5.0-beta.1/dist/paged.polyfill.js).
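
One possible way to test paged.js on its own, outside pagedown, is a standalone HTML file that loads the polyfill and contains the same kind of table. The sketch below only uses the beta URL mentioned above; the file name and the hand-written table markup are assumptions for illustration, and it does not reproduce whatever pagedown adds around paged.js.

```r
# URL of the beta release mentioned above.
pagedjs_url <- "https://unpkg.com/pagedjs@0.5.0-beta.1/dist/paged.polyfill.js"

# 100 table rows with the same two-line cell content as the original example.
rows <- sprintf(
  "<tr><td>%d</td><td>here's a line<br>and here's another line</td></tr>",
  1:100
)

html <- c(
  "<!DOCTYPE html>",
  "<html>",
  "<head>",
  "<meta charset=\"utf-8\">",
  sprintf("<script src=\"%s\"></script>", pagedjs_url),
  "</head>",
  "<body>",
  "<table>",
  "<thead><tr><th>id</th><th>content</th></tr></thead>",
  "<tbody>",
  rows,
  "</tbody>",
  "</table>",
  "</body>",
  "</html>"
)

# Hypothetical output location; any writable path works.
out_file <- file.path(tempdir(), "pagedjs-table-reprex.html")
writeLines(html, out_file)

# Open the file in a browser to see whether cells split across pages
# end up in the right column:
# browseURL(out_file)
```

If the table renders correctly here but not in pagedown, that would suggest the problem lies in the bundled paged.js version or in pagedown's own additions rather than in the latest paged.js.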

yihui commented 1 month ago

Just FYI, I just spent a week on writing a significantly simplified version of paged.js, tentatively named pages.js. I have felt for a long time that paged.js is too sophisticated and also complicated in terms of fragmenting HTML content, and I've been wondering how well we could do if we simply avoid fragmenting individual HTML elements, e.g., don't split a paragraph at the bottom of a page into two pieces to fit two pages. With 180 lines of JS and 185 lines of CSS, I feel an HTML page with a linear structure (i.e., no or few nested elements) can be split into pages reasonably well. You can try to open https://yihui.org/litedown/ and press p to see an example. Of course, my rudimentary fragmentation "algorithm" will leave some white space at the bottom of pages, which is its major drawback, but personally I don't care much about the extra space.

When an element is longer than one page, pages.js will allocate multiple pages (depending on the height of the element) for it when printing to PDF, and I rely on the web browser's built-in fragmentation capability (which is quite amazing) to break the element.
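
To make the idea concrete, here is a toy model of pagination without fragmentation. It is not the pages.js code (which is JavaScript and works on real DOM elements); the function name `paginate()`, the numeric heights, and the rule that an oversized element starts on a fresh page are all assumptions for illustration.

```r
# Toy model: assign each element (given only its height) to a page without
# ever splitting an element that fits on a single page.
# Returns the page on which each element starts.
paginate <- function(heights, page_height) {
  page <- numeric(length(heights))
  current <- 1  # page currently being filled
  used <- 0     # height already used on the current page
  for (i in seq_along(heights)) {
    h <- heights[i]
    if (h > page_height) {
      # Oversized element: start on a fresh page (assumption) and reserve
      # as many whole pages as its height requires.
      if (used > 0) current <- current + 1
      page[i] <- current
      current <- current + ceiling(h / page_height)
      used <- 0
    } else {
      # Element does not fit in the remaining space: move to the next page
      # and leave the rest of the current page blank.
      if (used + h > page_height) {
        current <- current + 1
        used <- 0
      }
      page[i] <- current
      used <- used + h
    }
  }
  page
}

# Five elements on pages of height 1000; the fourth is taller than one page.
paginate(c(300, 400, 500, 2500, 200), page_height = 1000)
```

In this example the third element moves to page 2 and leaves 300 units of page 1 blank, which is the white-space drawback described above, and the 2500-unit element reserves three pages starting at page 3.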

Currently I've only tested it with litedown documents (which produce highly linear HTML output). In case you want to quickly test it:

install.packages("litedown", repos = c("https://yihui.r-universe.dev", "https://cloud.r-project.org"))

Press the Knit button in RStudio or call litedown::fuse() to render the example:

---
output:
  litedown::html_format:
    meta:
      css: ["default", "@pages"]
      js: ["@pages"]
knit: litedown:::knit
---

```{r, echo=FALSE}
I(data.frame(
  id = 1:100,
  content = "here's a line<br>and here's another line"
))
```

Then press `p` on the HTML output page (or if the page is opened in a browser, you can also try to print the page to PDF with `Cmd/Ctrl + P`, and the document will be automatically split into pages).