Consent PDF export truncates at 1 page

vishnuravi commented 3 weeks ago

Description

The PDF export of the consent form is truncated if the text exceeds 1 page.

Reproduction

Add a consent document to the consent step in SpeziOnboarding with a markdown file containing text that exceeds what can be rendered on 1 US Letter page.
Sign and export the form, or view the PDF stored in Cloud Storage after the consent step completes.

A reproducible example can be seen in the LifeSpace StrokeCog study application. See the comment below for a PDF produced by this application.

Expected behavior

The PDF export is expected to contain all of the text in the markdown file provided, followed by the signature.

Additional context

No response

Code of Conduct

[X] I agree to follow this project's Code of Conduct and Contributing Guidelines

vishnuravi commented 3 weeks ago

Example of current PDF export from SpeziOnboarding: 123456_2024-06-07_114422.pdf

Example of expected PDF export (generated by ResearchKit): consent.pdf

philippzagar commented 2 weeks ago

Thanks for creating the issue @vishnuravi! Sadly, the SwiftUI ImageRenderer is somewhat limited in its ability to split long view elements into multiple pages, so we would need some advanced page-splitting logic within Spezi (or even go with another approach?).

RealLast commented 1 week ago

I spent quite some time working on a solution for this. I came up with something that works, but I am not sure if it covers all edge cases (see code below).

In short, I added some logic for manual pagination, creating individual PDF pages if the text overflows to the next page. I tested with one, two, and three pages and it worked well. I attached some PDF examples at the end of this comment. The crucial part in the code is the split function, which took me some tries to get right :D

If you think this can be a suitable solution, I will be happy to clean up the code and do a PR :) @philippzagar Would appreciate your input on this

On a side note, I also tried different approaches and have some findings that I think are worth sharing. An alternative way to generate the PDF would be to use a library like Ink to convert the markdown text to HTML, and then use a WebView to render the PDF. However, WebView's createPDF() function does not include automatic pagination either; it puts all the text into one big single-page PDF file. It might be possible, however, to split that PDF file into smaller individual pages. I did not pursue this approach further.
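The splitting step mentioned above was not pursued in this thread, but the geometry behind it can be sketched. The following is a minimal illustration, assuming the WebView output is one tall page of known size; `sliceRects` and its parameters are hypothetical names, not part of SpeziOnboarding or WebKit. It only computes the crop rectangles and does not draw anything.

```swift
import Foundation

/// Hypothetical helper: computes the crop rectangles needed to cut one tall page
/// into slices of at most `pageHeight`, ordered from the top of the document down.
/// Rectangles use PDF coordinates, where the origin is the bottom-left corner.
func sliceRects(contentSize: CGSize, pageHeight: CGFloat) -> [CGRect] {
    guard pageHeight > 0, contentSize.height > 0 else { return [] }

    var rects: [CGRect] = []
    var top = contentSize.height  // Start at the top edge and walk downwards.

    while top > 0 {
        let height = min(pageHeight, top)
        rects.append(CGRect(x: 0, y: top - height, width: contentSize.width, height: height))
        top -= height
    }
    return rects
}

// A 2000 pt tall page cut into US Letter slices: two full pages and a 416 pt remainder.
let letterSlices = sliceRects(contentSize: CGSize(width: 612, height: 2000), pageHeight: 792)
print(letterSlices.count)  // 3
```

Note that fixed-height slices like these ignore line boundaries, so a cut can land in the middle of a line of text. That is one reason the text-aware split used in the solution below may be the more robust option.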

Here is my current solution. You can also check out the complete code in my forked repo.

@MainActor
func export() async -> PDFDocument? {
    let markdown = await asyncMarkdown()

    let markdownString = (try? AttributedString(
        markdown: markdown,
        options: .init(interpretedSyntax: .inlineOnlyPreservingWhitespace)
    )) ?? AttributedString(String(localized: "MARKDOWN_LOADING_ERROR", bundle: .module))

    let pageSize = CGSize(
        width: exportConfiguration.paperSize.dimensions.width,
        height: exportConfiguration.paperSize.dimensions.height
    )

    let pages = paginatedViews(markdown: markdownString)

    print("NumPages: \(pages.count)")
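    return await withCheckedContinuation { continuation in
        // Pass the page size explicitly; with a nil media box, Core Graphics falls back to US Letter (612 x 792 pt).
        var mediaBox = CGRect(origin: .zero, size: pageSize)

        guard let mutableData = CFDataCreateMutable(kCFAllocatorDefault, 0),
              let consumer = CGDataConsumer(data: mutableData),
              let pdf = CGContext(consumer: consumer, mediaBox: &mediaBox, nil) else {
            continuation.resume(returning: nil)
            return
        }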

        for page in pages {
            pdf.beginPDFPage(nil)

            let hostingController = UIHostingController(rootView: page)
            hostingController.view.frame = CGRect(origin: .zero, size: pageSize)

            // The rendered image is discarded; the page reaches the PDF through `layer.render(in:)` below.
            let renderer = UIGraphicsImageRenderer(bounds: hostingController.view.bounds)
            _ = renderer.image { _ in
                hostingController.view.drawHierarchy(in: hostingController.view.bounds, afterScreenUpdates: true)
            }

            // Flip the y-axis: Core Graphics PDF contexts put the origin at the bottom left,
            // UIKit at the top left, so the page would otherwise be drawn vertically mirrored.
            pdf.saveGState()
            pdf.translateBy(x: 0, y: pageSize.height)
            pdf.scaleBy(x: 1.0, y: -1.0)
            hostingController.view.layer.render(in: pdf)
            pdf.restoreGState()
            pdf.endPDFPage()
        }

        pdf.closePDF()
        continuation.resume(returning: PDFDocument(data: mutableData as Data))
    }
}

private func paginatedViews(markdown: AttributedString) -> [AnyView] {
    var pages = [AnyView]()
    var remainingMarkdown = markdown
    let pageSize = CGSize(width: exportConfiguration.paperSize.dimensions.width, height: exportConfiguration.paperSize.dimensions.height)
    let headerHeight: CGFloat = 150
    let footerHeight: CGFloat = 150

    while !remainingMarkdown.unicodeScalars.isEmpty {
        let (currentPageContent, nextPageContent) = split(markdown: remainingMarkdown, pageSize: pageSize, headerHeight: headerHeight, footerHeight: footerHeight)

        let currentPage: AnyView = AnyView(
            VStack {
                if pages.isEmpty {  // First page
                    OnboardingTitleView(title: exportConfiguration.consentTitle)
                }

                Text(currentPageContent)
                    .padding()

                Spacer()

                if nextPageContent.unicodeScalars.isEmpty {  // Last page
                    ZStack(alignment: .bottomLeading) {
                        SignatureViewBackground(name: name, backgroundColor: .clear)

                        #if !os(macOS)
                        Image(uiImage: blackInkSignatureImage)
                        #else
                        Text(signature)
                            .padding(.bottom, 32)
                            .padding(.leading, 46)
                            .font(.custom("Snell Roundhand", size: 24))
                        #endif
                    }
                    .padding(.bottom, footerHeight)
                }
            }
            .frame(width: pageSize.width, height: pageSize.height)
        )

        pages.append(currentPage)
        remainingMarkdown = nextPageContent
    }

    return pages
}

private func split(markdown: AttributedString, pageSize: CGSize, headerHeight: CGFloat, footerHeight: CGFloat) -> (AttributedString, AttributedString) {
    let contentHeight = pageSize.height - headerHeight - footerHeight

    // Lay the text out in a container the size of one page's content area.
    let textStorage = NSTextStorage(attributedString: NSAttributedString(markdown))
    let layoutManager = NSLayoutManager()
    let textContainer = NSTextContainer(size: CGSize(width: pageSize.width, height: contentHeight))
    layoutManager.addTextContainer(textContainer)
    textStorage.addLayoutManager(layoutManager)

    // Glyph indices need not match character indices, so convert before slicing the text storage.
    let glyphRange = layoutManager.glyphRange(for: textContainer)
    let fittingRange = layoutManager.characterRange(forGlyphRange: glyphRange, actualGlyphRange: nil)
    let splitIndex = NSMaxRange(fittingRange)
    let remainingRange = NSRange(location: splitIndex, length: textStorage.length - splitIndex)

    let currentPage = AttributedString(textStorage.attributedSubstring(from: fittingRange))
    let remaining = AttributedString(textStorage.attributedSubstring(from: remainingRange))

    return (currentPage, remaining)
}

And here are some successful examples:

1page.pdf 2pages.pdf 3pages.pdf

ConsentDocument.md
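
One edge case that seems worth checking in the loop above: if the content area is too small to fit any text, the split returns an empty first half, the remaining text never shrinks, and the `while` loop would not terminate. The following is a minimal sketch of the same loop structure with a progress guard. It is not the code from the PR; `splitPage` and `paginate` are hypothetical names, and a fixed character count per page stands in for the TextKit measurement.

```swift
/// Hypothetical stand-in for the TextKit-based split: instead of measuring text,
/// it assumes a fixed number of characters fits on one page.
func splitPage(_ text: Substring, charactersPerPage: Int) -> (current: Substring, remaining: Substring) {
    let current = text.prefix(max(charactersPerPage, 0))
    return (current, text.dropFirst(current.count))
}

/// Collects pages until no text is left. Stops early if a split makes no progress,
/// so a page that cannot fit any text does not cause an endless loop.
func paginate(_ text: String, charactersPerPage: Int) -> [String] {
    var pages: [String] = []
    var remaining = Substring(text)

    while !remaining.isEmpty {
        let (current, next) = splitPage(remaining, charactersPerPage: charactersPerPage)
        guard !current.isEmpty else { break }  // No progress: bail out instead of looping forever.
        pages.append(String(current))
        remaining = next
    }
    return pages
}

print(paginate("abcdefghij", charactersPerPage: 4))  // ["abcd", "efgh", "ij"]
```

In the real export, the equivalent guard would go right after the call to `split` in `paginatedViews`; whether to stop silently or report an error in that case is a design decision for the PR.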

philippzagar commented 1 week ago

Thanks a lot @RealLast for the deep-dive into that topic! 🚀 I already envisioned a similar pagination approach; it seems like the only good option to me (while keeping the ImageRenderer and the current setup in place). @PSchmiedmayer will take a closer look in the next few days!

RealLast commented 1 week ago

Thanks Philipp! I just created a PR for this. Also, I have further adjusted the algorithm to cover an additional edge case I found and added some comments to the code explaining how it works and how it could be further optimized.

StanfordSpezi / SpeziOnboarding