Single page document generates extra text

JortvD commented 4 months ago

Current behaviour

When a document is generated for a user to download, extra text is generated in addition to our tag. This breaks the filtering functionality for our partners.

Desired behaviour

The generated document should contain only our tag, without this extra text.

Steps to reproduce

Download a single page document from the website Education tab.

Website version

latest

What operating system are you seeing the problem on?

No response

What browsers are you seeing the problem on?

No response

Other information

No response

tomudding commented 4 months ago

Filler PDF

tomudding commented 1 week ago

I tested this extensively, and even with a filler page (see diff below) I still get the "Powered by" text. I strongly believe that this is not solvable.

diff --git a/module/Application/src/Service/WatermarkService.php b/module/Application/src/Service/WatermarkService.php
index a71271ca4..149d0d9d8 100644
--- a/module/Application/src/Service/WatermarkService.php
+++ b/module/Application/src/Service/WatermarkService.php
@@ -38,6 +38,7 @@ class WatermarkService
     public function __construct(
         private readonly AuthenticationService $authService,
         private readonly string $remoteAddress,
+        private readonly string $storageConfig,
         private readonly array $watermarkConfig,
     ) {
     }
@@ -54,13 +55,26 @@ class WatermarkService
     ): string {
         $pdf = new Fpdi();
         $pdf->setTitle($fileName);
-        $pages = $pdf->setSourceFile($path);
+        $originalNumberOfPages = $pdf->setSourceFile($path);
         $watermark = $this->getWatermarkText();

-        for ($page = 1; $page <= $pages; $page++) {
+        // If there is only one page, add a second page from our filler.
+        $numberOfPages = 1 === $originalNumberOfPages ? 2 : $originalNumberOfPages;
+
+        for ($page = 1; $page <= $numberOfPages; $page++) {
             // Import a page from the source PDF, this is used to determine all specifications for this specific page,
-            // such as the height, width, and orientation.
-            $templateIndex = $pdf->importPage($page);
+            // such as the height, width, and orientation. If it is the second page and there was originally only one
+            // page, switch to the filler PDF.
+            if (
+                2 === $page
+                && 1 === $originalNumberOfPages
+            ) {
+                $pdf->setSourceFile($this->storageConfig . '/filler.pdf');
+                $templateIndex = $pdf->importPage(1);
+            } else {
+                $templateIndex = $pdf->importPage($page);
+            }
+
             $templateSpecs = $pdf->getTemplateSize($templateIndex);
             $pdf->setPrintHeader(false);
             $pdf->setPrintFooter(false);
@@ -150,12 +164,12 @@ class WatermarkService
         // Construct final PDF.
         $taggedPdf = new Fpdi();
         $taggedPdf->setTitle($fileName);
-        $pages = $taggedPdf->setSourceFile($tempFlatFile);
+        $numberOfPages = $taggedPdf->setSourceFile($tempFlatFile);

         $tag = $this->watermarkConfig['tag'];

         // We have to copy all pages to this new PDF.
-        for ($page = 1; $page <= $pages; $page++) {
+        for ($page = 1; $page <= $numberOfPages; $page++) {
             $templateIndex = $taggedPdf->importPage($page);
             $templateSpecs = $taggedPdf->getTemplateSize($templateIndex);
             $taggedPdf->setPrintHeader(false);
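
For readers following the diff, the page selection it introduces can be sketched in isolation. This is a minimal illustration only: `planPages()` and its parameters are hypothetical names that do not exist in `WatermarkService`, and no PDF library is involved. It only lists which file and page number each output page would be imported from.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of WatermarkService): lists the source file
 * and page number that each output page would be imported from.
 *
 * @return list<array{source: string, page: int}>
 */
function planPages(int $originalNumberOfPages, string $sourcePath, string $fillerPath): array
{
    // Mirror the diff: a single-page document is padded to two pages.
    $numberOfPages = 1 === $originalNumberOfPages ? 2 : $originalNumberOfPages;

    $plan = [];
    for ($page = 1; $page <= $numberOfPages; $page++) {
        if (2 === $page && 1 === $originalNumberOfPages) {
            // The second page of a one-page document comes from the filler PDF.
            $plan[] = ['source' => $fillerPath, 'page' => 1];
        } else {
            // Every other page comes from the original document.
            $plan[] = ['source' => $sourcePath, 'page' => $page];
        }
    }

    return $plan;
}

// A one-page document yields two entries; the second one points at the filler.
print_r(planPages(1, 'document.pdf', 'filler.pdf'));
```

Documents with two or more pages pass through unchanged, so only the single-page case is affected by the filler.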

JortvD commented 1 week ago

Sorry for the late update! The "Powered by" text was not the issue. Apparently, for one-page documents they would use OCR instead of checking the text content, which makes it quite hard to find white text. They have changed this to always check the text content (instead of using OCR), so the system now works as intended!

GEWIS / gewisweb