opengovsg / pdf2md

A PDF to Markdown converter
https://www.npmjs.com/package/@opendocsg/pdf2md
MIT License

Fix fetchStandardFontData warning #67

Closed 10ego closed 1 year ago

10ego commented 1 year ago

Problem

What problem are you trying to solve? What issue does this close?

When parsing a document that needs a font file from the standard font path, a warning is emitted:

Warning: fetchStandardFontData: failed to fetch file "FoxitSans.pfb" with "UnknownErrorException: The standard font "baseUrl" parameter must be specified, ensure that the "standardFontDataUrl" API parameter is provided.".

This warning is emitted for every document parsed and for every font file that cannot be loaded, so running with the --recursive parameter can flood the terminal. For example:

Warning: fetchStandardFontData: failed to fetch file "FoxitSans.pfb" with "UnknownErrorException: The standard font "baseUrl" parameter must be specified, ensure that the "standardFontDataUrl" API parameter is provided.".
Warning: fetchStandardFontData: failed to fetch file "FoxitSansBold.pfb" with "UnknownErrorException: The standard font "baseUrl" parameter must be specified, ensure that the "standardFontDataUrl" API parameter is provided.".
Warning: fetchStandardFontData: failed to fetch file "FoxitSansItalic.pfb" with "UnknownErrorException: The standard font "baseUrl" parameter must be specified, ensure that the "standardFontDataUrl" API parameter is provided.".
Warning: fetchStandardFontData: failed to fetch file "FoxitSansBoldItalic.pfb" with "UnknownErrorException: The standard font "baseUrl" parameter must be specified, ensure that the "standardFontDataUrl" API parameter is provided.".

Solution

How did you solve the problem?

Bug Fixes:

This automatically passes in standardFontDataUrl, set to the default standard fonts path of the pdfjs-dist package. I considered exposing it as a command-line argument, but the CLI is already verbose enough, and this path should be the same on most machines using pdf2md.
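A minimal sketch of the idea, not the code from this PR: the helper name `buildDocumentOptions` and its arguments are hypothetical, and the sketch assumes pdfjs-dist keeps its fonts in a `standard_fonts` directory at the package root. It only builds the options object, so it runs without pdfjs-dist installed.

```javascript
const path = require('path')

// Hypothetical helper: builds the options object that would be handed to
// the pdfjs-dist getDocument() call.
// Assumption: `pdfjsRoot` is the install directory of pdfjs-dist and the
// fonts live in its `standard_fonts` subdirectory.
function buildDocumentOptions (pdfBuffer, pdfjsRoot) {
  return {
    data: pdfBuffer,
    // The value ends with a slash so a font file name can be appended to it.
    standardFontDataUrl: path.join(pdfjsRoot, 'standard_fonts') + '/'
  }
}

const options = buildDocumentOptions(
  Buffer.from('%PDF-'),
  path.join('node_modules', 'pdfjs-dist')
)
console.log(options.standardFontDataUrl)
```

In practice `pdfjsRoot` would probably be resolved from the installed package rather than hard-coded; that step is left out here.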

Before & After Screenshots

BEFORE: The following console warning is emitted: Warning: fetchStandardFontData: failed to fetch file "FoxitSans.pfb" with "UnknownErrorException: The standard font "baseUrl" parameter must be specified, ensure that the "standardFontDataUrl" API parameter is provided.".

AFTER: No more console warnings

10ego commented 1 year ago

Sorry to get back to this so late, but your update is missing a trailing slash. Unfortunately, I couldn't add the fix to this PR since it's already merged, so I opened a new PR, #71.
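To illustrate why the trailing slash matters, here is a small sketch. It assumes the base URL and the font file name are joined by plain string concatenation; `fontUrl` and `withTrailingSlash` are hypothetical helpers written for this example, not functions from pdf2md or pdfjs-dist.

```javascript
// Assumed joining behaviour: base URL and file name are concatenated as-is.
function fontUrl (baseUrl, filename) {
  return `${baseUrl}${filename}`
}

// Hypothetical helper: append a trailing slash only when it is missing.
function withTrailingSlash (baseUrl) {
  return baseUrl.endsWith('/') ? baseUrl : baseUrl + '/'
}

const base = 'node_modules/pdfjs-dist/standard_fonts'

// Without the slash, the directory and file name run together.
console.log(fontUrl(base, 'FoxitSans.pfb'))
// prints "node_modules/pdfjs-dist/standard_fontsFoxitSans.pfb"

// With the slash, the path points at the font file.
console.log(fontUrl(withTrailingSlash(base), 'FoxitSans.pfb'))
// prints "node_modules/pdfjs-dist/standard_fonts/FoxitSans.pfb"
```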