ffalt / pdf.js-extract

nodejs lib for extracting data from PDF files

It does not work for remote pdf files? ENOENT error #51

Open DavideViolante opened 4 months ago

DavideViolante commented 4 months ago
  import { PDFExtract, PDFExtractOptions } from 'pdf.js-extract';
  const pdfExtract = new PDFExtract();
  const options: PDFExtractOptions = { };
  async function main() {
    const res = await pdfExtract.extract('https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf', options);
  }
  main()
[Error: ENOENT: no such file or directory, open 'https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf'] {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: 'https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf'
}

Environment: Node v20, "pdf.js-extract": "^0.2.1"

Am I missing something?

I also tried the following, but got the same error:

import { readFile } from 'fs/promises';
import { PDFExtract, PDFExtractOptions } from 'pdf.js-extract';
const buffer = await readFile('https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf');
pdfExtract.extractBuffer(buffer, options, (err, data) => {
  if (err) {
    return console.error(err);
  }
  console.log(data);
});

Does it work only with local files?

DavideViolante commented 4 months ago

With this approach nothing is printed by console.log, and no error is reported either.

  const { data } = await axios.get('https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf', {
    responseType: 'arraybuffer',
  });
  pdfExtract.extractBuffer(data, options, (err, pdf) => {
    if (err) {
      return console.error(err);
    }
    console.log(pdf);
  });
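
For reference, the ENOENT in the first two attempts is consistent with the URL string being passed to a file-system open call (the error reports syscall 'open' with the URL as path), and readFile treats a plain string as a local path. One possible workaround is to download the bytes first and only then call extractBuffer. The following is a minimal sketch, not the library's documented approach: isRemoteSource, toNodeBuffer and loadPdfBytes are hypothetical helper names, it assumes the global fetch available in Node 18+, and the usage in the trailing comment assumes extractBuffer returns a Promise when no callback is passed.

```typescript
import { readFile } from 'node:fs/promises';

// Hypothetical helper: true when the source looks like an http(s) URL rather than a local path.
function isRemoteSource(source: string): boolean {
  return /^https?:\/\//i.test(source);
}

// Hypothetical helper: copy the bytes of an ArrayBuffer (e.g. from fetch) into a Node Buffer.
function toNodeBuffer(data: ArrayBuffer): Buffer {
  return Buffer.from(new Uint8Array(data));
}

// Hypothetical helper: read a local file, or download a remote one, and return its bytes.
async function loadPdfBytes(source: string): Promise<Buffer> {
  if (!isRemoteSource(source)) {
    // Local path: fs can open it directly.
    return readFile(source);
  }
  // Remote URL: fs cannot open it, so fetch the bytes over HTTP instead.
  const response = await fetch(source);
  if (!response.ok) {
    throw new Error(`Failed to download ${source}: HTTP ${response.status}`);
  }
  return toNodeBuffer(await response.arrayBuffer());
}

// Usage sketch (assumption: pdf.js-extract is installed and extractBuffer
// returns a Promise when called without a callback):
//
//   import { PDFExtract } from 'pdf.js-extract';
//   const pdfExtract = new PDFExtract();
//   const buffer = await loadPdfBytes(
//     'https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf');
//   const data = await pdfExtract.extractBuffer(buffer, {});
//   console.log(data.pages.length);
```

Awaiting the result inside an async function, rather than relying on a callback, also makes it easier to see whether the call ever settles.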