postlight / parser

📜 Extract meaningful content from the chaos of a web page
https://reader.postlight.com
Apache License 2.0

TypeError: $.html is not a function #560

Open ShokorV8 opened 4 years ago

ShokorV8 commented 4 years ago

Expected Behavior

When Mercury Parser is called on certain URLs, through either the CLI or Node, it should parse them successfully.

Current Behavior

Instead, the parser throws the following exception:

TypeError: $.html is not a function
    at _callee$ (/usr/local/lib/node_modules/@postlight/mercury-parser/dist/mercury.js:7554:22)
    at tryCatch (/usr/local/lib/node_modules/@postlight/mercury-parser/node_modules/regenerator-runtime/runtime.js:45:40)
    at Generator.invoke [as _invoke] (/usr/local/lib/node_modules/@postlight/mercury-parser/node_modules/regenerator-runtime/runtime.js:274:22)
    at Generator.prototype.<computed> [as next] (/usr/local/lib/node_modules/@postlight/mercury-parser/node_modules/regenerator-runtime/runtime.js:97:21)
    at asyncGeneratorStep (/usr/local/lib/node_modules/@postlight/mercury-parser/node_modules/@babel/runtime-corejs2/helpers/asyncToGenerator.js:5:24)
    at _next (/usr/local/lib/node_modules/@postlight/mercury-parser/node_modules/@babel/runtime-corejs2/helpers/asyncToGenerator.js:27:9)
    at processTicksAndRejections (internal/process/task_queues.js:93:5)

Steps to Reproduce

Calling the CLI parser on the following example link reproduces the issue: example

Detailed Description

Possible Solution

keery commented 3 years ago

Same problem

rothnic commented 3 years ago

I ran into the same issue. I have no fix, but I noticed that it happens when the parser tries to load the next page and gets a 400 response instead of a 200. Setting the undocumented fetchAllPages option to false at least keeps it from failing completely.

For example:

// url is the address of the page to parse
Mercury.parse(url, { contentType: 'markdown', fetchAllPages: false })
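
The workaround above can be wrapped in a small helper so that a failing parse is reported instead of surfacing as an unhandled rejection. This is only a sketch: the helper name `parseFirstPageOnly` and its `{ ok, result, error }` return shape are hypothetical, and `parser` is assumed to be any object with a `parse(url, options)` method, such as the module exported by @postlight/mercury-parser. It does not fix the underlying `$.html` error; it only avoids following next-page links and catches whatever is still thrown.

```javascript
// Sketch only. `parser` is assumed to expose parse(url, options), as the
// @postlight/mercury-parser module does. The helper name and the shape of
// the returned object are hypothetical, not part of the library.
async function parseFirstPageOnly(parser, url, extraOptions = {}) {
  // fetchAllPages: false stops the parser from following next-page links,
  // which is where the 400 response was observed in this thread.
  // It is placed last so it overrides any value passed in extraOptions.
  const options = { ...extraOptions, fetchAllPages: false };
  try {
    const result = await parser.parse(url, options);
    return { ok: true, result };
  } catch (err) {
    // Report the failure to the caller instead of letting it propagate.
    return { ok: false, error: err };
  }
}

// Possible usage (requires the package to be installed):
//   const Mercury = require('@postlight/mercury-parser');
//   const outcome = await parseFirstPageOnly(Mercury, url, { contentType: 'markdown' });
//   if (!outcome.ok) console.error(outcome.error.message);
```

Passing the parser in as an argument keeps the helper independent of how the package is installed and makes it easy to exercise with a stub.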