webis-de / scriptor

Plug-and-play reproducible web analysis.
MIT License

takeNodesSnapshot - page.evaluate: TypeError: JSON.stringify is not a function #12

Closed Querela closed 2 years ago

Querela commented 2 years ago

I'm honestly not sure why it happened, but the webpage did not have JSON.stringify defined. Maybe it is some kind of scraping protection. It is reproducible: type JSON.stringify in the browser dev tools console (Firefox) on the target URL and on any other page, and compare the results.
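The manual console check can also be written as a small function. This is only a sketch: `hasUsableStringify` is a hypothetical helper name, not part of scriptor, and it only detects the case where the function is missing. A page that swaps in a different function would still pass.

```javascript
// Hypothetical helper: reports whether a global object still exposes
// a callable JSON.stringify. It cannot tell a native function from a
// replacement, only whether something callable is there.
function hasUsableStringify(globalObject) {
  const json = globalObject.JSON;
  return json != null && typeof json.stringify === 'function';
}

// An untouched environment (Node.js here, `window` in a browser console).
console.log(hasUsableStringify(globalThis)); // true

// A stand-in for a page that removed the function.
console.log(hasUsableStringify({ JSON: { stringify: undefined } })); // false
```

In the browser console, the equivalent call would be `hasUsableStringify(window)`.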

URL: http://www.zeonic-republic.net/?page_id=363

Stacktrace:

{"name":"scriptor","hostname":"scriptor-with-ceph-0466","pid":1,"level":30,"id":"34790","url":"http://www.zeonic-republic.net/?page_id=363","msg":"snapshot","time":"2022-01-12T19:39:27.443Z","v":0}
{"name":"scriptor","hostname":"scriptor-with-ceph-0466","pid":1,"level":30,"old":{"width":1280,"height":720},"new":{"width":1280,"height":7460},"msg":"pages.adjustViewportToPage","time":"2022-01-12T19:39:30.797Z","v":0}
node:internal/process/promises:246
          triggerUncaughtException(err, true /* fromPromise */);
          ^

page.evaluate: TypeError: JSON.stringify is not a function
    at traverse (eval at evaluate (:3:2389), <anonymous>:53:23)
    at eval (eval at evaluate (:3:2389), <anonymous>:87:5)
    at t.default.evaluate (<anonymous>:3:2412)
    at t.default.<anonymous> (<anonymous>:1:44)
    at takeNodesSnapshot (/scriptor/lib/pages.js:337:36)
    at Object.takeSnapshot (/scriptor/lib/pages.js:258:25)
    at module.exports._processOne (/script/Script-multi.js:115:19)
    at async module.exports.run (/script/Script-multi.js:54:24)
    at async Object.run (/scriptor/lib/scripts.js:64:31)

Solution: Probably ignore it for now, since it seems to be the exception (pun intended) rather than the norm. Otherwise, call JSON.stringify in the Node.js context instead of in the browser page. I am not sure what security issues might arise from that. Hosters could theoretically redefine other functions too, though, so for now I would keep it as is and just be aware of this issue.
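A minimal sketch of the second option, serializing in Node.js rather than in the page. It does not use scriptor's actual code: Node's built-in `vm` module stands in for the browser page, and `traverseSource` and `takeNodesSnapshotSketch` are hypothetical names. The assumption is that the in-page code can return plain data and leave the JSON.stringify call to the Node.js side.

```javascript
const vm = require('node:vm');

// Stand-in for the problematic page: a separate context whose
// JSON.stringify has been overwritten, as on the reported URL.
const page = vm.createContext({});
vm.runInContext('JSON.stringify = undefined;', page);

// Hypothetical in-page traversal: it builds plain objects and arrays
// and never touches the page's JSON object.
const traverseSource = `
  (function traverse(node) {
    return { name: node.name, children: node.children.map(traverse) };
  })({ name: 'html', children: [{ name: 'body', children: [] }] })
`;

function takeNodesSnapshotSketch(context) {
  // Evaluate inside the "page" and get plain data back.
  const tree = vm.runInContext(traverseSource, context);
  // Serialize with Node's own JSON, which the page cannot redefine.
  return JSON.stringify(tree);
}

console.log(takeNodesSnapshotSketch(page));
```

This only sidesteps the one redefined function. As noted above, a page could still tamper with other built-ins that the traversal relies on, such as `Array.prototype.map` in this sketch.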

johanneskiesel commented 2 years ago

Very nice example of crazy stuff that web pages do.

I think you are right, though, that I could move the JSON.stringify call to Node. I think it is a leftover from the old web archiver.

johanneskiesel commented 2 years ago

Should be solved with 0.7.0