scrapinghub / portia

Visual scraping for Scrapy
BSD 3-Clause "New" or "Revised" License

attributes do not load when going to next page (DOMException) #831

Open flip111 opened 6 years ago

flip111 commented 6 years ago

Yesterday I installed Portia with docker run -v ~/portia_projects:/app/data/projects:rw -p 9001:9001 scrapinghub/portia. When I click New sample, everything seems to work fine (although I can't figure out how to capture the underlying URL when I click on some text). Then, when I leave sampling mode and go to page 2, the attributes are not extracted. When I go back to page 1, they are not extracted there either, but when I enter sampling mode again it works fine. In the console I see this exception:

DOMException
code: 5
columnNumber: 0
data: null
filename: "http://localhost:9001/assets/portia-ui-11e1d9dc368973bc7e46b10ec87357f5.js"
lineNumber: 28
message: "String contains an invalid character"
name: "InvalidCharacterError"
result: 2152923141
stack: "
  setAttribute@http://localhost:9001/assets/portia-ui-11e1d9dc368973bc7e46b10ec87357f5.js:28:28720
  TreeMirror</e.prototype.deserializeNode/<@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:22:23641
  TreeMirror</e.prototype.deserializeNode@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:22:23558
  TreeMirror</e.prototype.deserializeNode@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:22:23877
  TreeMirror</e.prototype.deserializeNode@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:22:23877
  TreeMirror</e.prototype.deserializeNode@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:22:23877
  TreeMirror</e.prototype.deserializeNode@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:22:23877
  TreeMirror</e.prototype.deserializeNode@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:22:23877
  TreeMirror</e.prototype.deserializeNode@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:22:23877
  TreeMirror</e.prototype.deserializeNode@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:22:23877
  TreeMirror</e.prototype.deserializeNode@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:22:23877
  TreeMirror</e.prototype.initialize@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:22:22079
  msgMutation/<@http://localhost:9001/assets/portia-ui-11e1d9dc368973bc7e46b10ec87357f5.js:1:21785
  y@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:18:19240
  b@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:18:19335
  v@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:18:19143
  @http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:13:16894
  invokeWithOnError@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:6:898
  flush@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:6:1352
  flush@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:5:31415
  end@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:5:24689
  s/e._autorun<@http://localhost:9001/assets/vendor-fcedd7a5b5e0126128f06571baf94010.js:5:24113
  "

I formatted the stack for convenience (originally the newlines were escaped).
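
For reference, the top frame is setAttribute, called while TreeMirror deserializes a node. Browsers throw InvalidCharacterError from setAttribute when the attribute name contains characters that are not allowed in a name. Assuming that is what happens here (the page likely has an element with a malformed attribute name), a guard along the following lines could skip such attributes instead of aborting the whole deserialization. This is only a sketch: isSafeAttributeName and setAttributesSafely are hypothetical names, not part of Portia or TreeMirror, and the regular expression is a conservative ASCII-only approximation of the names browsers accept.

```javascript
// Hypothetical helpers for illustration; not part of Portia or TreeMirror.

// Conservative ASCII-only approximation of a valid attribute name:
// starts with a letter, underscore, or colon, followed by letters,
// digits, underscores, colons, dots, or hyphens.
const ASCII_NAME = /^[A-Za-z_:][A-Za-z0-9_:.-]*$/;

function isSafeAttributeName(name) {
  return typeof name === 'string' && ASCII_NAME.test(name);
}

// Copies attributes onto an element, skipping names that would be rejected.
// Returns the list of skipped names so the caller can log them.
function setAttributesSafely(element, attributes) {
  const skipped = [];
  for (const [name, value] of Object.entries(attributes)) {
    if (!isSafeAttributeName(name)) {
      skipped.push(name);
      continue;
    }
    try {
      element.setAttribute(name, value);
    } catch (err) {
      // Covers any name the approximation above lets through
      // but the browser still rejects.
      skipped.push(name);
    }
  }
  return skipped;
}
```

With a guard like this, one bad attribute would be reported in the returned list while the rest of the node tree still loads. Whether skipping is the right behaviour for Portia is a separate question.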