ContentMine / quickscrape

A scraping command line tool for the modern web
MIT License
259 stars, 43 forks

quickscrape fails on reinstall on MAC OSX; 'xcodebuild' error and #34

Closed by petermr 9 years ago

petermr commented 9 years ago

An older version of quickscrape failed when run against a new bmc.json file, so I decided to reinstall quickscrape following the instructions in README.md. The install went fine until the errors shown below. I assumed they were not fatal and tried to run quickscrape anyway, but it crashed with the TypeError at the end of this comment.

INSTALL>> (clipped until error)
...
npm http GET https://registry.npmjs.org/rimraf
npm http 304 https://registry.npmjs.org/rimraf
npm http 304 https://registry.npmjs.org/mkdirp

contextify@0.1.11 install /usr/local/lib/node_modules/quickscrape/node_modules/thresher/node_modules/jsdom/node_modules/contextify
node-gyp rebuild

xcode-select: error: tool 'xcodebuild' requires Xcode, but active developer directory '/Library/Developer/CommandLineTools' is a command line tools instance

xcode-select: error: tool 'xcodebuild' requires Xcode, but active developer directory '/Library/Developer/CommandLineTools' is a command line tools instance

CXX(target) Release/obj.target/contextify/src/contextify.o
SOLINK_MODULE(target) Release/contextify.node
SOLINK_MODULE(target) Release/contextify.node: Finished
/usr/local/bin/quickscrape -> /usr/local/lib/node_modules/quickscrape/bin/quickscrape.js
quickscrape@0.3.6 /usr/local/lib/node_modules/quickscrape
├── which@1.0.8
├── commander@2.2.0
├── moment@2.9.0
├── chalk@0.5.1 (ansi-styles@1.1.0, escape-string-regexp@1.0.2, supports-color@0.2.0, has-ansi@0.1.0, strip-ansi@0.3.0)
├── winston@0.7.3 (cycle@1.0.3, stack-trace@0.0.9, eyes@0.1.8, colors@0.6.2, async@0.2.10, pkginfo@0.3.0, request@2.16.6)
└── thresher@0.1.1 (underscore-deep-extend@0.0.5, set@1.1.1, eventemitter2@0.4.14, xpath@0.0.6, shelljs@0.3.0, request-progress@0.3.1, lodash@2.4.1, tough-cookie@0.12.1, spooky@0.2.5, request@2.42.0, casperjs@1.1.0-beta3, jsdom-little@0.10.5, download@1.0.7, phantomjs@1.9.13, jsdom@0.11.1)

RUNNING: (bmc.json is actually a copy of mdpi.json which I intended to edit later) ...

localhost:scrapers pm286$ quickscrape -u http://www.trialsjournal.com/content/15/1/481 -s ./bmc.json
info: quickscrape launched with...
info: - URL: http://www.trialsjournal.com/content/15/1/481
info: - Scraper: ./bmc.json
info: - Rate limit: 3 per minute
info: - Log level: info
info: urls to scrape: 1
info: processing URL: http://www.trialsjournal.com/content/15/1/481

TypeError: Cannot read property 'actions' of null
    at /usr/local/lib/node_modules/quickscrape/node_modules/thresher/lib/thresher.js:100:16
    at Request._callback (/usr/local/lib/node_modules/quickscrape/node_modules/thresher/lib/url.js:89:5)
    at Request.self.callback (/usr/local/lib/node_modules/quickscrape/node_modules/thresher/node_modules/request/request.js:236:22)
    at Request.EventEmitter.emit (events.js:98:17)
    at Request.<anonymous> (/usr/local/lib/node_modules/quickscrape/node_modules/thresher/node_modules/request/request.js:1142:14)
    at Request.EventEmitter.emit (events.js:117:20)
    at IncomingMessage.<anonymous> (/usr/local/lib/node_modules/quickscrape/node_modules/thresher/node_modules/request/request.js:1096:12)
    at IncomingMessage.EventEmitter.emit (events.js:117:20)
    at _stream_readable.js:919:16
    at process._tickCallback (node.js:419:13)
localhost:scrapers pm286$

blahah commented 9 years ago

There was a historical issue with OS X Mavericks and a particular version of Google's gyp library (used by Node.js), which meant you had to have the full Xcode installed to use gyp. The possible fixes are: