GiveToken / GiftBox

Repository for Sizzle

[1100] Glassdoor scraper #1109

Open shreydesai opened 8 years ago

shreydesai commented 8 years ago

Version 0.1

Fixes #1100. Implements a Glassdoor feature that uses the official Glassdoor API to retrieve the company's name, website, and logo. Currently, the website is used as the token's description, but I can try implementing a text search later on to find and parse company descriptions on the website.
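For readers following along, here is a minimal sketch of the mapping step described above: taking an employer record from an API response and producing the fields the token needs. The PR's actual code is not shown in this thread, so the function name `extractCompanyFields` and the response field names (`response.employers`, `name`, `website`, `squareLogo`) are assumptions for illustration only.

```javascript
// Hypothetical response shape: the field names used here are assumptions,
// not taken from the PR's code.
function extractCompanyFields(apiResponse) {
  const employers =
    (apiResponse && apiResponse.response && apiResponse.response.employers) || [];
  if (employers.length === 0) {
    return null; // no matching company
  }
  const first = employers[0];
  const website = first.website || '';
  return {
    name: first.name || '',
    website: website,
    logo: first.squareLogo || '',
    // As described in the comment above, the website doubles as the
    // description until real description parsing exists.
    description: website,
  };
}

// Example input with made-up data.
const sample = {
  response: {
    employers: [
      {
        name: 'Example Corp',
        website: 'www.example.com',
        squareLogo: 'https://example.com/logo.png',
      },
    ],
  },
};

console.log(extractCompanyFields(sample));
```

Returning `null` for an empty result keeps the "company not found" case distinct from a company that simply has empty fields.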

wogsland commented 8 years ago

Some js tests appear to be failing:

2 failing

1) recruiting-token.js "before all" hook:
   ReferenceError: Date is not defined
    at Object.convert (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/generated/EventInit.js:32:24)
    at MutationEventImpl.EventImpl (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/events/Event-impl.js:8:48)
    at MutationEventImpl (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/events/MutationEvent-impl.js:5:1)
    at Object.setup (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/generated/MutationEvent.js:164:17)
    at Object.createImpl (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/generated/MutationEvent.js:151:10)
    at DocumentImpl.createEvent (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Document-impl.js:564:31)
    at DocumentImpl.insertBefore (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Node-impl.js:236:18)
    at DocumentImpl.appendChild (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Node-impl.js:380:17)
    at DocumentImpl.appendChild (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Document-impl.js:367:18)
    at setChild (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/browser/htmltodom.js:266:18)
    at HtmlToDom._parseWithparse5v1 (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/browser/htmltodom.js:90:7)
    at HtmlToDom.appendHtmlToDocument (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/browser/htmltodom.js:47:48)
    at setInnerHTML (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Document-impl.js:52:27)
    at DocumentImpl.write (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Document-impl.js:420:7)
    at Document.write (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/generated/Document.js:307:51)
    at Object.exports.jsdom (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom.js:116:21)
    at processHTML (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom.js:252:26)
    at Object.exports.env.exports.jsdom.env as env
    at Context. (/Library/WebServer/Documents/GiftBox/node_modules/mocha-jsdom/index.js:52:22)

2) recruiting-token.js "before all" hook:
   ReferenceError: Date is not defined
    at Object.convert (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/generated/EventInit.js:32:24)
    at MutationEventImpl.EventImpl (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/events/Event-impl.js:8:48)
    at MutationEventImpl (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/events/MutationEvent-impl.js:5:1)
    at Object.setup (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/generated/MutationEvent.js:164:17)
    at Object.createImpl (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/generated/MutationEvent.js:151:10)
    at DocumentImpl.createEvent (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Document-impl.js:564:31)
    at DocumentImpl.insertBefore (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Node-impl.js:236:18)
    at DocumentImpl.appendChild (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Node-impl.js:380:17)
    at DocumentImpl.appendChild (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Document-impl.js:367:18)
    at setChild (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/browser/htmltodom.js:266:18)
    at HtmlToDom._parseWithparse5v1 (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/browser/htmltodom.js:90:7)
    at HtmlToDom.appendHtmlToDocument (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/browser/htmltodom.js:47:48)
    at setInnerHTML (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Document-impl.js:52:27)
    at DocumentImpl.write (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/nodes/Document-impl.js:420:7)
    at Document.write (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom/living/generated/Document.js:307:51)
    at Object.exports.jsdom (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom.js:116:21)
    at processHTML (/Library/WebServer/Documents/GiftBox/node_modules/jsdom/lib/jsdom.js:252:26)
    at Object.exports.env.exports.jsdom.env as env
    at Context. (/Library/WebServer/Documents/GiftBox/node_modules/mocha-jsdom/index.js:52:22)

npm ERR! Darwin 15.5.0
npm ERR! argv "/usr/local/bin/node" "/usr/local/bin/npm" "run" "test"
npm ERR! node v6.2.2
npm ERR! npm v3.10.3
npm ERR! code ELIFECYCLE
npm ERR! Sizzle.IO@0.0.0 test: cd js && ../node_modules/.bin/mocha --require test/bootstrap.js test
npm ERR! Exit status 2
npm ERR!
npm ERR! Failed at the Sizzle.IO@0.0.0 test script 'cd js && ../node_modules/.bin/mocha --require test/bootstrap.js test'.
npm ERR! Make sure you have the latest version of node.js and npm installed.
npm ERR! If you do, this is most likely a problem with the Sizzle.IO package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR!     cd js && ../node_modules/.bin/mocha --require test/bootstrap.js test
npm ERR! You can get information on how to open an issue for this project with:
npm ERR!     npm bugs Sizzle.IO
npm ERR! Or if that isn't available, you can get their info via:
npm ERR!     npm owner ls Sizzle.IO
npm ERR! There is likely additional logging output above.

npm ERR! Please include the following file with any support request:
npm ERR!     /Library/WebServer/Documents/GiftBox/npm-debug.log

wogsland commented 8 years ago

The images found by the scraper are not replacing the old images when saved.
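To make the expected behavior concrete, here is a minimal sketch of "replace" semantics on save, as opposed to keeping or appending to the old set. The save code is not shown in this thread, so `replaceImages` and the token shape (an `images` array of file names) are hypothetical.

```javascript
// Hypothetical token shape: `images` as an array of file names is an assumption.
function replaceImages(existingToken, scrapedImages) {
  // Keep the old images when the scraper found nothing, so a failed
  // scrape does not wipe the token.
  if (!Array.isArray(scrapedImages) || scrapedImages.length === 0) {
    return existingToken;
  }
  // Return a copy whose images are exactly the scraped set (replace, not append).
  return Object.assign({}, existingToken, { images: scrapedImages.slice() });
}

const token = { company: 'Example Corp', images: ['old-1.png', 'old-2.png'] };
const saved = replaceImages(token, ['new-logo.png']);
console.log(saved.images); // [ 'new-logo.png' ]
```

Whether an empty scrape should clear the images or leave them alone is a design choice; this sketch leaves them alone.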

shreydesai commented 8 years ago

Version 0.2

wogsland commented 8 years ago

1. Still not replacing existing images.
2. Values, videos & social media not being replaced.
3. PHP unit tests failing:

There were 2 failures:

1) Sizzle\Tests\Ajax\LinkedInScraperTest::testAjaxSuccess
Failed asserting that false is true.

/Library/WebServer/Documents/GiftBox/src/Tests/Ajax/LinkedInScraperTest.php:45

2) Sizzle\Tests\Ajax\LinkedInScraperTest::testAjaxFailureURL
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-''
+'null'

/Library/WebServer/Documents/GiftBox/src/Tests/Ajax/LinkedInScraperTest.php:73

FAILURES!
Tests: 199, Assertions: 1424, Failures: 2, Incomplete: 16.
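The second failure shows the literal string 'null' where the test expected an empty string. The scraper source is not part of this thread, so the actual cause is unknown; one common way this pattern arises is JSON-encoding a null result. The sketch below illustrates that serializer behavior in JavaScript (the endpoint under test is PHP), with `encodeBody` as a hypothetical helper.

```javascript
// Hypothetical helper: returns the response body for a scrape result.
function encodeBody(result) {
  // JSON.stringify(null) yields the four-character string 'null', not an
  // empty string, so the "nothing found" case is handled explicitly.
  if (result === null || result === undefined) {
    return '';
  }
  return JSON.stringify(result);
}

console.log(JSON.stringify(null) === 'null'); // true
console.log(encodeBody(null) === ''); // true
console.log(encodeBody({ name: 'Example Corp' })); // {"name":"Example Corp"}
```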