czekan / liscraper

LinkedIn search results scraper

Page Iteration returning first page only #2

Closed workwareusa closed 7 years ago

workwareusa commented 7 years ago

I ran a search with "Cisco" as the keyword as a test, and it only returns the first page of results. I also tested this from the command line with the --pages option, with the same result.

Search Keyword: Cisco
Pages requested: 10

2017-06-01 11:53:27 [scrapy.core.engine] INFO: Spider opened
2017-06-01 11:53:27 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-06-01 11:53:27 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-06-01 11:53:28 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.linkedin.com/uas/login> (referer: None)
2017-06-01 11:53:28 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.linkedin.com/nhome/?trk=> from <POST https://www.linkedin.com/uas/login-submit>
2017-06-01 11:53:29 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.linkedin.com/nhome/?trk=> (referer: https://www.linkedin.com/uas/login)
2017-06-01 11:53:29 [linkedin_spider] DEBUG: Successfully logged in.
2017-06-01 11:53:31 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.linkedin.com/search/results/index/?keywords=cisco> (referer: https://www.linkedin.com/nhome/?trk=)
2017-06-01 11:53:31 [linkedin_spider] DEBUG: Problem with parsing json data.
2017-06-01 11:53:31 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.linkedin.com/search/results/index/?keywords=cisco> OrderedDict([('first_name', u'Stephen'), ('last_name', u'Tessitore'), ('position', u'Senior Channel Marketing Leader'), ('company', u'Cisco | Partner Enablement Specialist'), ('city', u'Tampa/St. Petersburg'), ('country', u'Florida Area')])
2017-06-01 11:53:31 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.linkedin.com/search/results/index/?keywords=cisco> OrderedDict([('first_name', u'Tabitha'), ('last_name', u'Jones'), ('position', u'Partner Account Manager - Central Florida'), ('company', u'Cisco'), ('city', u'Tampa/St. Petersburg'), ('country', u'Florida Area')])
2017-06-01 11:53:31 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.linkedin.com/search/results/index/?keywords=cisco> OrderedDict([('first_name', u'Robert'), ('last_name', u'White'), ('position', u'Cisco Business Development Manager '), ('company', ''), ('city', u'Tampa/St. Petersburg'), ('country', u'Florida Area')])
2017-06-01 11:53:31 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.linkedin.com/search/results/index/?keywords=cisco> OrderedDict([('first_name', u'Mark'), ('last_name', u'Jacobs'), ('position', u'Vertical Sales Engineer'), ('company', u'Cisco'), ('city', u'Tampa/St. Petersburg'), ('country', u'Florida Area')])
2017-06-01 11:53:31 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.linkedin.com/search/results/index/?keywords=cisco> OrderedDict([('first_name', u'Justin'), ('last_name', u'Eggert'), ('position', u'Partner Marketing Manager'), ('company', u'Cisco'), ('city', u'Tampa/St. Petersburg'), ('country', u'Florida Area')])
2017-06-01 11:53:31 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.linkedin.com/search/results/index/?keywords=cisco> OrderedDict([('first_name', u'James'), ('last_name', u'Risler'), ('position', u'Manager - Security Content Development'), ('company', u'Cisco Systems'), ('city', u'Tampa/St. Petersburg'), ('country', u'Florida Area')])
2017-06-01 11:53:31 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.linkedin.com/search/results/index/?keywords=cisco> OrderedDict([('first_name', u'Heather'), ('last_name', u'Box'), ('position', u'Cisco Distribution Partner Development-Hospitiality'), ('company', ''), ('city', u'Tampa/St. Petersburg'), ('country', u'Florida Area')])
2017-06-01 11:53:31 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.linkedin.com/search/results/index/?keywords=cisco> OrderedDict([('first_name', u'Andrew'), ('last_name', u'Maxey'), ('position', u'Cloud Solutions Architect'), ('company', ''), ('city', u'Tampa/St. Petersburg'), ('country', u'Florida Area')])
2017-06-01 11:53:31 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.linkedin.com/search/results/index/?keywords=cisco> OrderedDict([('first_name', u'Matt'), ('last_name', u'Lough'), ('position', u'Regional Manager'), ('company', ''), ('city', u'Orange County'), ('country', u'California Area')])
2017-06-01 11:53:31 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.linkedin.com/search/results/index/?keywords=cisco> OrderedDict([('first_name', u'Rob'), ('last_name', u'Richute'), ('position', u"Making rhombus' fit in triangle shaped holes"), ('company', ''), ('city', u'Tampa/St. Petersburg'), ('country', u'Florida Area')])
2017-06-01 11:53:31 [scrapy.core.engine] INFO: Closing spider (finished)
2017-06-01 11:53:31 [scrapy.extensions.feedexport] INFO: Stored csv feed (10 items) in: cisco.csv
2017-06-01 11:53:31 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 2936,
 'downloader/request_count': 4,
 'downloader/request_method_count/GET': 3,
 'downloader/request_method_count/POST': 1,
 'downloader/response_bytes': 93955,
 'downloader/response_count': 4,
 'downloader/response_status_count/200': 3,
 'downloader/response_status_count/302': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2017, 6, 1, 15, 53, 31, 153053),
 'item_scraped_count': 10,
 'log_count/DEBUG': 17,
 'log_count/INFO': 8,
 'memusage/max': 46800896,
 'memusage/startup': 46800896,
 'request_depth_max': 2,
 'response_received_count': 3,
 'scheduler/dequeued': 4,
 'scheduler/dequeued/memory': 4,
 'scheduler/enqueued': 4,
 'scheduler/enqueued/memory': 4,
 'start_time': datetime.datetime(2017, 6, 1, 15, 53, 27, 996262)}
2017-06-01 11:53:31 [scrapy.core.engine] INFO: Spider closed (finished)
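
The log above contains a single crawled search URL, which matches the report that only the first page came back. For longer logs, a rough way to check this is to pull out every URL the engine reports as crawled and count the ones that look like search pages. The sketch below is only an illustration and not part of liscraper: the helper names and the "/search/results/" marker are assumptions, and the sample log is a shortened copy of the lines above.

```python
import re

# Matches the URL in Scrapy engine lines such as:
#   ... [scrapy.core.engine] DEBUG: Crawled (200) <GET https://...> (referer: ...)
CRAWLED_RE = re.compile(
    r"\[scrapy\.core\.engine\] DEBUG: Crawled \(\d+\) <GET ([^>\s]+)>"
)


def crawled_urls(log_text):
    """Return every URL the engine reports as crawled, in log order."""
    return CRAWLED_RE.findall(log_text)


def search_requests(log_text, marker="/search/results/"):
    """Keep only crawled URLs that look like search-result pages.

    The marker is an assumption based on the URL seen in this issue's log.
    """
    return [url for url in crawled_urls(log_text) if marker in url]


# Shortened copy of the log from this issue.
sample_log = """\
2017-06-01 11:53:28 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.linkedin.com/uas/login> (referer: None)
2017-06-01 11:53:29 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.linkedin.com/nhome/?trk=> (referer: https://www.linkedin.com/uas/login)
2017-06-01 11:53:31 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.linkedin.com/search/results/index/?keywords=cisco> (referer: https://www.linkedin.com/nhome/?trk=)
"""

if __name__ == "__main__":
    pages = search_requests(sample_log)
    print("search pages crawled:", len(pages))
    for url in pages:
        print(" ", url)
```

With 10 pages requested, a count of 1 here suggests that no follow-up page requests were ever scheduled, rather than that later pages were fetched and failed to parse.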

workwareusa commented 7 years ago

cisco.csv.zip

czekan commented 7 years ago

@workwareusa Can you send me the output of pip freeze? Which version of liscraper are you using?

workwareusa commented 7 years ago

liscraper==0.2.1
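
For anyone who wants to check the installed version from a saved pip freeze listing, here is a minimal sketch. The function name and the sample listing are made up for illustration (examplepkg is not a real dependency of liscraper); only the liscraper==0.2.1 line comes from this thread.

```python
def pinned_version(freeze_output, package):
    """Return the version pinned for `package` in `pip freeze` output, or None.

    Only plain `name==version` lines are recognised; other line formats
    are skipped.
    """
    wanted = package.lower()
    for line in freeze_output.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep and name.lower() == wanted:
            return version
    return None


# Hypothetical listing; only the liscraper line is taken from this issue.
sample_freeze = """\
examplepkg==1.0.0
liscraper==0.2.1
"""

if __name__ == "__main__":
    print(pinned_version(sample_freeze, "liscraper"))  # prints 0.2.1
```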

workwareusa commented 7 years ago

I did a git clone yesterday, so let me refresh my environment. I apologize if this is on my end.

workwareusa commented 7 years ago

I'm closing this issue; the problem was on my side. It works beautifully, thank you.