Open fritzZz opened 7 years ago
It worked twice for me out of a few hundred attempts. I'm assuming we may need new user agents?
I'm not sure, but I need help!
Will post if I find anything out.
I don't understand why you had to start another issue when there are already 2 issues discussing this =_=
LinkedIn is strict. It identifies bot requests and sends a 404 response.
I found that using curl to authenticate with LinkedIn worked well, and I have been able to pull down profiles as well. The issue I am running into is processing the JavaScript so that the result is readable in DOMDocument and I can use XPath to scrape the information. Right now I have a bunch of preg_match trickery sorting through the JSON output that comes down. I wrote my script in PHP as a class; would anyone care to help with it? I tried using php-phantomjs, and it works well unless you hit a redirect or need to use cookies. I am sure that with some time and effort it will work.
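For the "JSON that comes down inside the page" part, one alternative to regex matching is to let an HTML parser find the embedded JSON blocks and then decode them with a JSON parser. Below is a rough sketch in Python (not PHP, but the same idea carries over to DOMDocument plus json_decode). It assumes the data sits in `<script type="application/json">` tags; that markup, the sample HTML, and the names `EmbeddedJSONExtractor` and `extract_embedded_json` are all made up for illustration and are not taken from LinkedIn's actual pages.

```python
import json
from html.parser import HTMLParser


class EmbeddedJSONExtractor(HTMLParser):
    """Collect the decoded contents of <script type="application/json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        # Assumption: the embedded data is marked with type="application/json".
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_json_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self._in_json_script = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip blocks that are not valid JSON


def extract_embedded_json(html):
    """Return a list with one decoded object per embedded JSON block."""
    parser = EmbeddedJSONExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Hypothetical sample page, only to show the shape of the input.
SAMPLE_HTML = """
<html><body>
<script type="application/json">{"firstName": "Ada", "headline": "Engineer"}</script>
<script>var unrelated = 1;</script>
</body></html>
"""

if __name__ == "__main__":
    for doc in extract_embedded_json(SAMPLE_HTML):
        print(doc["firstName"], "-", doc["headline"])
```

Once the JSON is decoded into ordinary dictionaries, the fields can be read by key, so XPath is only needed for whatever is left in plain HTML.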
When I execute the command ./linkedin-scraper https://www.linkedin.com/in/blablabla/ I get this error:
/usr/lib/ruby/gems/1.9.1/gems/mechanize-2.7.4/lib/mechanize/http/agent.rb:942:in `response_read': 999 => -- https://www.linkedin.com/in/blablabla/ (Mechanize::ResponseCodeError)
    from /usr/lib/ruby/gems/1.9.1/gems/mechanize-2.7.4/lib/mechanize/http/agent.rb:270:in `block in fetch'
    from /usr/lib/ruby/1.9.1/net/http.rb:1323:in `block (2 levels) in transport_request'
    from /usr/lib/ruby/1.9.1/net/http.rb:2672:in `reading_body'
    from /usr/lib/ruby/1.9.1/net/http.rb:1322:in `block in transport_request'
    from /usr/lib/ruby/1.9.1/net/http.rb:1317:in `catch'
    from /usr/lib/ruby/1.9.1/net/http.rb:1317:in `transport_request'
    from /usr/lib/ruby/1.9.1/net/http.rb:1294:in `request'
    from /usr/lib/ruby/gems/1.9.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:999:in `request'
    from /usr/lib/ruby/gems/1.9.1/gems/mechanize-2.7.4/lib/mechanize/http/agent.rb:267:in `fetch'
    from /usr/lib/ruby/gems/1.9.1/gems/mechanize-2.7.4/lib/mechanize.rb:464:in `get'
    from /home/fritzzz/Downloads/linkedin-scraper-master/lib/linkedin-scraper/profile.rb:34:in `initialize'
    from ./linkedin-scraper:11:in `new'
    from ./linkedin-scraper:11:in
Does anyone else have the same problem?
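The trace above shows the request failing with response code 999 rather than a standard HTTP status. A scraper can at least tell that case apart from an ordinary "not found" so it stops instead of retrying. Here is a minimal sketch in Python (standard library only, not the Ruby tool itself). Treating 999 as "blocked" is an assumption based only on this trace, and `classify_status`, `fetch_profile`, and the default user agent string are hypothetical names for illustration.

```python
import urllib.error
import urllib.request

# Assumption: 999 means the request was denied, inferred only from the trace.
BLOCKED_CODES = {999}


def classify_status(code):
    """Map a response code to a coarse outcome label."""
    if code in BLOCKED_CODES:
        return "blocked"
    if 200 <= code < 300:
        return "ok"
    if code == 404:
        return "not_found"
    return "error"


def fetch_profile(url, user_agent="example-agent/0.1"):
    """Fetch url and return (outcome, body); body is None on HTTP errors.

    Network-level failures (DNS, timeouts) are not caught here and propagate.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return classify_status(response.status), response.read()
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for responses outside the 2xx range.
        return classify_status(err.code), None


if __name__ == "__main__":
    # No network access needed to see how codes are classified.
    for code in (200, 404, 999, 500):
        print(code, classify_status(code))
```

With this split, a "blocked" outcome can be logged and the run stopped, while "not_found" can be skipped as a bad profile URL.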