spinlud / linkedin-jobs-scraper


Timeout on loading job details #21

Closed: stoodkev closed this issue 1 year ago

stoodkev commented 3 years ago

I've tried the Usage example code provided, both with and without the cookie, and I always end up getting timeouts. Any idea why?

scraper:info Env variable LI_AT_COOKIE detected. Using LoggedInRunStrategy +0ms
  scraper:info Setting chrome launch options {
  headless: true,
  args: [
    '--enable-automation',
    '--start-maximized',
    '--window-size=1472,828',
    '--lang=en-GB',
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-dev-shm-usage',
    '--disable-gpu',
    '--disable-accelerated-2d-canvas',
    '--disable-setuid-sandbox',
    '--disable-dev-shm-usage',
    "--proxy-server='direct://",
    '--proxy-bypass-list=*',
    '--allow-running-insecure-content',
    '--disable-web-security',
    '--disable-client-side-phishing-detection',
    '--disable-notifications',
    '--mute-audio',
    '--lang=en-GB'
  ],
  defaultViewport: null,
  pipe: true,
  slowMo: 100
} +2ms
Listening on port 3001
  scraper:info [Engineer][Europe] Starting new query: query="Engineer" location="Europe" +1s
  scraper:info [Engineer][Europe] Query options {
  locations: [ 'Europe', 'United States' ],
  limit: 33,
  optimize: true,
  filters: { type: [ 'F', 'C' ] }
} +0ms
  scraper:info Setting authentication cookie +2s
  scraper:info [Engineer][Europe] Opening https://www.linkedin.com/jobs/search?keywords=Engineer&location=Europe&f_JT=F%2CC&start=0 +204ms
  scraper:info [Engineer][Europe] Session is valid +3s
  scraper:info [Engineer][Europe] Jobs fetched: 7 +308ms
  scraper:error [Engineer][Europe][1] Timeout on loading job details +0ms
  scraper:error [Engineer][Europe][1] Timeout on loading job details +4s
  scraper:error [Engineer][Europe][1] Timeout on loading job details +4s
  scraper:error [Engineer][Europe][1] Timeout on loading job details +4s
  scraper:error [Engineer][Europe][1] Timeout on loading job details +4s
  scraper:error [Engineer][Europe][1] Timeout on loading job details +4s
  scraper:error [Engineer][Europe][1] Timeout on loading job details +4s
  scraper:info [Engineer][Europe][1] Pagination requested (2) +30s
spinlud commented 3 years ago

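Try using a higher slowMo value:

const { LinkedinScraper } = require('linkedin-jobs-scraper');

// A higher slowMo delays each browser operation (value in milliseconds)
const scraper = new LinkedinScraper({
    headless: true,
    slowMo: 250,
});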
maharshi66 commented 3 years ago

I have been facing the same issue. I tried higher slowMo values, but that hasn't helped. With an anonymous session it shows "No Jobs found", and with an authenticated session it shows "Timeout on loading job details".

stoodkev commented 3 years ago

I fixed the issue for a while by using Tor as a proxy. Now the issue is back: even with a high slowMo (2000), I fetch a few jobs and then it stops working with a "Too many requests" error.
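One generic way to cope with a "Too many requests" response is to retry with exponentially growing pauses instead of a fixed delay. The sketch below is only an illustration: `backoffDelayMs` and `retryWithBackoff` are hypothetical helpers, not part of linkedin-jobs-scraper, and whether the scraper surfaces this error as a rejected promise is an assumption that would need checking.

```javascript
// Hypothetical helpers, not part of linkedin-jobs-scraper.

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 60000) {
    return Math.min(baseMs * 2 ** attempt, maxMs);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `task` (any async function) and retries it when it throws,
// waiting longer before each new attempt.
async function retryWithBackoff(task, { retries = 5, baseMs = 1000, maxMs = 60000 } = {}) {
    for (let attempt = 0; ; attempt++) {
        try {
            return await task();
        } catch (err) {
            if (attempt >= retries) throw err; // give up after the last retry
            await sleep(backoffDelayMs(attempt, baseMs, maxMs));
        }
    }
}
```

With the defaults, the pauses are 1 s, 2 s, 4 s, 8 s and 16 s. Backing off only reduces the request rate; it cannot help if the session itself has been blocked.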

spinlud commented 3 years ago

LinkedIn is becoming very aggressive with rate limiting on anonymous sessions. I suggest using an authenticated session if you can; the limits are much less strict.
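As a rough sketch of that suggestion, a script could check up front whether it is about to run anonymously. The log earlier in this thread shows the scraper picking its logged-in strategy when the `LI_AT_COOKIE` environment variable is detected. The `sessionMode` function is a hypothetical pre-flight check, not something the library provides.

```javascript
// Hypothetical pre-flight check, not part of linkedin-jobs-scraper.
// Reports which kind of session a given environment would lead to,
// based on the LI_AT_COOKIE variable mentioned in the log above.
function sessionMode(env) {
    const cookie = (env.LI_AT_COOKIE || '').trim();
    return cookie.length > 0 ? 'authenticated' : 'anonymous';
}

// Warn early instead of discovering the rate limit halfway through a run.
if (sessionMode(process.env) === 'anonymous') {
    console.warn('LI_AT_COOKIE is not set: expect stricter rate limits.');
}
```

This only checks that the variable is present. It does not verify that the cookie is still valid, which the scraper reports separately with its "Session is valid" log line.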