spinlud / py-linkedin-jobs-scraper

MIT License
357 stars 99 forks

Company filter only returns 25 job search results #7

Open sirongwang opened 3 years ago

sirongwang commented 3 years ago

I tried the company filter. It works well for the most part, but there seems to be an issue: it only returns the first 25 jobs, while there are hundreds on the website. I tried adding something like "start=50" to the URL, but it came back with the same results.

spinlud commented 3 years ago

Hi there, are you running anonymous or authenticated?

Note that in both modes you have to find a slow_mo value that doesn't cause your application to receive 429 Too Many Requests from the server; those responses will likely make job loading and pagination fail. For anonymous mode I suggest a slow_mo value of no less than 1.3, and for an authenticated session a value of at least 0.4 (it largely depends on LinkedIn's rate-limiting settings at the time).
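As a rough illustration of the thresholds above, here is a minimal sketch that picks a slow_mo floor depending on the mode. The helper `pick_slow_mo` and the two constants are hypothetical names, not part of the library, and the sketch assumes that a non-empty LI_AT_COOKIE environment variable is what marks a session as authenticated.

```python
import os
from typing import Mapping, Tuple

# Lower bounds suggested in this thread; LinkedIn's rate limits may change.
MIN_SLOW_MO_ANONYMOUS = 1.3
MIN_SLOW_MO_AUTHENTICATED = 0.4


def pick_slow_mo(env: Mapping[str, str] = os.environ) -> Tuple[str, float]:
    """Return (mode, slow_mo floor) based on whether LI_AT_COOKIE is set.

    Hypothetical helper: it only inspects the environment and does not
    talk to LinkedIn or to the scraper library.
    """
    # Assumption: a non-empty LI_AT_COOKIE means an authenticated session.
    if env.get("LI_AT_COOKIE", "").strip():
        return "authenticated", MIN_SLOW_MO_AUTHENTICATED
    return "anonymous", MIN_SLOW_MO_ANONYMOUS


if __name__ == "__main__":
    mode, slow_mo = pick_slow_mo()
    print(f"mode={mode}, use slow_mo >= {slow_mo}")
```

The returned value is only a starting point; if 429 responses still show up, raising slow_mo further is the first thing to try.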

This is my attempt in anonymous mode:

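from linkedin_jobs_scraper import LinkedinScraper
from linkedin_jobs_scraper.query import Query, QueryOptions, QueryFilters

# Anonymous mode: a single worker and a slow_mo above 1.3 to avoid 429 responses.
scraper = LinkedinScraper(
    headless=False,
    max_workers=1,
    slow_mo=1.4,
)

# Request up to 500 jobs from the company's job search URL.
query = Query(
    options=QueryOptions(
        locations=['Worldwide'],
        limit=500,
        filters=QueryFilters(
            company_jobs_url='https://www.linkedin.com/jobs/search/?f_C=25170579&geoId=92000000'
        )
    )
)

scraper.run(query)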

[Image: Screenshot 2021-03-28 at 18 03 12]

This is the code for authenticated mode:

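from linkedin_jobs_scraper import LinkedinScraper
from linkedin_jobs_scraper.query import Query, QueryOptions, QueryFilters

# Authenticated mode: assumes the LI_AT_COOKIE environment variable is set
# before running; a lower slow_mo is usually tolerated here.
scraper = LinkedinScraper(
    headless=False,
    max_workers=1,
    slow_mo=0.4,
)

# Same query as in anonymous mode: up to 500 jobs from the company's job search URL.
query = Query(
    options=QueryOptions(
        locations=['Worldwide'],
        limit=500,
        filters=QueryFilters(
            company_jobs_url='https://www.linkedin.com/jobs/search/?f_C=25170579&geoId=92000000'
        )
    )
)

scraper.run(query)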

[Image: Screenshot 2021-03-28 at 17 55 35]

Let me know if this solves your issue. Cheers!

spinlud commented 3 years ago

Ok! So is your issue solved?

PARODBE commented 2 years ago

Hi!

In my case, when I search by company with your code, the connection is always in anonymous mode, and I don't understand why...