Add BEGIN event. Set the limit to 'All jobs' in case of 0 as value

nibomed commented 10 months ago

I have added a feature to scrape the total number of jobs and included this number in a new BEGIN event. If the limit is set to zero, it will now default to the total job count. I have also updated the README to reflect these changes.
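A minimal sketch of the limit=0 rule described above, not the code from this PR. The helper name `resolve_limit` and its parameters are hypothetical and are not part of the library's API.

```python
from typing import Optional


def resolve_limit(limit: int, total_jobs: Optional[int]) -> Optional[int]:
    """Hypothetical helper: treat limit=0 as 'all jobs'.

    total_jobs is the count scraped from the search page, or None
    if it could not be read. A None result means 'no known limit'.
    """
    if limit < 0:
        raise ValueError("limit must be zero or a positive integer")
    if limit == 0:
        # Zero means 'all jobs': fall back to the scraped total.
        return total_jobs
    # Any explicit positive limit is kept as given.
    return limit


print(resolve_limit(0, 250))   # limit taken from the scraped total
print(resolve_limit(25, 250))  # explicit limit is kept
```

Returning `None` when the total is unknown is only one possible choice; the scraper could equally well stop when no more jobs are returned.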

While this number may not strictly indicate how many jobs are available for scraping, it can provide an estimate of how long the entire process might take. It’s worth noting that currently, LinkedIn does not provide more than 1000 jobs per query (no jobs or irrelevant jobs appear after start=1000).

As I am not a Python expert, any feedback or criticism is welcome.

spinlud commented 8 months ago

Hi there! I am not sure I understand what the purpose of the BEGIN event is. How do you plan to use it?

Btw, I am OK with adding the option to have limit=0 -> all jobs 😉

nibomed commented 8 months ago

Firstly, upon reviewing the diff, I noticed that more commits than intended were included in this PR. My apologies for the oversight. I can close this PR and split the changes into a few different PRs.

As for the BEGIN event, the main purpose is to show, before parsing starts, how many jobs will be parsed. I see at least two useful applications:

1. It allows for an estimation of the time required for parsing.
2. It helps handle unusual behavior from LinkedIn. For instance, currently, LinkedIn returns no jobs after https://www.linkedin.com/jobs/search/....&start=1000, even if the query yields more than 1000 jobs. In this case, the user (me) can react to this before parsing happens.
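To illustrate both applications, here is a rough sketch of what a BEGIN handler could do with the total job count. Everything in it is an assumption made for illustration: the handler name `on_begin`, its signature, the `seconds_per_job` default, and the `LINKEDIN_RESULT_CAP` constant, which only encodes the 1000-job behavior observed in this thread and is not a documented LinkedIn limit.

```python
# Observed behavior from this thread, not a documented constant.
LINKEDIN_RESULT_CAP = 1000


def on_begin(total_jobs: int, seconds_per_job: float = 2.0) -> dict:
    """Hypothetical BEGIN handler: plan the run before parsing starts."""
    # Jobs beyond the cap are assumed to be unreachable for one query.
    reachable = min(total_jobs, LINKEDIN_RESULT_CAP)
    return {
        "total_jobs": total_jobs,
        "reachable_jobs": reachable,
        # Rough time estimate; seconds_per_job is a guessed average.
        "estimated_seconds": reachable * seconds_per_job,
        # True when the query reports more jobs than can be fetched.
        "truncated": total_jobs > LINKEDIN_RESULT_CAP,
    }


plan = on_begin(3500)
if plan["truncated"]:
    print("Query exceeds the observed cap; consider narrowing the filters.")
print(f"Estimated run time: {plan['estimated_seconds']:.0f} s")
```

With a flag like `truncated`, the caller could split the query into narrower ones before any parsing happens.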

spinlud / py-linkedin-jobs-scraper

Add BEGIN event. Set the limit to 'All jobs' in case of 0 as value #69