akramaznakour / linkedin-scraper

Enhanced LinkedIn Job Search Chrome Extension
https://www.producthunt.com/products/linkedin-job-search-extension
MIT License
31 stars 7 forks

Issue and solution: the number of items on each page is sometimes not correct #5

Open WalterJX opened 1 year ago

WalterJX commented 1 year ago

Issue: When a search returns many results (e.g. 500), each page except the last should contain 25 items. However, the scraper sometimes finds fewer than 25 items on a page.

Root cause: The "scroll down" operation on each page does not always reach the very bottom of that page. The HTML changes dynamically and loads more items as you scroll down, but "endScrollTop" is calculated only once, before scrolling starts, so its value is no longer valid once more items have loaded. This is the line of code in question: const endScrollTop = element.scrollHeight - element.clientHeight; (content/linkedin-scraper.js, line 18)

Suggested solution: Move the above line of code to the start of the scrollStep(timestamp) function, so that "endScrollTop" is recalculated on every step of a scroll-down operation. You might also want to increase the timeout and duration when calling the simulateRealScrollToEnd() method.

(Forgive me for giving suggestions in words instead of code. I created a fork and added some features of my own, and because I am not good at JavaScript, my code looks pretty messy.)
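
For illustration only, the suggested change might look roughly like the sketch below. Only "endScrollTop", scrollStep(timestamp), simulateRealScrollToEnd() and the "duration" idea come from this thread; the function signature, the linear easing, the Promise wrapper and the injectable "raf" parameter are assumptions made so the example is self-contained, and the real code in content/linkedin-scraper.js may be structured differently.

```javascript
// Hypothetical reconstruction, not the repository's actual code.
// `raf` defaults to the browser's requestAnimationFrame and is a parameter
// only so the function can be driven by hand outside a browser.
function simulateRealScrollToEnd(
  element,
  duration = 2000,
  raf = globalThis.requestAnimationFrame
) {
  return new Promise((resolve) => {
    const startScrollTop = element.scrollTop;
    let startTime = null;

    function scrollStep(timestamp) {
      if (startTime === null) startTime = timestamp;

      // Recomputed on every frame: scrollHeight grows as more items
      // are lazy-loaded, so a value captured once up front goes stale.
      const endScrollTop = element.scrollHeight - element.clientHeight;

      // Linear progress from 0 to 1 over `duration` milliseconds.
      const progress = Math.min((timestamp - startTime) / duration, 1);
      element.scrollTop =
        startScrollTop + (endScrollTop - startScrollTop) * progress;

      if (progress < 1) {
        raf(scrollStep);
      } else {
        resolve();
      }
    }

    raf(scrollStep);
  });
}
```

With this shape, a longer duration (for example simulateRealScrollToEnd(list, 4000), where "list" is whatever scrollable element holds the results) gives the page more frames in which to load items before the final position is computed. Items that load after the last frame would still be missed, so a real fix might also need to re-check the height once scrolling has finished.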

akramaznakour commented 1 year ago

Hi @WalterJX, thank you for reporting this issue and sharing your insights. I will check out your fork for inspiration and fix this issue as soon as I find the time.

SamMrach commented 1 year ago

@akramaznakour please assign this to me, I can fix it.

akramaznakour commented 1 year ago

Hi @SamMrach, thanks a lot for the help. I have assigned the issue to you.