Closed LukeLauterbach closed 1 year ago
CrossLinked does not pull employee names from the organization's LinkedIn page. It relies on the company name being present in the user's title during search scraping. Therefore, users with two current jobs, or those who disable their profile visibility, may not appear. I don't expect this to account for all missed accounts, but it may be a contributing factor.
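The title-matching behavior described above can be illustrated roughly as follows. This is a minimal sketch, not CrossLinked's actual code: the sample titles, the `names_for_company` helper, and the assumption that a result title looks like `Name - Role - Company | LinkedIn` are all hypothetical.

```python
# Hypothetical search-result titles; real titles vary by engine and profile.
SAMPLE_TITLES = [
    "Jane Doe - Senior Engineer - The Hershey Company | LinkedIn",
    "John Smith - Consultant - Acme Corp | LinkedIn",
    # Works at Hershey too, but the title shows a second current job instead.
    "Maria Garcia - Founder - Side Project LLC | LinkedIn",
    "Ann Lee - Analyst - the hershey company | LinkedIn",
]


def names_for_company(titles, company):
    """Return names from titles that mention the company (case-insensitive).

    Assumes the person's name is the text before the first " - " separator.
    """
    matches = []
    for title in titles:
        if company.lower() not in title.lower():
            continue  # company missing from the title, so the profile is skipped
        name, separator, _ = title.partition(" - ")
        if separator and name.strip():
            matches.append(name.strip())
    return matches


found = names_for_company(SAMPLE_TITLES, "The Hershey Company")
print(found)
```

In this sketch, the profile whose title shows a different current job is never matched, even if the person also works at the target company.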
I'm not opposed to mentioning the limitation in the README. However, I feel it is more a limitation of programmatically scraping search engines without an API key or proper authorization, which affects more tools than just CrossLinked.
I'm curious to get your thoughts. What might a potential warning in the README look like?
Closed as inactive.
When using CrossLinked on large organizations, it appears to be limited by Google's 300 result limit (source)(source) and Bing's 1000 result limit (source). In practical application, I can't pull more than 300ish results for a single organization, regardless of the number of employees on LinkedIn.
Using The Hershey Company as an example, there are 7,360 potential employees. For this example, a fresh IP address from NordVPN was used, and relatively high jitter and timeout values were set.
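To put those caps in perspective, here is a rough back-of-the-envelope calculation. It is a sketch under optimistic assumptions: every search result maps to one distinct employee, and the two engines return no overlapping results. The `max_coverage` helper is hypothetical and only illustrates the arithmetic.

```python
def max_coverage(result_caps, employee_count):
    """Upper bound on the fraction of employees reachable given per-engine caps."""
    reachable = min(sum(result_caps), employee_count)
    return reachable / employee_count


EMPLOYEES = 7360    # potential Hershey employees listed on LinkedIn
GOOGLE_CAP = 300    # Google's result limit cited above
BING_CAP = 1000     # Bing's result limit cited above

google_only = max_coverage([GOOGLE_CAP], EMPLOYEES)
both_engines = max_coverage([GOOGLE_CAP, BING_CAP], EMPLOYEES)

print(f"Google only:  {google_only:.1%}")   # 4.1%
print(f"Both engines: {both_engines:.1%}")  # 17.7%
```

Even in the best case, fewer than one in five employees of an organization this size could be enumerated.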
We can run CrossLinked for Hershey:
We can observe that Google only returns 300 results, as expected. I don't know what is going on with Bing, but I killed it after it appeared to hang. I'm not sure what is happening here, as 34 isn't anywhere close to Bing's 1,000-result limit.
If I am correct (and feel free to let me know if I am not), I fully recognize this is likely to be a very difficult problem to fix. For now, it might be worth mentioning the limitation in the README.