spinlud / linkedin-jobs-scraper

Apply link is not working #32

Closed calvinomiguel closed 2 years ago

calvinomiguel commented 2 years ago

The apply link isn't working properly. I am getting no value for apply link, even for jobs that have an apply link.

calvinomiguel commented 2 years ago

Okay, I tried to dig into the code myself. Not sure if LinkedIn is using different versions depending on the location of the user, but on my end the apply button is a real button and not a link built with an a tag. This is the component:

<button aria-label="Label" id="someId" class="jobs-apply-button artdeco-button artdeco-button--icon-right artdeco-button--3 artdeco-button--primary ember-view" role="link">
    <li-icon aria-hidden="true" type="link-external" class="artdeco-button__icon" size="small">
        <svg></svg>
    </li-icon>
    <span class="artdeco-button__text">
        Apply now
    </span>
</button>

The solution for me would be:

If the button isn't an Easy Apply button, tell Puppeteer to click it, get the URL from the newly opened tab, set the applyLink value to the fetched URL, and close the tab. Otherwise, set the applyLink value to "n/a".
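The steps proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the library's implementation: `resolveApplyLink` is a hypothetical helper name, `browser` and `page` are assumed to be Puppeteer `Browser` and `Page` instances, and the `role="link"` check is taken from the markup posted in this thread (LinkedIn may serve different markup elsewhere).

```javascript
// Hypothetical helper, not part of linkedin-jobs-scraper.
// `browser` and `page` are assumed to be Puppeteer Browser and Page objects.
async function resolveApplyLink(browser, page) {
  // Find the apply button by the class seen in the posted markup.
  const button = await page.$('.jobs-apply-button');
  if (!button) return 'n/a';

  // Assumption from this thread: the external apply button has role="link",
  // while the Easy Apply button does not.
  const role = await button.evaluate((el) => el.getAttribute('role'));
  if (role !== 'link') return 'n/a';

  // Start waiting for the new tab before clicking, so it cannot be missed.
  // waitForTarget rejects if no matching tab appears before its timeout.
  const pending = browser.waitForTarget((t) => t.opener() === page.target());
  await button.click();
  const target = await pending;

  // The URL may still be an intermediate redirect at this point.
  const applyLink = target.url();

  // Close the new tab so tabs do not pile up during a long scrape.
  const newPage = await target.page();
  if (newPage) await newPage.close();

  return applyLink;
}
```

Waiting on the target whose opener is the current page avoids guessing which index the new tab has in the list of open pages.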

soloviola commented 2 years ago

Hey @calvinomiguel, I am interested in having the applyLink value. Would you mind sharing how you implemented it? Maybe create a feature branch?

Thank you

calvinomiguel commented 2 years ago

Hey, I haven't solved it yet. I usually write my code in plain vanilla JS, but this code is in TypeScript. Although I understand what's going on in the code based on the comments and syntax, I am still having a hard time implementing my own solution. I need help from the original creator.

soloviola commented 2 years ago

OK, I will also try it. FYI, you can always use the any type to bypass type checking in TypeScript.

calvinomiguel commented 2 years ago

Okay thanks! I'll write here in case I get it working. Would be great if you do the same.

calvinomiguel commented 2 years ago

@spinlud do you have a way of fixing this?

calvinomiguel commented 2 years ago

I've found a solution, but I don't know how to implement it in the existing code as I have no clue about TypeScript. However, I've created a separate project to simulate my solution. Now we just need to take my bit and somehow integrate it into the existing code.

So basically what we should do is the following:

First, we get the apply button by selecting the class jobs-apply-button. Before getting the URL, we need to make sure that the button has a role attribute with the value link. If it does, click the button. The click makes the headless browser open a new tab, or a new page in Puppeteer's terms. Now we can use the pages() method to get an array of all pages (tabs) that are open in the browser.

let pages = await browser.pages();

pages now contains an array of objects, each representing one of the open pages. Since we just opened a new page by clicking the aforementioned button, we need to access the last item in the pages array. The last item will most probably be at index 2. However, if you want to be sure, use this to get the last index dynamically:

let index = pages.length - 1;

Now that we have the index, we can get the URL of that page like this:

let applyLink = pages[index]._target._targetInfo.url;
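The snippet above reads the URL through Puppeteer's underscore-prefixed internal fields, which may change between versions. A minimal sketch of the same idea using the public `page.url()` method is shown here. The helper name `lastPageUrl` is hypothetical, and it assumes, as this comment does, that the newly opened tab is the last entry returned by `browser.pages()`.

```javascript
// Hypothetical helper: returns the URL of the last tab reported by Puppeteer.
// `browser` is assumed to be a Puppeteer Browser instance.
async function lastPageUrl(browser) {
  const pages = await browser.pages();
  if (pages.length === 0) return null; // no open tabs at all

  // Assumption: the tab opened by the click is the last one in the array.
  const lastPage = pages[pages.length - 1];

  // Public accessor, used here instead of the internal _target._targetInfo.url.
  return lastPage.url();
}
```

Calling it right after the click, for example `const applyLink = await lastPageUrl(browser);`, would give the same kind of value as the snippet above, though the tab may still be mid-redirect at that moment.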

And that's basically it. But honestly, I have no clue how to properly integrate it into those TypeScript files 😂😂😂 I think I stand a better chance by starting everything from scratch and doing it in pure vanilla JS.

If one of you updates it, please let me know.

spinlud commented 2 years ago

Hi guys, I had a look into this and found there are two types of apply links: standard apply and easy apply.

The next library version (which is already published) is going to support only the standard apply, since the easy apply requires automatically filling in fields in a modal, which is beyond the scope of this library. To extract the apply URL, the browser needs to click the apply link button, wait for the new page to open, and extract the URL. This is a slow operation, so I have decided to enable it as a query option:

const query = {
    query: "Engineer",
    options: {
        locations: ['United States'],
        limit: 10,
        applyLink: true, // defaults to false
    },
};

Let me know if this works for you 🍺

calvinomiguel commented 2 years ago

Even though it first takes the user to LinkedIn and then redirects, it works. A better approach would really be to have the external link itself and not the link that redirects to the external link.
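One possible way to recover the external link without following the redirect is sketched here. It rests on an unconfirmed assumption: that the intermediate LinkedIn URL carries the destination in a query parameter named `url`. Neither that parameter name nor the helper `extractExternalUrl` comes from this thread or from the library, so treat this purely as an illustration.

```javascript
// Hypothetical helper, not part of linkedin-jobs-scraper.
// Assumption: the redirect link embeds the destination in a `url` query parameter.
function extractExternalUrl(applyLink) {
  let parsed;
  try {
    parsed = new URL(applyLink);
  } catch {
    // Not a valid absolute URL (for example "n/a"): return the input unchanged.
    return applyLink;
  }
  // searchParams.get() percent-decodes the value for us.
  const destination = parsed.searchParams.get('url');
  // Fall back to the original link when the parameter is absent or empty.
  return destination || applyLink;
}
```

If the parameter is missing, the function returns its input, so it is safe to apply to every applyLink value regardless of its shape.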