tomquirk / linkedin-api

👨‍💼 LinkedIn API for Python
https://pypi.org/project/linkedin-api
MIT License
2.01k stars 440 forks

get all posts from profile #117

Closed paper2code-bot closed 3 years ago

paper2code-bot commented 4 years ago

Hi guys,

Hope you are all well!

I was wondering whether I can fetch the list of all posts from a profile (e.g. https://www.linkedin.com/in/philipvollet/detail/recent-activity/shares) with linkedin-api?

Thanks for your inputs and insights on that.

Cheers, X

tomquirk commented 4 years ago

@paper2code-bot thanks for creating the issue! I'll definitely look at adding this.

P.s. Come join us in our Slack community to discuss new features! https://join.slack.com/t/linkedinapi/shared_invite/zt-h7tny6r6-kQILMuCX3Zwjecnz0IbYCQ

BobCashStory commented 3 years ago

I would like the same.

alvaroserrrano commented 3 years ago

Any updates on it?

abinpaul1 commented 3 years ago

This endpoint is a little tricky. The GET request we need to look at is:

GET /voyager/api/identity/profileUpdatesV2?count=10&includeLongTermHistory=true&moduleKey=member-shares%3Aphone&profileUrn=urn%3Ali%3Afsd_profile%3AACoAAAaIVpEB4s1ycBZHbK2o6ieWGJYmqHzBZUY&q=memberShareFeed&start=0

For a given profile, the query string parameters we need to modify are count and start. The first request is as shown above. The response JSON contains a paginationToken. To get the next set of posts (posts 10 to 20), we have to include the returned paginationToken in the subsequent request, so the next request looks like:

GET /voyager/api/identity/profileUpdatesV2?count=10&includeLongTermHistory=true&moduleKey=member-shares%3Aphone&paginationToken=dXJuOmxpOmFjdGl2aXR5OjY4NDE2MTE3Nzk3ODk5Nzk2NDgtMTYzMTE2NzM1OTI4OA%3D%3D&profileUrn=urn%3Ali%3Afsd_profile%3AACoAAAaIVpEB4s1ycBZHbK2o6ieWGJYmqHzBZUY&q=memberShareFeed&start=10
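
As a rough illustration of the two requests above, the query string can be assembled from a dictionary with the standard library's `urllib.parse.urlencode`. The helper name `build_posts_query` is made up for this sketch and is not part of linkedin-api; the parameter names and values are copied from the requests above.

```python
from urllib.parse import urlencode


def build_posts_query(profile_urn, start=0, count=10, pagination_token=None):
    """Build the query string for the profileUpdatesV2 request (hypothetical helper)."""
    params = {
        "count": count,
        "includeLongTermHistory": "true",
        "moduleKey": "member-shares:phone",
        "profileUrn": profile_urn,
        "q": "memberShareFeed",
        "start": start,
    }
    # The token is only present from the second request onwards.
    if pagination_token is not None:
        params["paginationToken"] = pagination_token
    # Sorting the keys reproduces the parameter order shown above; urlencode
    # percent-encodes ":" as %3A and "=" as %3D.
    return urlencode(sorted(params.items()))


urn = "urn:li:fsd_profile:ACoAAAaIVpEB4s1ycBZHbK2o6ieWGJYmqHzBZUY"
first = build_posts_query(urn)
second = build_posts_query(
    urn,
    start=10,
    pagination_token="dXJuOmxpOmFjdGl2aXR5OjY4NDE2MTE3Nzk3ODk5Nzk2NDgtMTYzMTE2NzM1OTI4OA==",
)
print(first)
print(second)
```

The two printed strings should match the query strings of the first and second GET requests above.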

alvaroserrrano commented 3 years ago

As far as the profileUrn parameter goes, is it the encoded profile_urn of whatever user we want, or is it a fixed random string that works for all requests?

Basically, my question is, what information do I need to extract from the first request in order to make the second request work? Is it just the paginationToken?

Thanks

abinpaul1 commented 3 years ago

Yes, the profileUrn parameter is the profile_urn of whatever user we want. The second request only requires the pagination token (from the response to the first request) as an extra parameter, plus the incremented start value. The third request then requires the pagination token from the second response, and so on. Let me know if it works.
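
The token chaining described above could be sketched as a small loop. This is only an illustration under stated assumptions: `fetch_page` is a hypothetical callable standing in for the real HTTP request, `fake_fetch` is a stand-in that serves made-up data, and the `elements` key and the "no token means no more pages" stop condition are assumptions rather than confirmed behaviour of the endpoint.

```python
def iter_post_pages(fetch_page, count=10, max_pages=10):
    """Yield pages, passing each response's paginationToken to the next request.

    fetch_page is a hypothetical callable (start, count, pagination_token) -> dict
    standing in for the real HTTP request.
    """
    token = None
    for page in range(max_pages):
        data = fetch_page(page * count, count, token)
        yield data
        token = data.get("metadata", {}).get("paginationToken")
        # Assumption: a missing token means there are no further pages.
        if not token:
            break


def fake_fetch(start, count, pagination_token):
    """Stand-in for the HTTP call: serves 25 fake posts, 'count' at a time."""
    posts = list(range(25))[start:start + count]
    more = start + count < 25
    return {
        "elements": posts,  # assumed key name, for illustration only
        "metadata": {"paginationToken": f"token-{start + count}"} if more else {},
    }


pages = list(iter_post_pages(fake_fetch))
page_sizes = [len(p["elements"]) for p in pages]
print(page_sizes)
```

Passing the request in as a callable keeps the loop independent of how the request is actually sent, so the same loop can be exercised with fake data before pointing it at the real endpoint.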

alvaroserrrano commented 3 years ago

Thanks for your help. I cannot quite get it to work. Here is what I am trying in order to simply get the posts on the second page.

user = "" # some linkedin user
profile = api.get_profile(user)
posts = api.get_profile_posts(user)
if posts and posts["metadata"]["paginationToken"]:
    url_params = {}
    pagination_token = posts["metadata"]["paginationToken"]
    url_params["paginationToken"] = pagination_token
    profile_urn = profile["profile_urn"].replace(
        "fs_miniProfile", "fsd_profile"
    )
    url_params["profileUrn"] = profile_urn
    tokenized_url = f"{API_BASE_URL}/identity/profileUpdatedV2?count=10&includeLongTermHistory=true&moduleKey=member-shares%3Aphone&{urllib.parse.urlencode(url_params)}&q=memberShareFeed&start=10"
    res = session.get(tokenized_url)
    print(res.status_code)
    print(res.text)
    data = res.json()
    # print(data)

This prints the following output, which basically says that res.json() cannot be parsed:

999
<html><head>
<script type="text/javascript">
window.onload = function() {
  // Parse the tracking code from cookies.
  var trk = "bf";
  var trkInfo = "bf";
  var cookies = document.cookie.split("; ");
  for (var i = 0; i < cookies.length; ++i) {
    if ((cookies[i].indexOf("trkCode=") == 0) && (cookies[i].length > 8)) {
      trk = cookies[i].substring(8);
    }
    else if ((cookies[i].indexOf("trkInfo=") == 0) && (cookies[i].length > 8)) {
      trkInfo = cookies[i].substring(8);
    }
  }

  if (window.location.protocol == "http:") {
    // If "sl" cookie is set, redirect to https.
    for (var i = 0; i < cookies.length; ++i) {
      if ((cookies[i].indexOf("sl=") == 0) && (cookies[i].length > 3)) {
        window.location.href = "https:" + window.location.href.substring(window.location.protocol.length);
        return;
      }
    }
  }

  // Get the new domain. For international domains such as
  // fr.linkedin.com, we convert it to www.linkedin.com
  // treat .cn similar to .com here
  var domain = location.host;
  if (domain != "www.linkedin.com" && domain != "www.linkedin.cn") {
    var subdomainIndex = location.host.indexOf(".linkedin");
    if (subdomainIndex != -1) {
      domain = "www" + location.host.substring(subdomainIndex);
    }
  }

  window.location.href = "https://" + domain + "/authwall?trk=" + trk + "&trkInfo=" + trkInfo +
      "&originalReferer=" + document.referrer.substr(0, 200) +
      "&sessionRedirect=" + encodeURIComponent(window.location.href);
}
</script>
</head></html>

Traceback (most recent call last):
  File "/Users/alvaroserranorivas/projects/MMG-linkedin/main.py", line 60, in <module>
    data = res.json()
  File "/Users/alvaroserranorivas/projects/MMG-linkedin/.venv/lib/python3.9/site-packages/requests/models.py", line 910, in json
    return complexjson.loads(self.text, **kwargs)
  File "/Users/alvaroserranorivas/.pyenv/versions/3.9.6/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Users/alvaroserranorivas/.pyenv/versions/3.9.6/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/alvaroserranorivas/.pyenv/versions/3.9.6/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Has any progress been made so far on this endpoint, at a certain commit or branch?

abinpaul1 commented 3 years ago

The requests library automatically takes care of the URL encoding for you. The error you got is because the data returned was not JSON. You have also misspelled UpdatesV2 as UpdatedV2 in the endpoint URL. This is a sample of what works for me. I have hardcoded a profileUrn; you can change that as needed.

  def get_profile_posts(self):
      # First page: no pagination token yet
      params = {
          "includeLongTermHistory": True,
          "moduleKey": "member-shares:phone",
          "q": "memberShareFeed",
          "count": 10,
          "start": 0,
          # Hardcoded profileUrn; change as needed
          "profileUrn": "urn:li:fsd_profile:ACoAAAaIVpEB4s1ycBZHbK2o6ieWGJYmqHzBZUY",
      }
      res = self._fetch("/identity/profileUpdatesV2", params=params)
      data = res.json()

      # The token for the next page is returned in the response metadata
      pagination_token = data['metadata']['paginationToken']

      # Modify params and include pagination_token as well
      params["start"] = 10
      params["paginationToken"] = pagination_token

      res = self._fetch("/identity/profileUpdatesV2", params=params)
      data2 = res.json()

      # Repeat to get next 10 posts
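
The traceback earlier in the thread came from calling res.json() on an HTML page returned with status 999. A minimal sketch of a guard that surfaces this before the JSON decoder fails is shown here. The names `parse_json_response` and `FakeResponse` are made up for this sketch and are not part of linkedin-api or requests; the guard only assumes a response object with `status_code`, `text` and `json()`.

```python
import json


def parse_json_response(res):
    """Return res.json(), or raise a readable error if the body is not JSON.

    res is any object with status_code, text and json(), such as a
    requests.Response (hypothetical helper, not part of any library).
    """
    if res.status_code != 200:
        raise RuntimeError(
            f"Unexpected status {res.status_code}; body starts with: {res.text[:80]!r}"
        )
    try:
        return res.json()
    except ValueError as exc:  # JSON decode errors are ValueError subclasses
        raise RuntimeError(f"Response was not JSON: {res.text[:80]!r}") from exc


class FakeResponse:
    """Minimal stand-in for a response object, used only for this demo."""

    def __init__(self, status_code, text):
        self.status_code = status_code
        self.text = text

    def json(self):
        return json.loads(self.text)


ok = parse_json_response(FakeResponse(200, '{"metadata": {"paginationToken": "abc"}}'))
print(ok["metadata"]["paginationToken"])

try:
    parse_json_response(FakeResponse(999, "<html><head>"))
except RuntimeError as err:
    error_message = str(err)
    print(error_message)
```

Checking the status code first means a blocked or redirected request is reported with its status and the start of its body, rather than as a JSONDecodeError.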

alvaroserrrano commented 3 years ago

Thanks. Just submitted a PR https://github.com/tomquirk/linkedin-api/pull/181

abinpaul1 commented 3 years ago

Thanks to @alvaroserrrano for adding this feature 🎉🎉

romain130492 commented 2 years ago

Is this still working today?

abinpaul1 commented 2 years ago

@romain130492 It should be. Are you facing issues or errors when using the same?

sumankwan commented 1 year ago

Is this still working?

miniquinox commented 11 months ago

Please post a minimal code sample to get the latest 3 posts by user Bill Gates. I am very lost!

jahanzaibbabar commented 11 months ago

You can get details of the posts from a profile by giving its urn_id or public_id. The sample code to get the first 10 posts of the user with public_id jahanzaib-babar is:

Code

from linkedin_api import Linkedin
from pprint import pprint
import json

api = Linkedin('xxxxx@gmail.com', 'password****')

data = api.get_profile_posts(public_id="jahanzaib-babar", post_count=10)

pprint(data)

with open('data.json', 'w') as file:
    json.dump(data, file)

Note: You can get a maximum of 100 posts from any user by giving its urn_id or public_id.

miniquinox commented 11 months ago

It seems like the package is not being detected. Any suggestions on what I'm doing wrong?
