Open agusmdev opened 1 year ago
I am noticing similar behavior, but it probably happens 70% of the time. I'm not quite sure how to resolve this issue.
Got the same problem too
Same problem here
Did someone find a solution to replace the current code?
res = self._fetch(
    f"/search/blended?{urlencode(default_params, safe='(),')}",
    headers={"accept": "application/vnd.linkedin.normalized+json+2.1"},
)
data = res.json()
Same problem :(
Please update if anyone has any resolution on this.
I am also willing to work on this with anyone who is interested, maybe on a call. Let me know.
same here, toying around with it atm
@agusmdev do you happen to know where in the documentation it mentions the new endpoint?
Nowhere. I found it by running a search query from the browser with my LinkedIn account.
gotcha, was digging through the docs and I wasn't able to find anything so makes sense, I'll take a look there
The code below is my basic implementation for getting a list of the first 10 employees. One request returns exactly 10, so offset can be used to start from a later employee (for example, from the 10th) instead of the first. It parses the response and returns some basic data. Very little is parsed since I don't need all the data, but I think this might be a good starting point.
import json
import os

# `API` is assumed to be an authenticated linkedin_api.Linkedin instance.
def fetch_employees(company_id, offset=0):
    cache = f"companies/{company_id}/employees_{offset}.json"
    if os.path.exists(cache):
        with open(cache) as f:
            r = json.load(f)
        print(f"[fetch_employees()]: OK! Using cached file \"{cache}\".")
    else:
        uri = f"/graphql?includeWebMetadata=true&variables=(start:{offset},origin:COMPANY_PAGE_CANNED_SEARCH,query:(flagshipSearchIntent:SEARCH_SRP,queryParameters:List((key:currentCompany,value:List({company_id})),(key:resultType,value:List(PEOPLE))),includeFiltersInResponse:false))&&queryId=voyagerSearchDashClusters.b0928897b71bd00a5a7291755dcd64f0"
        r = API._fetch(uri)
        if not r.ok:
            print(f"[fetch_employees()]: Fail! LinkedIn returned status code {r.status_code} ({r.reason})")
            return
        print(f"[fetch_employees()]: OK! LinkedIn returned status code {r.status_code} ({r.reason})")
        r = r.json()
        # Cache request
        os.makedirs(f"companies/{company_id}", exist_ok=True)
        with open(cache, "w") as f:
            json.dump(r, f)
    # Drop the cache file if the response carries an error instead of results
    if not r["data"]["searchDashClustersByAll"]:
        print("Bad json. LinkedIn returned error:", r["errors"][0]["message"])
        os.remove(cache)
        return
    return r["data"]["searchDashClustersByAll"]
def get_employees(company_id, offset=0):
    def get_item_key(item, keys):
        # Follow a chain of keys into nested dicts; return "" if any key is missing
        if isinstance(keys, str):
            keys = [keys]
        cur = item
        for key in keys:
            if cur and key in cur.keys():
                cur = cur[key]
            else:
                return ""
        return cur

    # Pass the offset through so that pages after the first can be requested
    j = fetch_employees(company_id, offset)
    if not j:
        return []
    if not j["_type"] == "com.linkedin.restli.common.CollectionResponse":
        return []
    employees = []
    for cluster in j["elements"]:
        if not cluster["_type"] == "com.linkedin.voyager.dash.search.SearchClusterViewModel":
            continue
        for it in cluster["items"]:
            if not it["_type"] == "com.linkedin.voyager.dash.search.SearchItem":
                continue
            e = it["item"]["entityResult"]
            if not e or not e["_type"] == "com.linkedin.voyager.dash.search.EntityResultViewModel":
                continue
            try:
                #print("\nEmployee:")
                #print(" ", get_item_key(e, ["title", "text"]))
                #print(" ", get_item_key(e, "entityUrn"))
                #print(" ", get_item_key(e, ["primarySubtitle", "text"]))
                #print(" ", get_item_key(e, ["secondarySubtitle", "text"]))
                employees.append({
                    "title": get_item_key(e, ["title", "text"]),
                    "entityUrn": get_item_key(e, "entityUrn"),
                    "primarySubtitle": get_item_key(e, ["primarySubtitle", "text"]),
                    "secondarySubtitle": get_item_key(e, ["secondarySubtitle", "text"]),
                })
            except Exception as exc:
                print(f"Exception {exc} while processing employees of id {company_id}")
                exit(1)
    return employees
Is this code working?
It is working for me in my program :)
Oh okay, I will try on my end. If it doesn't work, would you be able to connect with me on a Meet call?
I don't think I'm the right person to answer this kind of question ;). All I did in my code was pure guessing + looking at lots of JSON requests. But if anything less serious happens, you could try writing here, so it will also help others if they stumble upon the same problem.
What is the company_id here?
Wrote a small function for getting company_id
from urllib.parse import quote

def getCompanyID(company_link):
    try:
        company_username = company_link.split('.com/company/')[1].replace('/', '')
    except (IndexError, AttributeError):
        print("Wrong company URL. The expected format is https://www.linkedin.com/company/company_Username/!")
        return None
    api_link = 'https://www.linkedin.com/voyager/api/organization/companies?decorationId=com.linkedin.voyager.deco.organization.web.WebCompanyStockQuote-2&q=universalName&universalName={}'.format(quote(company_username))
    # `api` is the poster's own client object; `_get` is kept as posted
    resp = api._get(api_link).json()
    # The numeric id is the last segment of the entity URN
    company_id = resp.get('elements')[0].get('entityUrn').split(':')[-1]
    return company_id
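The slug extraction at the top of this function can also be done with `urllib.parse`, which tolerates query strings and extra path segments such as `/about/`. A minimal sketch; `company_slug_from_url` is a hypothetical helper name and is not part of linkedin_api:

```python
from urllib.parse import urlparse


def company_slug_from_url(company_link):
    """Return the universal name from a LinkedIn company URL, or None."""
    # Split the path into non-empty segments, e.g. ["company", "google", "about"].
    parts = [p for p in urlparse(company_link).path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "company":
        return parts[1]
    return None


print(company_slug_from_url("https://www.linkedin.com/company/google/"))
```

Returning None for anything that is not a company URL keeps the "wrong URL" case explicit without a try/except.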
company_id is the numerical id of the company (google = 1441, facebook = 76987811). It can be retrieved as a URN from linkedin_api and then converted to a numerical id using the built-in helper function.
Example snippet:
from linkedin_api.utils import helpers
company = API.get_company("google")
company_id = helpers.get_id_from_urn(company["entityUrn"])
employees = get_employees(company_id)
# Print name of first 10 employees
for e in employees:
    print(e["title"])
PS: There were some minor typos in my initial code (https://github.com/tomquirk/linkedin-api/issues/313#issuecomment-1574333025) which I fixed already. So just re-paste it.
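Since each request returns one page of 10 results, collecting more employees means calling the function repeatedly with a growing offset. A rough sketch, assuming `get_employees(company_id, offset)` honours its `offset` argument and returns an empty list when nothing is left; `collect_employees` and `max_pages` are hypothetical names, and `fake_get_employees` is only a stand-in so the example runs without network access:

```python
def collect_employees(get_page, company_id, page_size=10, max_pages=5):
    """Call get_page(company_id, offset) until a page comes back empty or short."""
    employees = []
    for page in range(max_pages):
        batch = get_page(company_id, page * page_size)
        if not batch:
            break
        employees.extend(batch)
        # A short page means there are no further results to request.
        if len(batch) < page_size:
            break
    return employees


# Stand-in for get_employees(): pretends the company has 23 employees.
def fake_get_employees(company_id, offset=0):
    names = [f"Employee {i}" for i in range(23)]
    return [{"title": n} for n in names[offset:offset + 10]]


all_employees = collect_employees(fake_get_employees, "1441")
print(len(all_employees))
```

The `max_pages` cap is there to bound the number of requests per run.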
might be a little out of the loop here, how does this code fix the search function?
As an update, I think I was able to get the mappings right for the new search endpoint. The current tests show that 16/24 tests fail, all of them tied to the search function, so I'll be forking the repo and seeing if I can bring it back up to 24/24.
It doesn't. I wanted to use search_people in my project, but it was broken. So I wrote my own small variation of it and posted it in case someone needed it. It can output only minimal information, but that's okay for me, since that was all I needed. If anyone needs more than that, I thought that code would've been a nice little foundation.
I used this endpoint 2 days ago and it was working correctly, but it seems LinkedIn updated their API, and now instead of using the endpoint /search/blended they use /graphql?variables=....
Hey! Where can I find information about this new API? Please share links to the reference.
https://github.com/tomquirk/linkedin-api/issues/313#issuecomment-1573709203
The output from the new endpoint is a little confusing. Does anyone know how to make sense of it? It seems like it's returning multiple attributes that come together to make a single profile on the website.
I've noticed they are using 2 endpoints to get people by different params. The first one returns only URN ids and the second one returns a list of profiles for a list of URN ids. I can fetch the URN ids, but for some reason the second endpoint returns 400 for me. It probably needs some specific headers or something; I don't know yet.
My solution:
# Both functions are methods of the Linkedin class; `json` must be imported at module level.
def graphql_search_people(
    self,
    job_title: str,
    regions: list[str],
    limit: int | None,
    offset: int
) -> list[dict]:
    """Search people by job_title and regions and return their profiles."""
    count = Linkedin._MAX_SEARCH_COUNT
    if limit is None:
        limit = -1
    results = []
    while True:
        # when we're close to the limit, only fetch what we need to
        if limit > -1 and limit - len(results) < count:
            count = limit - len(results)
        default_params = {
            "origin": "FACETED_SEARCH",
            "start": len(results) + offset,
        }
        res = self._fetch(
            (f"/graphql?variables=(start:{default_params['start']},origin:{default_params['origin']},"
             f"query:(keywords:{job_title},flagshipSearchIntent:SEARCH_SRP,"
             f"queryParameters:List((key:geoUrn,value:List({','.join(regions)})),"
             f"(key:resultType,value:List(PEOPLE))),"
             f"includeFiltersInResponse:false))&=&queryId=voyagerSearchDashClusters"
             f".b0928897b71bd00a5a7291755dcd64f0"),
            headers={"accept": "application/vnd.linkedin.normalized+json+2.1"},
        )
        self.logger.debug(res.text)
        data = json.loads(res.text)
        elements = data.get("included", [])
        self.logger.debug(f"Profile urns: {elements}")
        # Take at most the first 10 entries; slicing avoids an IndexError on short pages
        new_elements = [el["entityUrn"] for el in elements[:10]]
        results.extend(self._get_people_by_urns(urns=new_elements))
        # break the loop if we're done searching
        # NOTE: we could also check for the `total` returned in the response.
        # This is in data["data"]["paging"]["total"]
        if (
            (-1 < limit <= len(results))  # if our results exceed set limit
            or len(results) / count >= Linkedin._MAX_REPEATED_REQUESTS
        ) or len(new_elements) == 0:
            break
        self.logger.debug(f"results grew to {len(results)}")
    return results

def _get_people_by_urns(self, urns: list[str]) -> list[dict]:
    """Get profiles info by urns."""
    profiles = []
    for urn in urns:
        # The profile id is the last segment of the URN
        clear_urn = urn.split(":")[-1]
        profiles.append(self.get_profile(urn_id=clear_urn))
    return profiles
URL to fetch profiles (always returns 400):
https://www.linkedin.com/voyager/api/graphql?variables=(lazyLoadedActionsUrns:List(urn:li:fsd_lazyLoadedActions: (urn:li:fsd_profileActions:(ACoAAA6ZpN0B-fPBL3atd5cCsIS9cl7w3zXLylw,SEARCH,EMPTY_CONTEXT_ENTITY_URN),PEOPLE,SEARCH_SRP),urn:li:fsd_lazyLoadedActions: (urn:li:fsd_profileActions:(ACoAAAAJcNcBZWx8gvYiUs_1cLtFiwXhXoNQihc,SEARCH,EMPTY_CONTEXT_ENTITY_URN),PEOPLE,SEARCH_SRP),urn:li:fsd_lazyLoadedActions: (urn:li:fsd_profileActions:(ACoAAAD-cOsB2wB0EldN_R22uvya2ZcYuefBKPI,SEARCH,EMPTY_CONTEXT_ENTITY_URN),PEOPLE,SEARCH_SRP),urn:li:fsd_lazyLoadedActions: (urn:li:fsd_profileActions:(ACoAAADEXysBWdPqwfO-p8MyOQOwaWMB2qO0Umg,SEARCH,EMPTY_CONTEXT_ENTITY_URN),PEOPLE,SEARCH_SRP),urn:li:fsd_lazyLoadedActions:(urn:li:fsd_profileActions:(ACoAAAO9jNABuhihN_wVSgFGgDry9xrGYM-cmzU,SEARCH,EMPTY_CONTEXT_ENTITY_URN),PEOPLE,SEARCH_SRP),urn:li:fsd_lazyLoadedActions: (urn:li:fsd_profileActions:(ACoAAAfUQGcBG3VTivwWqKm9Gw5g8F3Rt8gUwQ8,SEARCH,EMPTY_CONTEXT_ENTITY_URN),PEOPLE,SEARCH_SRP),urn:li:fsd_lazyLoadedActions:(urn:li:fsd_profileActions:(ACoAABdSlasBRb9Dp9rwdkpKS3_atJQPLkAt0jY,SEARCH,EMPTY_CONTEXT_ENTITY_URN),PEOPLE,SEARCH_SRP),urn:li:fsd_lazyLoadedActions: (urn:li:fsd_profileActions:(ACoAABfd6ZoBNCHS45DdfDVHMABssw9S57AH4-Y,SEARCH,EMPTY_CONTEXT_ENTITY_URN),PEOPLE,SEARCH_SRP),urn:li:fsd_lazyLoadedActions: (urn:li:fsd_profileActions:(ACoAACKT0KABrXki4zf6VnGenRUxSBmG-udwtag,SEARCH,EMPTY_CONTEXT_ENTITY_URN),PEOPLE,SEARCH_SRP),urn:li:fsd_lazyLoadedActions: (urn:li:fsd_profileActions:(ACoAACWuoz0BNF2Tcij9PyIymEc65yt_mlrzAfk,SEARCH,EMPTY_CONTEXT_ENTITY_URN),PEOPLE,SEARCH_SRP))) &=&queryId=voyagerSearchDashLazyLoadedActions.9efa2f2f5bd10c3bbbbab9885c3c0a60
Has someone found a solution?
No luck on my end. While I was testing it, the LinkedIn account I was using became restricted, so I just went ahead and wrote a one-time script for what I needed to do.
I found these from LinkedIn:
https://github.com/linkedin-developers/linkedin-api-python-client/blob/main/linkedin_api/clients/restli/utils/encoder.py
https://github.com/linkedin-developers/linkedin-api-python-client/blob/main/linkedin_api/clients/restli/utils/decoder.py
They should help with formatting URL params and with understanding what each request is doing.
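The `variables=(...)` strings in the requests posted earlier in this thread use a parenthesised `key:value` layout with `List(...)` for arrays. A minimal sketch of building such a string from Python data, written from the shape of those URLs rather than from LinkedIn's encoder; `encode_variables` is a hypothetical name and it does no escaping of reserved characters:

```python
def encode_variables(value):
    """Serialise dicts as (k:v,...), lists as List(...), booleans as true/false."""
    if isinstance(value, dict):
        inner = ",".join(f"{k}:{encode_variables(v)}" for k, v in value.items())
        return f"({inner})"
    if isinstance(value, (list, tuple)):
        return "List(" + ",".join(encode_variables(v) for v in value) + ")"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


# Same parameters as the company employee search posted above, with offset 0.
variables = {
    "start": 0,
    "origin": "COMPANY_PAGE_CANNED_SEARCH",
    "query": {
        "flagshipSearchIntent": "SEARCH_SRP",
        "queryParameters": [
            {"key": "currentCompany", "value": ["1441"]},
            {"key": "resultType", "value": ["PEOPLE"]},
        ],
        "includeFiltersInResponse": False,
    },
}
print(encode_variables(variables))
```

Building the string from a dict makes it harder to lose a closing parenthesis than hand-editing the f-string, which is an easy mistake with this much nesting.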
Same here, it's not working: it always returns an empty list!
Any news on the LinkedIn search? It's a very important feature. Thanks for the contribution.
Still down unfortunately :/
I created a draft PR with the changes suggested by @17314642 and @Timur-Gizatullin + a few modifications.
The search_people and search_companies endpoints work for me with these changes and the parameters of my use case, but I haven't tested all the other combinations.
Feel free to add any improvements or suggest changes! I might take another look at it if I get some time and try to do a cleaner fix, if there is one.
Please push it soon
@17314642 Is this "get" still working? I get the following error when trying to run it:
PS C:\LinkedIn\linkedin-api> python SearchID.py
Traceback (most recent call last):
  File "C:\LinkedIn\linkedin-api\SearchID.py", line 41, in <module>
    company = api.get_company("current_cia")
  File "C:\LinkedIn\linkedin-api\linkedin_api\linkedin.py", line 975, in get_company
    self.logger.info("request failed: {}".format(data["message"]))
KeyError: 'message'
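The `KeyError` here is raised by the logging line itself, which assumes a failed response always carries a `message` key. A minimal sketch of a lookup that avoids this, assuming `data` is the decoded JSON body; `describe_failure` is a hypothetical helper, not something linkedin_api provides:

```python
def describe_failure(data):
    """Build a log message from an error payload without assuming its keys."""
    # Fall back to the whole payload when no "message" field is present.
    detail = data.get("message") if isinstance(data, dict) else None
    return "request failed: {}".format(detail if detail is not None else data)


print(describe_failure({"message": "not found"}))
print(describe_failure({"status": 403}))
```

Logging the whole payload in the fallback case also shows what the server actually returned, which is the useful part when debugging this kind of failure.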
Hey everyone.
Can y'all try version 2.1.1 and let me know if it fixes any issues?
The output from the new endpoint is a little confusing, anyone know how to make sense of it? seems like it's returning multiple attributes that come together to make a single profile on the website.
Maybe someone else here will find this information useful, so I'll just leave it here.
import json


def walk_through_data(obj, val_dict):
    # Replace every "*key": urn reference with "key": the matching object from val_dict
    if isinstance(obj, dict) and obj:
        keys_to_remove = [k for k in obj if k[0] == '*']
        for k in keys_to_remove:
            old_val = obj.pop(k)
            new_val = [val for val in val_dict if val.get('entityUrn') == old_val]
            if new_val:
                obj[k[1:]] = new_val[0]
                # val_dict.remove(new_val[0])
            else:
                print(f'Could not find: {old_val}')
        for v in obj.values():
            walk_through_data(v, val_dict)
    elif isinstance(obj, list) and obj:
        for elem in obj:
            walk_through_data(elem, val_dict)


with open('./request_sessions/test_response_posts_all_old_voyager5(no_headers)0.json', "r", encoding='utf-8') as json_file:
    response = json.load(json_file)

feed_key = 'feedDashProfileUpdatesByMemberShareFeed'  # What type of data was received in the response
initial_type = 'com.linkedin.voyager.dash.feed.Update'  # Find what type of object the root data has
included_data = response['included']
response_data = response['data']['data'][feed_key]['*elements']
# Root objects: included entries of the expected type that the feed lists by URN
initial_data = [p for p in included_data if p['$type'] == initial_type and p['entityUrn'] in response_data]
for d in initial_data:
    walk_through_data(d, included_data)

# Rebuild the response with the references resolved in place
response.pop('included')
response.pop('meta')
response['data'] = response['data']['data']
response['data'][feed_key]['elements'] = initial_data
with open('./request_sessions/test_response_posts_all_COMBINE.json.json', "w", encoding='utf-8') as json_file:
    json.dump(response, json_file, indent=4)
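To see what the function does, it helps to run the same idea on a tiny payload. In these normalized responses, keys starting with `*` hold URNs that point at objects in the `included` list. The sketch below uses made-up data and a simplified resolver; `resolve_refs` is a hypothetical name, it handles single URNs only (not lists of URNs), and it has no protection against circular references:

```python
def resolve_refs(obj, included):
    """Replace each "*key": urn pair with "key": the matching included object."""
    if isinstance(obj, dict):
        for key in [k for k in obj if k.startswith("*")]:
            urn = obj.pop(key)
            match = next((i for i in included if i.get("entityUrn") == urn), None)
            if match is not None:
                obj[key[1:]] = match
        for value in obj.values():
            resolve_refs(value, included)
    elif isinstance(obj, list):
        for item in obj:
            resolve_refs(item, included)
    return obj


# Made-up sample shaped like a normalized response, not real LinkedIn output.
included = [
    {"entityUrn": "urn:li:example_actor:1", "name": "Ada"},
    {"entityUrn": "urn:li:example_update:9", "*actor": "urn:li:example_actor:1"},
]
update = resolve_refs(dict(included[1]), included)
print(update["actor"]["name"])
```

After resolution the update carries the actor object inline, which is the "multiple attributes that come together to make a single profile" effect described in the question.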
When I use search_people I always get a 403 response. Is somebody else experiencing the same issue?
PS: My cookie session is working properly.