Closed monk1337 closed 2 years ago
Hi, @monk1337 👋
Can I directly call this URL https://scholar.google.com/scholar?q=related:gemrYG-1WnEJ:scholar.google.com/&scioq=Multi-label+text+classification+with+latent+word-wise+label+information&hl=en&as_sdt=0,21 using serp API?
Yes, you can. You just need to pass the q= URL value to the SerpApi q search parameter. For the URL you provided, the SerpApi q parameter would be: related:gemrYG-1WnEJ:scholar.google.com
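As a minimal sketch of that step, the q value can be pulled out of a Google Scholar URL with the Python standard library. The helper name scholar_url_to_q is made up for illustration; stripping the trailing slash is an assumption chosen to match the value shown above.

```python
from urllib.parse import urlparse, parse_qs


def scholar_url_to_q(url: str) -> str:
    """Return the q value of a Google Scholar URL (hypothetical helper)."""
    query = parse_qs(urlparse(url).query)
    # Assumption: drop the trailing "/" so the value matches the form shown above.
    return query["q"][0].rstrip("/")


url = (
    "https://scholar.google.com/scholar?q=related:gemrYG-1WnEJ:scholar.google.com/"
    "&scioq=Multi-label+text+classification+with+latent+word-wise+label+information"
    "&hl=en&as_sdt=0,21"
)
print(scholar_url_to_q(url))
```

The returned string is what would go into the "q" entry of the params dict.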
You can retrieve data directly from the URL using only CURL and JQ. We have a #AskSerpApi episode that covers this specific question: #AskSerpApi: "How to extract a specific element from the JSON URL?" | CURL + JQ.
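For anyone who prefers to stay in Python rather than CURL, here is a rough sketch of building such a JSON URL. It assumes the https://serpapi.com/search.json endpoint and the same parameter names used in the examples in this thread; build_search_url is a hypothetical helper and "YOUR_API_KEY" is a placeholder.

```python
from urllib.parse import urlencode

# Assumption: SerpApi serves JSON results from this endpoint.
SERPAPI_JSON_ENDPOINT = "https://serpapi.com/search.json"


def build_search_url(q: str, api_key: str, hl: str = "en") -> str:
    """Build a Google Scholar search URL for the JSON endpoint (hypothetical helper)."""
    params = {"engine": "google_scholar", "q": q, "hl": hl, "api_key": api_key}
    # urlencode percent-encodes reserved characters such as ":" in the query value.
    return f"{SERPAPI_JSON_ENDPOINT}?{urlencode(params)}"


search_url = build_search_url("related:gemrYG-1WnEJ:scholar.google.com", "YOUR_API_KEY")
print(search_url)
```

The resulting URL can then be fetched with any HTTP client, including CURL.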
To extract related articles, you need to access ["inline_links"]["related_pages_link"] on each item of ["organic_results"] (which is a list of results).
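Since organic_results is a list, the lookup has to be done per result. A defensive sketch is below; the sample dict is made up to mimic the response shape and is not real API output, and related_links is a hypothetical helper name. It skips results that have no related link instead of raising KeyError.

```python
def related_links(results: dict) -> list:
    """Collect related_pages_link values from a SerpApi-style results dict."""
    links = []
    for result in results.get("organic_results", []):
        # Not every result is guaranteed to carry inline_links, so use .get().
        link = result.get("inline_links", {}).get("related_pages_link")
        if link:
            links.append(link)
    return links


# Made-up sample shaped like the response; only the first entry has a related link.
sample = {
    "organic_results": [
        {
            "title": "A",
            "inline_links": {
                "related_pages_link": "https://scholar.google.com/scholar?q=related:sWzmct-yYzgJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11"
            },
        },
        {"title": "B", "inline_links": {}},
        {"title": "C"},
    ]
}
print(related_links(sample))
```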
Example code to extract related articles from the first page:
from serpapi import GoogleSearch

params = {
    "api_key": "...",
    "engine": "google_scholar",
    "q": "Coffee",
    "hl": "en"
}

search = GoogleSearch(params)
results = search.get_dict()

for result in results["organic_results"]:
    related_articles = result["inline_links"]["related_pages_link"]
    print(related_articles)
Outputs (where q= is a related articles search query that can be passed to the SerpApi q search parameter):
https://scholar.google.com/scholar?q=related:sWzmct-yYzgJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:9WouRiFbIK4J:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:fGeQlvu-2_IJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:-0fOFoq7wJ8J:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:CZSAb_VNDkkJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:Jt15QwxlEw0J:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:31GOrHWBl_AJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:KVT-hW9IrDoJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:Ang0MOfBmAUJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
https://scholar.google.com/scholar?q=related:QwF9cuvhnCoJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11
To access related articles with SerpApi, you can do it like so (keep in mind that this is an example):
from serpapi import GoogleSearch
import re

def get_related_articles_query():
    params = {
        "api_key": "...",
        "engine": "google_scholar",
        "q": "Multi-label text classification",
        "hl": "en"
    }

    search = GoogleSearch(params)
    results = search.get_dict()

    related_articles = []

    for result in results["organic_results"]:
        # https://regex101.com/r/XuEhoh/1
        related_article = re.search(r"q=(.*)\/&scioq", result["inline_links"]["related_pages_link"]).group(1)
        related_articles.append(related_article)

    return related_articles

def get_related_articles_results():
    for related_article in get_related_articles_query():
        params = {
            "api_key": "...",
            "engine": "google_scholar",
            "q": related_article,  # related:sWzmct-yYzgJ:scholar.google.com ...
            "hl": "en"
        }

        search = GoogleSearch(params)
        results = search.get_dict()

        for result in results["organic_results"]:
            print(result.get("title"), result.get("link"), result.get("publication_info", {}).get("summary"), sep="\n")

get_related_articles_results()
Outputs:
Deep learning for extreme multi-label text classification
https://dl.acm.org/doi/abs/10.1145/3077136.3080834
J Liu, WC Chang, Y Wu, Y Yang - … of the 40th international ACM SIGIR …, 2017 - dl.acm.org
...
Code example in the online IDE: https://replit.com/@DimitryZub1/Google-Scholar-SerpApi-API-Extract-Related-Articles
Let me know if it makes sense and if you need additional clarifications 🌼
@monk1337 We've added a new serpapi_related_pages_link dict key to the JSON response.
So now there's no need to use regex to extract the search query:
re.search(r"q=(.*)\/&scioq", result["inline_links"]["related_pages_link"]).group(1)
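A sketch of how the new key might be used in place of the regex. It assumes the serpapi_related_pages_link value is a URL whose query string carries the q parameter; the sample dicts are made up for illustration, and related_query is a hypothetical helper that falls back to the old regex when the new key is missing.

```python
import re
from urllib.parse import urlparse, parse_qs


def related_query(result: dict):
    """Return the related-articles q value for one organic result, or None."""
    inline = result.get("inline_links", {})
    new_link = inline.get("serpapi_related_pages_link")
    if new_link:
        # Assumption: the new key holds a URL with a q parameter in its query string.
        q_values = parse_qs(urlparse(new_link).query).get("q", [])
        if q_values:
            return q_values[0].rstrip("/")
    # Fallback: the regex approach from the earlier example.
    match = re.search(r"q=(.*)/&scioq", inline.get("related_pages_link", ""))
    return match.group(1) if match else None


# Made-up samples (not real API output) covering both code paths.
with_new_key = {
    "inline_links": {
        "serpapi_related_pages_link": "https://serpapi.com/search.json?engine=google_scholar&hl=en&q=related%3AsWzmct-yYzgJ%3Ascholar.google.com%2F"
    }
}
with_old_key = {
    "inline_links": {
        "related_pages_link": "https://scholar.google.com/scholar?q=related:sWzmct-yYzgJ:scholar.google.com/&scioq=Coffee&hl=en&as_sdt=0,11"
    }
}
print(related_query(with_new_key))
print(related_query(with_old_key))
```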
Let me know if you need any additional help 🙂
Closing this as we implemented it. For more: https://serpapi.com/google-scholar-api
I am using SerpApi to fetch Google Scholar papers. There is always a link called "related articles" under each article, but SerpApi doesn't seem to provide any URL for fetching the data behind those links.
Can I directly call this URL
https://scholar.google.com/scholar?q=related:gemrYG-1WnEJ:scholar.google.com/&scioq=Multi-label+text+classification+with+latent+word-wise+label+information&hl=en&as_sdt=0,21
using serp API?