serpapi / public-roadmap

Public Roadmap for SerpApi, LLC (https://serpapi.com)

[Google Search API] Scrape `knowledge_graph` entity #926

Closed marm123 closed 4 months ago

marm123 commented 1 year ago

I'm unsure if this information is accessible via regular Google Search, but one of our customers requested that we scrape the entity for `knowledge_graph`, similar to how it's presented in the Google API.

Links: Sample Google API request | Knowledge Graph entities | Google Search knowledge graph 1 | Google Search knowledge graph 2 | Intercom

johnstetic commented 1 year ago

This would be beneficial information. One of the compelling aspects of the Knowledge Graph is that it carries a global taxonomy for things. All the details that SerpApi provides are great, but not having information on the group a thing belongs to seems like missing data.

schaferyan commented 1 year ago

I'm bumping this up to next, as the customer followed up and is looking for a definitive answer as to whether or not we can scrape this.

aciddjus commented 1 year ago

Unfortunately, there is no way for us to know the exact schema structure of the Google Knowledge Graph. That data is not provided on the page.

We could extract the following data. It is not the exact schema found in Google's docs, and it is only available for a few parts of the knowledge graph.


kc:/people/person

kc:/film/film

kc:/music/artist

kc:/business/business_operation kc:/organization/organization
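For anyone consuming these values downstream, here is a minimal sketch of splitting the identifiers listed above into their two path segments. It assumes every value has the shape `kc:/<domain>/<type>`, which is only inferred from the four examples in this comment; `KcType`, `parse_kc`, and `parse_kc_list` are hypothetical helper names, not part of SerpApi.

```python
from typing import List, NamedTuple, Optional


class KcType(NamedTuple):
    """One identifier, e.g. kc:/people/person -> domain='people', name='person'."""
    domain: str
    name: str


def parse_kc(value: str) -> Optional[KcType]:
    """Split a single 'kc:/<domain>/<type>' string; return None if the shape differs.

    The two-segment shape is an assumption based on the examples in this thread.
    """
    prefix = "kc:/"
    if not value.startswith(prefix):
        return None
    parts = value[len(prefix):].split("/")
    if len(parts) != 2 or not all(parts):
        return None
    return KcType(domain=parts[0], name=parts[1])


def parse_kc_list(raw: str) -> List[KcType]:
    """Parse a whitespace-separated run of identifiers, skipping any that do not match."""
    parsed = (parse_kc(token) for token in raw.split())
    return [item for item in parsed if item is not None]
```

Returning `None` for anything unexpected, rather than raising, seems like the safer choice here, since the data is only present for some parts of the knowledge graph.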

@johnstetic Let us know if this is something that could be useful for you.

johnstetic commented 1 year ago

@aciddjus, this information may be helpful. Though if it's not available for all parts of the KG, it creates some added complexity on our side. We are doing some further investigation on what we need vs. what is nice to have. I'll come back to you when I have a clearer view. I really appreciate you investigating this; it's really helpful.

johnstetic commented 10 months ago

@aciddjus, after reviewing a few options, this information could be really helpful. We are doing some hacky regex parsing of the current "type" field right now, and it's only somewhat reliable. The additional details would be very helpful.

btaunt commented 4 months ago

This feature has been added.

SerpApi now parses the `entity_type` within the `knowledge_graph` if available.

Example of search for "Messi":

[Screenshot 2024-03-14 at 12:53:55 PM]

Other Playground examples: Example 1 | Example 2 | Example 3
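Since the field is only present "if available", client code probably wants to read it defensively. The sketch below assumes the search response has already been decoded from JSON into a Python dict and that `entity_type` sits directly under `knowledge_graph` as a string. The helper name `get_entity_type` and the sample values are hypothetical and only illustrate the lookup; the real shape of the value is not shown in this thread.

```python
from typing import Any, Dict, Optional


def get_entity_type(response: Dict[str, Any]) -> Optional[str]:
    """Return knowledge_graph.entity_type, or None when it is absent or empty."""
    knowledge_graph = response.get("knowledge_graph")
    if not isinstance(knowledge_graph, dict):
        # No knowledge graph block was returned for this search.
        return None
    value = knowledge_graph.get("entity_type")
    # Treat missing, non-string, and empty values the same way.
    return value if isinstance(value, str) and value else None


# Made-up sample responses; the values are placeholders, not real API output.
with_type = {"knowledge_graph": {"title": "Messi", "entity_type": "placeholder"}}
without_type = {"knowledge_graph": {"title": "Messi"}}

print(get_entity_type(with_type))
print(get_entity_type(without_type))
```

Callers that previously parsed the "type" field could keep that path as a fallback for results where this returns `None`.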