outscraper / outscraper-python

The library provides convenient access to the Outscraper API from applications written in the Python language. Allows using Outscraper's services from your code.
https://outscraper.com
MIT License

Google Maps client fails to fetch results #18

Closed Almaroo closed 1 day ago

Almaroo commented 2 days ago

Affected versions: 5.1.0, 5.3.3

TL;DR: Has the Google Maps response schema changed, or is it broken?

Description: Around yesterday I noticed that the client started malfunctioning. I'm calling my function in the following manner:


scraped_fields = [
    "query",
    "place_id",
    "name",
    "latitude",
    "longitude",
    "street",
    "city",
    "postal_code",
    "phone",
    "email",
    "working_hours",
    "business_status",
]

return self.client.google_maps_search(
    query=query,
    language="PL",
    limit=limit,
    fields=self.scraped_fields,
    async_request=False,
    drop_duplicates=True,
)[0]

Until about yesterday, calling this with the query bar, Kraków, Polska returned something like this:

[
  {
    "query": "bar, Kraków, Polska",
    "name": "Bar A", ...
  },
  {
    "query": "bar, Kraków, Polska",
    "name": "Bar B", ...
  },
]

However, when I ran the client yesterday, I was a bit surprised to see that the result of this call now looks like this:

['city', 'name', 'latitude', 'street', 'business_status', 'place_id', 'phone', 'query', 'working_hours', 'longitude', 'postal_code', 'street', 'latitude', 'phone', 'longitude', 'name', 'query', 'business_status', 'working_hours', 'city', 'place_id', 'postal_code', 'street', 'latitude', 'phone', 'longitude', 'name', 'query', 'business_status', 'working_hours', 'city', 'place_id', 'postal_code', 'city', 'name', 'latitude', 'street', 'business_status', 'place_id', 'phone', 'query', 'working_hours', 'longitude', 'postal_code', 'city', 'name', 'latitude', 'street', 'business_status', 'place_id', 'phone', 'query', 'working_hours', 'longitude', 'postal_code']

Should you need any further details from me, please let me know. I'll be happy to help.

EDIT: I noticed that the package I was running was two minor versions behind. I bumped it to the latest version, but unfortunately no luck.

Best regards,

vlad-stack commented 1 day ago

[{'street': 'Bożego Ciała 12 ½', 'latitude': 50.051604, 'phone': '+48 513 600 538', 'longitude': 19.9435629, 'name': 'William Rabbit & Co', 'query': 'bar, Kraków, Polska', 'business_status': 'OPERATIONAL', 'working_hours': {'poniedziałek': '18:00-00:00', 'wtorek': '18:00-00:00', 'środa': '18:00-00:00', 'czwartek': '18:00-01:00', 'piątek': '18:00-02:00', 'sobota': '18:00-02:00', 'niedziela': '18:00-00:00'}, 'city': 'Kraków', 'place_id': 'ChIJLz9MRlJbFkcRaMvKYMQJNCc', 'postal_code': '31-059'}]

vlad-stack commented 1 day ago

This is the response I get when I call the endpoint.

vlad-stack commented 1 day ago

from outscraper import ApiClient

YOUR_API_KEY = 'API_KEY'

api_client = ApiClient(api_key=YOUR_API_KEY)

scraped_fields = [
    "query",
    "place_id",
    "name",
    "latitude",
    "longitude",
    "street",
    "city",
    "postal_code",
    "phone",
    "email",
    "working_hours",
    "business_status",
]

r = api_client.google_maps_search(
    query='bar, Kraków, Polska',
    language='PL',
    limit=500,
    fields=scraped_fields,
    async_request=False,
    drop_duplicates=True,
)
print(r)

vlad-stack commented 1 day ago

I cannot reproduce the problem. Please give it another try.

Almaroo commented 1 day ago

OK, I see: when I omit the [0] indexer, it does indeed look fine. I'm unsure why it was there in the first place in our codebase, but it certainly worked before :D I'll double-check with my team. Thanks a lot.

Almaroo commented 1 day ago

OK, figured it out: we didn't use drop_duplicates=True from the start, and the result was a nested list, which is why [0] was used. When called with this flag, the response seems to be flattened.
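For anyone landing here with the same symptom, here is a minimal sketch of the two result shapes and what [0] picks out of each. The sample data is hypothetical and stands in for real API responses. The final step, extending a list with a dict, is only a guess at how downstream code could end up with a list of field names. It is not taken from the original codebase.

```python
# Hypothetical sample data; no API call is made here.

# Shape described for calls without drop_duplicates: one inner list per query.
nested = [
    [
        {"query": "bar, Kraków, Polska", "name": "Bar A"},
        {"query": "bar, Kraków, Polska", "name": "Bar B"},
    ],
]

# Shape described for calls with drop_duplicates=True: one combined list.
flat = [
    {"query": "bar, Kraków, Polska", "name": "Bar A"},
    {"query": "bar, Kraków, Polska", "name": "Bar B"},
]

places = nested[0]  # a list of place dicts, as the old code expected
first = flat[0]     # a single place dict, not a list

# Assumed downstream behaviour: iterating a dict yields its keys,
# so extending a list with it collects field names instead of places.
collected = []
collected.extend(first)
print(collected)
```

With the flat shape, [0] returns only the first place, so any code that treats that value as a list of places ends up iterating over its keys.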

Edit:

Just like your docs say: "drop_duplicates (bool): parameter specifies whether the bot will drop the same organizations from different queries. Using the parameter combines results from each query inside one big array."
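If calling code has to cope with both shapes, one option is to normalize the result before using it. The helper below is a sketch under that assumption. flatten_places is a hypothetical name and is not part of the outscraper library, and the sample data is made up for illustration.

```python
from typing import Any


def flatten_places(result: list[Any]) -> list[dict]:
    """Return a flat list of place dicts from either response shape.

    Hypothetical helper, not part of the outscraper library.
    """
    places = []
    for item in result:
        if isinstance(item, dict):
            # Already flat: each item is a place.
            places.append(item)
        elif isinstance(item, list):
            # Nested: each item is the list of places for one query.
            places.extend(p for p in item if isinstance(p, dict))
    return places


# Hypothetical sample data in both shapes.
nested = [[{"name": "Bar A"}, {"name": "Bar B"}]]
flat = [{"name": "Bar A"}, {"name": "Bar B"}]

print(flatten_places(nested))
print(flatten_places(flat))
```

Both calls return the same flat list, so the caller no longer needs a [0] indexer that depends on which flags were passed.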