JoMingyu / google-play-scraper

Google play scraper for Python inspired by <facundoolano/google-play-scraper>
MIT License

[BUG] ReviewId now coming in two different formats #131

Open bromero-aviv opened 2 years ago

bromero-aviv commented 2 years ago

google_play_scraper.VERSION 1.1.0

Describe the bug

For the past two days I have been getting IDs in two different formats for the same reviews: some in the format gp:AOqpTOEo4G_OphpsFbn5WQKTFrGDKK... and some in the format 36ff9405-0c1e-4d59-b213-2000afeb87f9.

Code

Iterate over different languages and countries:

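    from google_play_scraper import Sort, reviews

    # reviews() returns a (result, continuation_token) tuple
    results, _ = reviews(
        'com.android.chrome',
        lang=language,     # defaults to 'en'
        country=country,   # defaults to 'us'
        sort=Sort.NEWEST,  # defaults to Sort.MOST_RELEVANT
        count=25,
    )
    print([r['reviewId'] for r in results[:10]])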

Expected behavior

0 gp:AOqpTOHEb48yO6K3CTJie843pxccgKPgwGa5Zjc94qY... ... NaT
1 gp:AOqpTOEQbgjHZe-gfArY-rPH8rUSVU1vz_rKDa6YXlL... ... NaT
2 gp:AOqpTOEc4kPaXyBH5oLzKIZeJyKH4iH-mbwHcytF1EX... ... NaT
3 gp:AOqpTOE0jzhfiutRcSK4FJihd_6cRW6yQ4g54TJHJ6m... ... NaT
4 gp:AOqpTOGEyN0KxOJ7Ec49yslGIep4WCeopmkafDR2KoA... ... NaT
5 gp:AOqpTOHhZTiolkWtBRUP1wg6MaKM3XSXiJB8L7So2wO... ... NaT
6 gp:AOqpTOGpvX648wHo_KrfN_8GqpWVPqd9TRXuph33LFd... ... NaT
7 gp:AOqpTOH2M23hl33ZokSEoySdvpBAyAXjTum-dFrWRca... ...
8 gp:AOqpTOGm3y5e-BSRZgN6Egnc8lQ42n61pQqKvDe82GZ... ... NaT
9 gp:AOqpTOHwOAVcGrgaqvlgG6BneaggZsok8LjdF8hIu5o... ... NaT

Additional context

I upgraded to 1.1.0 10 days ago, but this only started happening yesterday.
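Since the same review can apparently arrive under two different IDs, deduplicating on reviewId alone may double-count. Below is a minimal sketch of a fallback that keys on other fields of the review dict instead. It assumes that userName, at and content stay identical across both ID formats, which nothing in this thread confirms. The helper names `review_key` and `dedupe_reviews` and the sample data are made up for illustration.

```python
from datetime import datetime


def review_key(review):
    """Composite key that ignores reviewId (assumed stable across ID formats)."""
    return (review["userName"], review["at"], review["content"])


def dedupe_reviews(review_list):
    """Keep the first occurrence of each review, judged by review_key."""
    seen = set()
    unique = []
    for review in review_list:
        key = review_key(review)
        if key not in seen:
            seen.add(key)
            unique.append(review)
    return unique


# Made-up sample: one review seen under an old-style and a new-style ID,
# plus one unrelated review.
posted = datetime(2022, 1, 1, 12, 0)
sample = [
    {"reviewId": "gp:AOqp-hypothetical-old-id", "userName": "A",
     "at": posted, "content": "Great browser"},
    {"reviewId": "00000000-0000-4000-8000-000000000000", "userName": "A",
     "at": posted, "content": "Great browser"},
    {"reviewId": "11111111-1111-4111-8111-111111111111", "userName": "B",
     "at": posted, "content": "Crashes a lot"},
]

unique = dedupe_reviews(sample)
print(len(unique))
```

Two users with the same display name posting identical text at the same second would collide under this key, so it is a stopgap rather than a real identifier.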

bromero-aviv commented 2 years ago

Even easier to test: run the project's own test_reviews.py unit test and it will fail at:

    self.assertTrue(
        r["reviewId"].startswith("gp:AOqp")
        or r["reviewId"].startswith("lg:AOqp")
    )
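For anyone who needs a check that tolerates both formats in the meantime, here is a rough sketch. It is not the maintainers' fix; `is_uuid` and `is_known_review_id` are hypothetical helper names, and it assumes the new IDs are always canonical hyphenated UUID strings like the one reported above.

```python
import uuid

# Prefixes checked by the existing test assertion.
LEGACY_PREFIXES = ("gp:AOqp", "lg:AOqp")


def is_uuid(value):
    """True if value is a canonical, hyphenated UUID string."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


def is_known_review_id(review_id):
    """Accept either the legacy prefixed format or the new UUID format."""
    return review_id.startswith(LEGACY_PREFIXES) or is_uuid(review_id)


# The UUID sample is from this issue; the gp: sample is the truncated
# prefix quoted in the issue, which is enough for a prefix check.
print(is_known_review_id("36ff9405-0c1e-4d59-b213-2000afeb87f9"))
print(is_known_review_id("gp:AOqpTOEo4G_OphpsFbn5WQKTFrGDKK"))
print(is_known_review_id("not-an-id"))
```

In a test, the assertion could then become `self.assertTrue(is_known_review_id(r["reviewId"]))`.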

erengul94 commented 2 years ago

I am facing this problem as well. Just a note: at first there were two different formats, as mentioned above. When I ran it today, every ID came back in the new format, e.g. "36ff9405-0c1e-4d59-b213-2000afeb87f9".

erengul94 commented 2 years ago

By the way, the new format works when replying to a review, so that is some evidence that the new format is valid as well.

kluhan commented 2 years ago

Does anyone have an idea how to derive the new ID from the old one? It would be nice if it were just a hash function, but so far I haven't found anything that fits.
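One small data point on the hash question: the version field of the sample ID from this issue can be read with Python's standard uuid module. This is only a sketch based on that single sample, and it assumes Google sets the version bits according to RFC 4122, which is not confirmed anywhere in this thread.

```python
import uuid

# The new-format sample ID quoted in this issue.
sample = uuid.UUID("36ff9405-0c1e-4d59-b213-2000afeb87f9")

print(sample.version)                   # prints 4
print(sample.variant == uuid.RFC_4122)  # prints True
```

Version 4 is the randomly generated kind, whereas name-based (hashed) UUIDs carry version 3 (MD5) or 5 (SHA-1). If the bits mean what the RFC says, the new ID is probably not a standard UUID hash of the old one, though a custom scheme that merely stamps a "4" into the output cannot be ruled out from one sample.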