konovalov-maksim / play_market_parser

Google Play data parser: detailed apps info scraping, apps positions detection, searching suggestions collecting
MIT License

Google Play Scraper

Read this in other languages: English, Русский.

Google Play Scraper is a tool for mining data from Google Play. The collected information can be used for analytics, for shaping an app promotion strategy, or for competitor analysis.

Download release (v1.3.2): GooglePlayScraper-1.3.2.exe + JRE (36.05 MB) | GooglePlayScraper-1.3.2.jar (11.61 MB)

Google Play Scraper has 4 modes: search suggestions collection, apps positions check, apps searching, and detailed info collection.

Features:


Modes

Search suggestions collection

This mode collects suggestions (tips) from the Google Play search box. As input, you provide a list of queries. The collected tips can be used to analyze user demand or to optimize the text of an app's page.


The mode can collect not only the 5 most popular tips for each query, but also all frequently used continuations of the query. If 5 tips are found for an original query, subqueries are generated to find the remaining tips. These subqueries iterate over all letters of the alphabet. The alphabet is detected automatically from the input query, or it can be set manually. The max parsing depth (i.e. the max number of characters that can be added to the original query) is set in Preferences. Set it to 1 if you want to collect only the tips returned directly for the original query (no more than 5 for each).
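The expansion described above can be sketched roughly as follows. This is an illustration only, not the tool's actual code: `collect_suggestions`, `fake_fetch`, and the sample word list are hypothetical, letters are appended directly to the query, and depth 1 is taken to mean "original query only", as the paragraph above states. A real run would replace `fake_fetch` with a call to Google Play.

```python
from collections import deque

MAX_TIPS = 5  # the text says at most 5 tips come back per query


def collect_suggestions(query, fetch, alphabet, max_depth):
    """Collect tips for `query`, expanding letter by letter when a full page is returned.

    Level 1 is the original query; each further level appends one character
    (an assumption about how depth is counted).
    """
    found, seen = [], set()
    queue = deque([(query, 1)])
    while queue:
        current, level = queue.popleft()
        tips = fetch(current)
        for tip in tips:
            if tip not in seen:
                seen.add(tip)
                found.append(tip)
        # A full page of tips hints that more continuations may exist.
        if len(tips) >= MAX_TIPS and level < max_depth:
            for letter in alphabet:
                queue.append((current + letter, level + 1))
    return found


ALPHABET = "abcdefghijklmnopqrstuvwxyz"
SAMPLE_WORDS = ["mail", "maps", "math", "mario", "magic", "market", "mask"]


def fake_fetch(query):
    """Hypothetical stand-in for the search box: first 5 sample words with this prefix."""
    return [w for w in SAMPLE_WORDS if w.startswith(query)][:MAX_TIPS]


if __name__ == "__main__":
    print(collect_suggestions("ma", fake_fetch, ALPHABET, 1))
    print(collect_suggestions("ma", fake_fetch, ALPHABET, 2))
```

With depth 1 only the five tips of the original query are kept; with depth 2 the subqueries "mar" and "mas" surface the two tips that did not fit on the first page.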

The algorithm does not collect tips generated for a corrected query. For example, for the query "facebook ma", the tip "facebook message app" won't be collected because it corresponds to the corrected query, not the original one.
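One simple way to approximate this rule is a prefix check: keep a tip only if it starts with the original query. The source does not say how the tool detects corrected queries, so the sketch below is an assumption, and the function name `drop_corrected` is hypothetical.

```python
def drop_corrected(query, suggestions):
    """Keep only tips that begin with the original query (case-insensitive).

    Assumption: a tip that does not extend the query as typed was produced
    for a corrected query and should be skipped.
    """
    prefix = query.casefold()
    return [s for s in suggestions if s.casefold().startswith(prefix)]


if __name__ == "__main__":
    tips = ["facebook marketplace", "facebook message app", "Facebook Manager"]
    print(drop_corrected("facebook ma", tips))
```

In this example "facebook message app" is dropped because it does not continue "facebook ma".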


Apps positions check

This mode checks apps' positions on the Google Play search results page. The collected data can be used to track how an app's visibility changes over time.

Sometimes Google Play "raises" or "lowers" an app instead of showing it at its actual position. Therefore, to get the real position, it is recommended to set the number of checks for each query to 3-5.
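The source does not say how the results of repeated checks are combined. One plausible approach, shown here purely as an assumption, is to take the position seen most often and break ties in favor of the smaller number. The function name `settle_position` is hypothetical.

```python
from collections import Counter


def settle_position(observed):
    """Pick the most frequently observed position from several checks.

    Ties go to the smaller (better) position; this tie-break is an
    arbitrary choice made for the sketch.
    """
    if not observed:
        raise ValueError("at least one check is required")
    counts = Counter(observed)
    return min(counts, key=lambda pos: (-counts[pos], pos))


if __name__ == "__main__":
    # Five checks for one query: the app was moved twice.
    print(settle_position([7, 7, 3, 7, 12]))
```

With 3-5 checks, a single "raised" or "lowered" result is outvoted by the checks that show the usual position.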

The exported CSV file with the results can be imported to re-check the positions later (e.g. the next day). In this case, if you select the "Include previous results" checkbox, the new results are written to a new column to the right of the old ones, which makes the data easier to analyze.
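The effect of "Include previous results" can be illustrated with a small sketch. The real export layout is not described in the source, so the layout here (query in the first column, one column per check) is assumed, and `append_results` and the date-style column names are hypothetical.

```python
import csv
import io


def append_results(previous_csv, new_positions, column_name):
    """Return the previous CSV text with one extra column of new positions.

    Assumed layout: a header row, the query in the first column, and one
    column per earlier check. Queries without a new result get an empty cell.
    """
    rows = list(csv.reader(io.StringIO(previous_csv)))
    header, body = rows[0], rows[1:]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header + [column_name])
    for row in body:
        writer.writerow(row + [new_positions.get(row[0], "")])
    return out.getvalue()


if __name__ == "__main__":
    previous = "query,2024-01-01\nphoto editor,4\nmail app,11\n"
    print(append_results(previous, {"photo editor": 3, "mail app": 12}, "2024-01-02"))
```

Each re-check adds one column on the right, so a row reads left to right as the history of one query.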


Apps searching

This mode collects links and basic info for the apps found by the specified queries. The following app info is collected:


Detailed info collection

This mode extracts detailed app info from a list of URLs. The following app info is collected:

Preferences

For each mode you can set the language and country, the number of threads, the connection timeout, a proxy server, and the HTTP headers User-Agent and Accept-Language.
If no country is specified, Google Play detects it from the IP address (or the proxy server's IP address).
If no language is specified, Google Play detects it from the Accept-Language HTTP header.
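As an illustration of how these preferences could map onto a request, the sketch below builds an app page URL and headers. It assumes language and country are sent as the `hl` and `gl` query parameters that Google Play URLs accept; the source does not show how the tool builds its requests. `build_request` and the app ID `com.example.app` are hypothetical.

```python
from urllib.parse import urlencode


def build_request(app_id, language=None, country=None,
                  user_agent=None, accept_language=None):
    """Return (url, headers) for an app page, omitting any unset preference.

    Leaving out `gl` or `hl` lets Google Play fall back to IP-based country
    detection and the Accept-Language header, as described above.
    """
    params = {"id": app_id}
    if language:
        params["hl"] = language  # interface language
    if country:
        params["gl"] = country   # store country
    url = "https://play.google.com/store/apps/details?" + urlencode(params)

    headers = {}
    if user_agent:
        headers["User-Agent"] = user_agent
    if accept_language:
        headers["Accept-Language"] = accept_language
    return url, headers


if __name__ == "__main__":
    print(build_request("com.example.app", language="en", country="US"))
    print(build_request("com.example.app", accept_language="ru-RU"))
```

The second call sets no language or country, so only the Accept-Language header tells Google Play which language to use.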
