BaseMax / GooglePlayWebServiceAPI

Tiny PHP-based script to crawl information about a specific application in the Google Play store.
MIT License

add lang & loc params to parseApplication and parsePerms (see #8) #9

Closed IzzySoft closed 3 years ago

IzzySoft commented 3 years ago

Adding the parameters is easy. Dealing with the results seems a different thing:

But your "game matching" (type) will fail for everything but English, which results in "no games, all apps". This might be alleviated by building that array not on the "visible description" but on basename($url), e.g. /store/apps/category/PRODUCTIVITY => 'PRODUCTIVITY', and matching that against $id. That would require redefining private $categories.
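A minimal sketch of the basename() idea, assuming the category link looks like the example above. The helper name `categoryIdFromUrl` is made up for illustration and is not part of the class:

```php
<?php
// Hypothetical helper (not part of the class): derive a language-independent
// category ID from a category link such as /store/apps/category/PRODUCTIVITY.
function categoryIdFromUrl(string $url): string {
    // Drop any query string (e.g. "?hl=de") before taking the last path segment.
    $path = parse_url($url, PHP_URL_PATH) ?: '';
    return basename($path);
}

echo categoryIdFromUrl('/store/apps/category/PRODUCTIVITY'), "\n";
echo categoryIdFromUrl('https://play.google.com/store/apps/category/GAME_PUZZLE?hl=de'), "\n";
```

Since the ID comes from the URL rather than from the visible label, it should stay the same whatever language the page is requested in.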

Please check for yourself and say what you think. I've marked this WIP as at least the "type matching" should be resolved before merging. Further I'm not sure whether those "appended translations" should be dealt with; a

$description = preg_replace('!.*<div jsname="Igi1ac" style="display:none;">(.*)!ims', '$1', $description);

could do that (and leave the string untouched when that DIV is not present).
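To illustrate both cases, here is a small sketch wrapping that call. The function name and the sample strings are made up for the demonstration; they are not real Play Store markup:

```php
<?php
// Hypothetical wrapper around the preg_replace() call quoted above.
function stripBeforeHiddenDiv(string $description): string {
    // With the "s" modifier, ".*" also spans newlines, so everything up to and
    // including the hidden DIV's opening tag is dropped and the rest is kept.
    return preg_replace(
        '!.*<div jsname="Igi1ac" style="display:none;">(.*)!ims',
        '$1',
        $description
    );
}

// Made-up sample strings for illustration only.
$withDiv    = "first part\n<div jsname=\"Igi1ac\" style=\"display:none;\">second part";
$withoutDiv = "just one part";

echo stripBeforeHiddenDiv($withDiv), "\n";    // second part
echo stripBeforeHiddenDiv($withoutDiv), "\n"; // just one part
```

Because the leading `.*` is greedy, only the text after the last such DIV would survive if it appeared more than once.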

closes #8

IzzySoft commented 3 years ago

OK, matching on ID is easy – no more looping, going by prefix:

Now for the reason I never implemented multilang in my class:

    [lastUpdated] => 
    [versionName] => 
    [minimumSDKVersion] => 
    [installs] => 
    [age] => 
    [rating] => 3,0
    [votes] => 568
    [price] => 0
    [size] => 

Missing values are preg_matched by their English terms – so they are no longer matching when using a different language. Let's see if I can get that tackled, too…
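The effect can be reproduced with a toy example. The markup below is invented for illustration (real Play Store pages are structured differently), and `lastUpdated` is a hypothetical extractor keyed on the English label:

```php
<?php
// Made-up markup for illustration only; not the real page structure.
$english = '<div>Updated</div><span>May 1, 2020</span>';
$german  = '<div>Aktualisiert</div><span>1. Mai 2020</span>';

// Hypothetical extractor that looks for the English label "Updated".
function lastUpdated(string $html): string {
    if (preg_match('!<div>Updated</div><span>(.*?)</span>!i', $html, $match)) {
        return $match[1];
    }
    return ''; // label not found, so the field stays empty
}

var_dump(lastUpdated($english)); // value found
var_dump(lastUpdated($german));  // empty string: the English label is absent
```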

IzzySoft commented 3 years ago

Well, it won't win any beauty prize – but as it's mostly "tech values" (not counting the "Varies with device" on VersionName and Size with some apps), it might be acceptable (I hope)?

Still needs some testing. I'd welcome your help with that, if you have some examples in mind:

IzzySoft commented 3 years ago

OK, tested a bit more – several additional apps as well as languages (de, en, ru, es). Looks good so far. Open points:

So you've got the last word. I'll remove the WIP now (just noticed GitHub handles that differently: at GitLab, a "WIP:" prefix automatically disables the merge button, here I'll need "draft" – next time). Feel free to merge when you think it's ready – or have me "clean" the description first.

BaseMax commented 3 years ago

You excited me. Thanks, that was perfect.

What happened to the categories?

Just need a change:

    public function parseApplication($packageName,$lang='en_US',$loc='US') {

to:

    public function parseApplication($packageName, $lang='en_US', $loc='US') {

IzzySoft commented 3 years ago

> You excited me. Thanks, that was perfect.

Glad you like it! Was trickier than thought (I had totally forgotten the parsing at least partially depends on language-specific terms) :rofl:

> What happened to the categories?

You mean the array at the beginning? No longer needed. If you look at the category IDs: all game cats start with GAME. So why keep a list we might have to check from time to time, if we can simply match by prefix? Actually, that introduced a 3rd type, rarely showing up I guess: family.
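A rough sketch of such prefix matching, not the code from the PR: the function name, the returned type strings, and the exact `FAMILY` prefix are assumptions made for illustration.

```php
<?php
// Hypothetical helper: classify an app by the prefix of its category ID.
// The 'game' / 'family' / 'app' labels and the FAMILY prefix are assumptions.
function appTypeFromCategory(string $categoryId): string {
    if (strpos($categoryId, 'GAME') === 0) {
        return 'game';
    }
    if (strpos($categoryId, 'FAMILY') === 0) {
        return 'family';
    }
    return 'app'; // everything else is treated as a regular app
}

echo appTypeFromCategory('GAME_PUZZLE'), "\n";  // game
echo appTypeFromCategory('PRODUCTIVITY'), "\n"; // app
```

Compared with a hard-coded category list, this needs no maintenance when new game categories appear, as long as they keep the same prefix.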

One thing I wonder (and need to watch): according to our code, each app has exactly 1 category. Is that true? Can't it have multiple categories? I've never kept close watch on that. At F-Droid, an app can have multiple categories.

PS: in case you wonder about those 3 sneaked-in commits on permission stuff: the class is now integrated with my framework, replacing my (now broken) permission parser. So it is mass-tested now – which brought up the two "exceptional behaviors" (e.g. an "other others group", seen in 1 out of 100 apps; another one-timer was an "empty permission group entry" right at the beginning). In other words: permission retrieval should be quite stable now.

> Just need a change:

Will do that with the next PR. What was next? Ah, the iteration for summary. Added the spacing to my todo list for now. I'll try to tackle it this week, together with more "mass-testing" :smiley: The first automated run was this morning, with about 300 "random" apps. Nothing suspicious in the logs. Good sign.

IzzySoft commented 3 years ago

PS: What's your stance on "PHPDoc"? I thought to add a small header on top of the methods, like

  /**
   * @method parsePerms
   * @param string packageName    package name of the app to check, e.g. "com.example.app"
   * @param optional string lang  ISO 639-1 language code. Default: "en"
   * @return array perms          with the keys [describing the structure in short]
   */
  public function parsePerms($packageName, $lang='en') {

Always good to document stuff. In my experience this is helpful if you later need to fix up/adjust/add things. Besides, there are also generators for API docs which can utilize that.
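For comparison, phpDocumentor-style docblocks conventionally write parameter names with the `$` prefix, and `@method` is normally used in class-level docblocks to describe magic methods rather than on the method itself. A sketch of the header in that form, with a placeholder class name and a stub body (both assumptions, not the actual class):

```php
<?php
// Placeholder class for illustration; not the real class from the repository.
class PlayStoreParserSketch {
    /**
     * Retrieve the permissions requested by an app.
     *
     * @param string $packageName Package name of the app to check, e.g. "com.example.app"
     * @param string $lang        ISO 639-1 language code. Default: "en"
     * @return array              Permissions found; the exact keys are left open here
     */
    public function parsePerms($packageName, $lang = 'en') {
        return []; // stub: the real method would fetch and parse the page
    }
}

// Doc generators read the same comment that reflection exposes:
$doc = (new ReflectionMethod('PlayStoreParserSketch', 'parsePerms'))->getDocComment();
echo $doc, "\n";
```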

BaseMax commented 3 years ago

> What was next? Ah, the iteration for summary.

Yeah, thank you so much.

> What's your stance on "PHPDoc"?

Yes, this is a good thing. @IzzySoft

Comments will help us a lot in the future. :+1: In particular: arguments and return type

IzzySoft commented 3 years ago

Will be done. Let's move that to its own PR; I'll start that straight away (so you can follow progress) but set it to draft (so you don't merge it before it's ready :smile:)