tweaselORG / appstraction

An abstraction layer for common instrumentation functions (e.g. installing and starting apps, setting preferences, etc.) on Android and iOS.
MIT License

Android: Support custom APK bundle formats in `parseAppMeta()` #67

Closed by baltpeter 1 year ago

baltpeter commented 1 year ago

Not terribly important, but since we now support installing custom APK bundle formats (#64), we should also support parsing metadata for them.

baltpeter commented 1 year ago

Turns out this is more important than I thought: Without this, CA (cyanoacrylate) will refuse to start an app analysis on apps in the custom bundle formats.

baltpeter commented 1 year ago

While I'm at it, I also wanted to add support for passing an array of split APKs to parseAppMeta() (so we take AppPath<Platform>). The annoying problem here is figuring out which of the APKs is the base APK.

In CA, we currently assume that we can just take any of the APKs and get the same information for all of them:

https://github.com/tweaselORG/cyanoacrylate/blob/7bf6d54e7624308714c4acee95caac84fdc4953a/src/index.ts#L513-L515

But that isn't true. The base APK has more information than the splits (including information that we do return):

❯ for f in *.apk; do echo "$f:"; aapt dump badging $f | grep -E 'package|native|application-label|version'; echo "\n"; done
config.de.1122090047.de.check24.reisen.apk:
package: name='de.check24.reisen' versionCode='1122090047' versionName='' split='config.de'

config.x86.1122090047.de.check24.reisen.apk:
package: name='de.check24.reisen' versionCode='1122090047' versionName='' split='config.x86'
native-code: 'x86'

config.xxhdpi.1122090047.de.check24.reisen.apk:
package: name='de.check24.reisen' versionCode='1122090047' versionName='' split='config.xxhdpi'

de.check24.reisen.apk:
package: name='de.check24.reisen' versionCode='1122090047' versionName='2022.9.0' compileSdkVersion='31' compileSdkVersionCodename='12'
application-label:'Reisen'

For the bundle formats, that isn't a problem (all of them conveniently call the main APK base.apk). But how do I find the main APK from an array of splits without parsing all of them?
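
In the `aapt dump badging` output above, the one visible difference on the `package:` line is that the splits carry a `split='…'` attribute and the base APK does not. A minimal TypeScript sketch of that distinction follows. `parsePackageLine` and `isBaseApk` are hypothetical helper names, not appstraction functions, and the sketch assumes the badging text for each APK is already available, so it does not avoid running `aapt` on every APK.

```typescript
// Hypothetical helpers for illustration; not part of appstraction's API.

/** Parses the key='value' attributes on the `package:` line of `aapt dump badging` output. */
function parsePackageLine(badging: string): Record<string, string> | undefined {
    const packageLine = /^package:(.*)$/m.exec(badging);
    if (!packageLine) return undefined;

    const attributes: Record<string, string> = {};
    const attributeRegex = /(\w+)='([^']*)'/g;
    let match: RegExpExecArray | null;
    while ((match = attributeRegex.exec(packageLine[1])) !== null) attributes[match[1]] = match[2];
    return attributes;
}

/** Assumption taken from the output above: only the base APK has no `split` attribute. */
function isBaseApk(badging: string): boolean {
    const attributes = parsePackageLine(badging);
    return attributes !== undefined && !('split' in attributes);
}

// The two inputs are copied from the output above.
const splitBadging = "package: name='de.check24.reisen' versionCode='1122090047' versionName='' split='config.de'";
const baseBadging = [
    "package: name='de.check24.reisen' versionCode='1122090047' versionName='2022.9.0' compileSdkVersion='31' compileSdkVersionCodename='12'",
    "application-label:'Reisen'",
].join('\n');

console.log(isBaseApk(splitBadging)); // false
console.log(isBaseApk(baseBadging)); // true
```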

baltpeter commented 1 year ago

Bing Chat suggested:

The base APK should have a classes.dex file that contains the app’s code, while the split APKs should only have resources or manifest file.

Let's test that. Bing Chat was also kind enough to write me this script:

#!/bin/zsh

# Check if a folder is provided as an argument
if [ -z "$1" ]; then
  echo "Please provide a folder name."
  exit 1
fi

# Loop through each app folder
for app in $1/*; do
  # Check if the app folder is a directory
  if [ -d "$app" ]; then
    # Get the bundle ID from the app folder name
    bundle_id=${app##*/}
    # Initialize a flag to indicate if a classes.dex file is found
    found=0
    # Loop through each apk file in the app folder
    for apk in $app/*.apk; do
      # Check if the apk file contains a classes.dex file using unzip -l
      unzip -l $apk classes.dex > /dev/null 2>&1
      # If it does, check the flag and the apk file name
      if [ $? -eq 0 ]; then
        # If the flag is already set, it means there are more than one classes.dex files
        if [ $found -eq 1 ]; then
          echo "Error: More than one classes.dex files found for $bundle_id"
          break
        # If the flag is not set, check if the apk file name matches the bundle ID
        else
          # If it does, set the flag to 1 and continue
          if [ ${apk##*/} = "$bundle_id.apk" ]; then
            found=1
            continue
          # If it does not, it means the classes.dex file is in the wrong apk file
          else
            echo "Error: classes.dex file is not in $bundle_id.apk for $bundle_id"
            break
          fi
        fi
      fi
    done
    # After looping through all apk files, check if the flag is still 0, which means no classes.dex file is found
    if [ $found -eq 0 ]; then
      echo "Error: No classes.dex file found for $bundle_id"
    fi
  fi
done

baltpeter commented 1 year ago

Result of running that against 3313 apps:

Error: More than one classes.dex files found for bofamily.app
Error: More than one classes.dex files found for com.alibaba.aliexpresshd
Error: More than one classes.dex files found for com.anuntis.fotocasa
Error: classes.dex file is not in com.brave.browser.apk for com.brave.browser
Error: No classes.dex file found for com.brave.browser
Error: More than one classes.dex files found for com.cyberlink.youcammakeup
Error: More than one classes.dex files found for com.ecosia.android
Error: More than one classes.dex files found for com.eisterhues_media_2
Error: classes.dex file is not in com.imo.android.imoim.apk for com.imo.android.imoim
Error: No classes.dex file found for com.imo.android.imoim
Error: classes.dex file is not in com.imo.android.imoimhd.apk for com.imo.android.imoimhd
Error: No classes.dex file found for com.imo.android.imoimhd
Error: More than one classes.dex files found for com.kiwibrowser.browser
Error: classes.dex file is not in com.lenovo.anyshare.gps.apk for com.lenovo.anyshare.gps
Error: No classes.dex file found for com.lenovo.anyshare.gps
Error: More than one classes.dex files found for com.limebike
Error: More than one classes.dex files found for com.myfitnesspal.android
Error: classes.dex file is not in com.psa.mym.mycitroen.apk for com.psa.mym.mycitroen
Error: No classes.dex file found for com.psa.mym.mycitroen
Error: classes.dex file is not in com.psa.mym.myopel.apk for com.psa.mym.myopel
Error: No classes.dex file found for com.psa.mym.myopel
Error: classes.dex file is not in com.psa.mym.mypeugeot.apk for com.psa.mym.mypeugeot
Error: No classes.dex file found for com.psa.mym.mypeugeot
Error: More than one classes.dex files found for com.qidian.Int.reader
Error: More than one classes.dex files found for com.ubercab
Error: More than one classes.dex files found for com.viber.voip
Error: More than one classes.dex files found for com.wave.keyboard.theme.diamondanimatedkeyboard
Error: More than one classes.dex files found for com.wave.livewallpaper
Error: More than one classes.dex files found for com.zhiliaoapp.musically
Error: More than one classes.dex files found for cool.wallpapers.live.keyboard.steampunk.pipes
Error: More than one classes.dex files found for cyberpunk.wallpaper.live.keyboard.sci.fi
Error: classes.dex file is not in de.dasoertliche.android.apk for de.dasoertliche.android
Error: No classes.dex file found for de.dasoertliche.android
Error: classes.dex file is not in de.motain.iliga.apk for de.motain.iliga
Error: No classes.dex file found for de.motain.iliga
Error: classes.dex file is not in de.rewe.app.mobile.apk for de.rewe.app.mobile
Error: No classes.dex file found for de.rewe.app.mobile
Error: classes.dex file is not in sg.bigo.live.apk for sg.bigo.live
Error: No classes.dex file found for sg.bigo.live
Error: classes.dex file is not in video.like.apk for video.like
Error: No classes.dex file found for video.like
Error: classes.dex file is not in video.like.lite.apk for video.like.lite
Error: No classes.dex file found for video.like.lite

There seems to be a bug in the script, as all the "No classes.dex file found" errors that I've looked at are false positives.

baltpeter commented 1 year ago

Rather than trying to debug this (I didn't see any obvious bug), I just went through the results manually, especially since there were so few to begin with.

And indeed, only 17 apps violate our assumptions, all of them because more than one of their APKs contains a classes.dex. Here is the cleaned-up list:

Error: More than one classes.dex files found for bofamily.app
Error: More than one classes.dex files found for com.alibaba.aliexpresshd
Error: More than one classes.dex files found for com.anuntis.fotocasa
Error: More than one classes.dex files found for com.cyberlink.youcammakeup
Error: More than one classes.dex files found for com.ecosia.android
Error: More than one classes.dex files found for com.eisterhues_media_2
Error: More than one classes.dex files found for com.kiwibrowser.browser
Error: More than one classes.dex files found for com.limebike
Error: More than one classes.dex files found for com.myfitnesspal.android
Error: More than one classes.dex files found for com.qidian.Int.reader
Error: More than one classes.dex files found for com.ubercab
Error: More than one classes.dex files found for com.viber.voip
Error: More than one classes.dex files found for com.wave.keyboard.theme.diamondanimatedkeyboard
Error: More than one classes.dex files found for com.wave.livewallpaper
Error: More than one classes.dex files found for com.zhiliaoapp.musically
Error: More than one classes.dex files found for cool.wallpapers.live.keyboard.steampunk.pipes
Error: More than one classes.dex files found for cyberpunk.wallpaper.live.keyboard.sci.fi
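
One thing can be read directly off the script: the "classes.dex file is not in …" branch breaks out of the inner loop while `found` is still 0, so the final check always prints "No classes.dex file found" as well. That matches the raw output above, where every "not in" line is followed by a "No classes.dex file found" line for the same app. It does not explain why the "not in" branch was reached for those apps in the first place. The sketch below restates the check as a pure TypeScript function that looks at all APKs before returning a single verdict per app, so the result does not depend on file order. `ApkListing`, `Verdict` and `checkApp` are hypothetical names, and the lists of ZIP entries are assumed to be obtained elsewhere.

```typescript
// Hypothetical types and names for illustration; the entry lists are assumed inputs.
interface ApkListing {
    fileName: string; // e.g. "com.example.app.apk"
    entries: string[]; // names of the files inside the APK (a ZIP archive)
}

type Verdict = 'ok' | 'no-classes-dex' | 'multiple-classes-dex' | 'classes-dex-only-in-split';

/** Same assumption as the script: exactly one APK has a top-level classes.dex, and it is `<app ID>.apk`. */
function checkApp(appId: string, apks: ApkListing[]): Verdict {
    const withDex = apks.filter((apk) => apk.entries.indexOf('classes.dex') !== -1);

    if (withDex.length === 0) return 'no-classes-dex';
    if (withDex.length > 1) return 'multiple-classes-dex';
    return withDex[0].fileName === `${appId}.apk` ? 'ok' : 'classes-dex-only-in-split';
}

// Made-up example data, not taken from any of the apps listed above.
const exampleApks: ApkListing[] = [
    { fileName: 'com.example.app.apk', entries: ['AndroidManifest.xml', 'classes.dex'] },
    { fileName: 'config.de.1.com.example.app.apk', entries: ['AndroidManifest.xml', 'resources.arsc'] },
    { fileName: 'feature.1.com.example.app.apk', entries: ['AndroidManifest.xml', 'classes.dex'] },
];

console.log(checkApp('com.example.app', exampleApks)); // multiple-classes-dex
```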

baltpeter commented 1 year ago

These are indeed some annoying edge cases:

❯ for f in *.apk; do echo "$f:"; aapt dump badging $f | grep -E 'package|native|application-label|version'; echo "\n"; done
com.ecosia.android.apk:
package: name='com.ecosia.android' versionCode='303' versionName='4.4.1' compileSdkVersion='30' compileSdkVersionCodename='11'
application-label:'Ecosia'
native-code: 'x86'

config.de.303.com.ecosia.android.apk:
package: name='com.ecosia.android' versionCode='303' versionName='' split='config.de'

extra_icu.303.com.ecosia.android.apk:
package: name='com.ecosia.android' versionCode='303' versionName='4.4.1' split='extra_icu' compileSdkVersion='30' compileSdkVersionCodename='11'

❯ for f in *.apk; do echo "$f:"; aapt dump badging $f | grep -E 'package|native|application-label|version'; echo "\n"; done
com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='8.44.0' compileSdkVersion='30' compileSdkVersionCodename='11'
application-label:'AliExpress'

config.armeabi_v7a.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='config.armeabi_v7a'
native-code: 'armeabi-v7a'

config.de.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='config.de'

config.xxhdpi.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='config.xxhdpi'

wallet.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='8.44.0' split='wallet' compileSdkVersion='30' compileSdkVersionCodename='11'

wallet.config.armeabi_v7a.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='wallet.config.armeabi_v7a'
native-code: 'armeabi-v7a'

wallet.config.de.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='wallet.config.de'

wallet.config.xxhdpi.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='wallet.config.xxhdpi'

baltpeter commented 1 year ago

But given that this approach has a ~0.5% error rate (17 of 3313 apps), and that even in those error cases we produce a result no worse than with our previous approach, I will still go with it. Maybe we'll have a better idea in the future…
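
A minimal sketch of how such a selection with a fallback could look, under stated assumptions: prefer the single APK that contains a classes.dex, and in the edge cases fall back to returning some APK from the list, which is what the earlier "take any APK" approach amounts to. `pickBaseApk` and the `hasClassesDex` callback are hypothetical. How the ZIP is actually inspected is left open, and this is not necessarily how appstraction implements it.

```typescript
// Hypothetical helper for illustration; `hasClassesDex` stands in for an actual ZIP lookup.
function pickBaseApk(apkPaths: string[], hasClassesDex: (apkPath: string) => boolean): string | undefined {
    const candidates = apkPaths.filter(hasClassesDex);

    // Expected case: exactly one APK contains classes.dex.
    if (candidates.length === 1) return candidates[0];

    // Edge cases: with several candidates, take the first one that contains code;
    // with none, take the first APK (undefined if the list is empty).
    return candidates.length > 1 ? candidates[0] : apkPaths[0];
}

// Made-up file names and a fake lookup, just to exercise the function.
const splitApkPaths = ['config.de.1.com.example.app.apk', 'com.example.app.apk'];
const fakeLookup = (apkPath: string) => apkPath === 'com.example.app.apk';

console.log(pickBaseApk(splitApkPaths, fakeLookup)); // com.example.app.apk
```

Passing the lookup in as a callback keeps the selection logic testable without real APK files.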

baltpeter commented 1 year ago

> For the bundle formats, that isn't a problem (all of them conveniently call the main APK base.apk).

Turns out that isn't quite true, unfortunately. *sigh* I've seen one XAPK where the main APK was named <app ID>.apk.

EDIT: Actually, I think that's always the case. I just misread the issue.

baltpeter commented 1 year ago

Ugh. You know what? I'll just do this properly, then. So: