Closed baltpeter closed 1 year ago
Turns out this is more important than I thought: Without this, CA will refuse to start an app analysis on the custom bundle formats.
While I'm at it, I also wanted to add support for passing an array of split APKs to parseAppMeta() (so we take AppPath<Platform>). The annoying problem here is figuring out which of the APKs is the base APK.
In CA, we currently assume that we can just take any of the APKs and get the same information for all of them:
But that isn't true. The base APK has more information than the splits (including information that we do return):
❯ for f in *.apk; do echo "$f:"; aapt dump badging $f | grep -E 'package|native|application-label|version'; echo "\n"; done
config.de.1122090047.de.check24.reisen.apk:
package: name='de.check24.reisen' versionCode='1122090047' versionName='' split='config.de'
config.x86.1122090047.de.check24.reisen.apk:
package: name='de.check24.reisen' versionCode='1122090047' versionName='' split='config.x86'
native-code: 'x86'
config.xxhdpi.1122090047.de.check24.reisen.apk:
package: name='de.check24.reisen' versionCode='1122090047' versionName='' split='config.xxhdpi'
de.check24.reisen.apk:
package: name='de.check24.reisen' versionCode='1122090047' versionName='2022.9.0' compileSdkVersion='31' compileSdkVersionCodename='12'
application-label:'Reisen'
For the bundle formats, that isn't a problem (all of them conveniently call the main APK base.apk). But how do I find the main APK from an array of splits without parsing all of them?
Bing Chat suggested:
The base APK should have a classes.dex file that contains the app’s code, while the split APKs should only have resources or manifest file.
Let's test that. Bing Chat was also kind enough to write me this script:
#!/bin/zsh
# Check if a folder is provided as an argument
if [ -z "$1" ]; then
  echo "Please provide a folder name."
  exit 1
fi
# Loop through each app folder
for app in $1/*; do
  # Check if the app folder is a directory
  if [ -d "$app" ]; then
    # Get the bundle ID from the app folder name
    bundle_id=${app##*/}
    # Initialize a flag to indicate if a classes.dex file is found
    found=0
    # Loop through each apk file in the app folder
    for apk in $app/*.apk; do
      # Check if the apk file contains a classes.dex file using unzip -l
      unzip -l $apk classes.dex > /dev/null 2>&1
      # If it does, check the flag and the apk file name
      if [ $? -eq 0 ]; then
        # If the flag is already set, it means there are more than one classes.dex files
        if [ $found -eq 1 ]; then
          echo "Error: More than one classes.dex files found for $bundle_id"
          break
        # If the flag is not set, check if the apk file name matches the bundle ID
        else
          # If it does, set the flag to 1 and continue
          if [ ${apk##*/} = "$bundle_id.apk" ]; then
            found=1
            continue
          # If it does not, it means the classes.dex file is in the wrong apk file
          else
            echo "Error: classes.dex file is not in $bundle_id.apk for $bundle_id"
            break
          fi
        fi
      fi
    done
    # After looping through all apk files, check if the flag is still 0, which means no classes.dex file is found
    if [ $found -eq 0 ]; then
      echo "Error: No classes.dex file found for $bundle_id"
    fi
  fi
done
Result of running that against 3313 apps:
Error: More than one classes.dex files found for bofamily.app
Error: More than one classes.dex files found for com.alibaba.aliexpresshd
Error: More than one classes.dex files found for com.anuntis.fotocasa
Error: classes.dex file is not in com.brave.browser.apk for com.brave.browser
Error: No classes.dex file found for com.brave.browser
Error: More than one classes.dex files found for com.cyberlink.youcammakeup
Error: More than one classes.dex files found for com.ecosia.android
Error: More than one classes.dex files found for com.eisterhues_media_2
Error: classes.dex file is not in com.imo.android.imoim.apk for com.imo.android.imoim
Error: No classes.dex file found for com.imo.android.imoim
Error: classes.dex file is not in com.imo.android.imoimhd.apk for com.imo.android.imoimhd
Error: No classes.dex file found for com.imo.android.imoimhd
Error: More than one classes.dex files found for com.kiwibrowser.browser
Error: classes.dex file is not in com.lenovo.anyshare.gps.apk for com.lenovo.anyshare.gps
Error: No classes.dex file found for com.lenovo.anyshare.gps
Error: More than one classes.dex files found for com.limebike
Error: More than one classes.dex files found for com.myfitnesspal.android
Error: classes.dex file is not in com.psa.mym.mycitroen.apk for com.psa.mym.mycitroen
Error: No classes.dex file found for com.psa.mym.mycitroen
Error: classes.dex file is not in com.psa.mym.myopel.apk for com.psa.mym.myopel
Error: No classes.dex file found for com.psa.mym.myopel
Error: classes.dex file is not in com.psa.mym.mypeugeot.apk for com.psa.mym.mypeugeot
Error: No classes.dex file found for com.psa.mym.mypeugeot
Error: More than one classes.dex files found for com.qidian.Int.reader
Error: More than one classes.dex files found for com.ubercab
Error: More than one classes.dex files found for com.viber.voip
Error: More than one classes.dex files found for com.wave.keyboard.theme.diamondanimatedkeyboard
Error: More than one classes.dex files found for com.wave.livewallpaper
Error: More than one classes.dex files found for com.zhiliaoapp.musically
Error: More than one classes.dex files found for cool.wallpapers.live.keyboard.steampunk.pipes
Error: More than one classes.dex files found for cyberpunk.wallpaper.live.keyboard.sci.fi
Error: classes.dex file is not in de.dasoertliche.android.apk for de.dasoertliche.android
Error: No classes.dex file found for de.dasoertliche.android
Error: classes.dex file is not in de.motain.iliga.apk for de.motain.iliga
Error: No classes.dex file found for de.motain.iliga
Error: classes.dex file is not in de.rewe.app.mobile.apk for de.rewe.app.mobile
Error: No classes.dex file found for de.rewe.app.mobile
Error: classes.dex file is not in sg.bigo.live.apk for sg.bigo.live
Error: No classes.dex file found for sg.bigo.live
Error: classes.dex file is not in video.like.apk for video.like
Error: No classes.dex file found for video.like
Error: classes.dex file is not in video.like.lite.apk for video.like.lite
Error: No classes.dex file found for video.like.lite
There seems to be a bug in the script, as all the "No classes.dex file found" errors that I've looked at are false positives.
Rather than try to debug this (I didn't see any obvious bug), I just went manually through the results (especially considering how few there were to begin with).
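A note on the spurious lines: in the script above, the branch that reports `classes.dex file is not in ...` leaves the inner loop with `break` while `found` is still 0, so the check after the loop also prints `No classes.dex file found` for the same app. That matches the output, where every `No classes.dex file found` line directly follows a `not in` line for the same bundle ID. Below is a minimal sketch of a classification that avoids this by looking at all matches first and reporting once. `classify_dex_apks` and its result labels are hypothetical names, and the function expects the basenames of the APKs that were already found to contain `classes.dex`:

```shell
# Hypothetical helper: classify one app from the names of its APKs that
# contain classes.dex. Prints exactly one label per app.
# Usage: classify_dex_apks <bundle_id> [apk_basename...]
classify_dex_apks() {
    local bundle_id=$1
    shift
    local count=$#
    local base_has_dex=0
    local name
    for name in "$@"; do
        # The base APK is assumed to be named <bundle_id>.apk, as in the script above.
        if [ "$name" = "$bundle_id.apk" ]; then
            base_has_dex=1
        fi
    done
    if [ "$count" -eq 0 ]; then
        echo "none"          # no APK contains classes.dex
    elif [ "$base_has_dex" -eq 0 ]; then
        echo "not-in-base"   # classes.dex exists, but not in <bundle_id>.apk
    elif [ "$count" -gt 1 ]; then
        echo "multiple"      # base APK plus at least one split contain classes.dex
    else
        echo "ok"            # only the base APK contains classes.dex
    fi
}
```

The list of names could be collected with the same `unzip -l "$apk" classes.dex` check the script already uses, appending `${apk##*/}` whenever it succeeds.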
And indeed, there are only 17 apps that violate our assumptions, and all of them because there are multiple split APKs with classes.dex. Here is the cleaned up list:
Error: More than one classes.dex files found for bofamily.app
Error: More than one classes.dex files found for com.alibaba.aliexpresshd
Error: More than one classes.dex files found for com.anuntis.fotocasa
Error: More than one classes.dex files found for com.cyberlink.youcammakeup
Error: More than one classes.dex files found for com.ecosia.android
Error: More than one classes.dex files found for com.eisterhues_media_2
Error: More than one classes.dex files found for com.kiwibrowser.browser
Error: More than one classes.dex files found for com.limebike
Error: More than one classes.dex files found for com.myfitnesspal.android
Error: More than one classes.dex files found for com.qidian.Int.reader
Error: More than one classes.dex files found for com.ubercab
Error: More than one classes.dex files found for com.viber.voip
Error: More than one classes.dex files found for com.wave.keyboard.theme.diamondanimatedkeyboard
Error: More than one classes.dex files found for com.wave.livewallpaper
Error: More than one classes.dex files found for com.zhiliaoapp.musically
Error: More than one classes.dex files found for cool.wallpapers.live.keyboard.steampunk.pipes
Error: More than one classes.dex files found for cyberpunk.wallpaper.live.keyboard.sci.fi
These are indeed some annoying edge cases:
❯ for f in *.apk; do echo "$f:"; aapt dump badging $f | grep -E 'package|native|application-label|version'; echo "\n"; done
com.ecosia.android.apk:
package: name='com.ecosia.android' versionCode='303' versionName='4.4.1' compileSdkVersion='30' compileSdkVersionCodename='11'
application-label:'Ecosia'
native-code: 'x86'
config.de.303.com.ecosia.android.apk:
package: name='com.ecosia.android' versionCode='303' versionName='' split='config.de'
extra_icu.303.com.ecosia.android.apk:
package: name='com.ecosia.android' versionCode='303' versionName='4.4.1' split='extra_icu' compileSdkVersion='30' compileSdkVersionCodename='11'
❯ for f in *.apk; do echo "$f:"; aapt dump badging $f | grep -E 'package|native|application-label|version'; echo "\n"; done
com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='8.44.0' compileSdkVersion='30' compileSdkVersionCodename='11'
application-label:'AliExpress'
config.armeabi_v7a.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='config.armeabi_v7a'
native-code: 'armeabi-v7a'
config.de.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='config.de'
config.xxhdpi.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='config.xxhdpi'
wallet.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='8.44.0' split='wallet' compileSdkVersion='30' compileSdkVersionCodename='11'
wallet.config.armeabi_v7a.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='wallet.config.armeabi_v7a'
native-code: 'armeabi-v7a'
wallet.config.de.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='wallet.config.de'
wallet.config.xxhdpi.418.com.alibaba.aliexpresshd.apk:
package: name='com.alibaba.aliexpresshd' versionCode='418' versionName='' split='wallet.config.xxhdpi'
But given that this approach has an error rate of only ~0.5% (17 of 3313 apps) and that, even in those cases, the result is no worse than with our previous approach, I will still go with it. Maybe we'll have a better idea in the future…
For the bundle formats, that isn't a problem (all of them conveniently call the main APK base.apk).
Turns out that isn't quite true, unfortunately. Sigh. I've seen one XAPK where the main APK was named <app ID>.apk.
EDIT: Actually, I think that's always the case. I just misread the issue.
Ugh. You know what? I'll just do this properly, then. So:
split
set.base.apk
and parse that.
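The `aapt dump badging` outputs earlier in this thread suggest one way to tell the base APK apart: every split carries a `split='...'` attribute on its `package:` line, and the base APK does not. Here is a minimal sketch under that assumption (not necessarily what ends up being implemented). `is_base_package_line` is a hypothetical helper that reads badging output on stdin:

```shell
# Hypothetical helper: succeeds if the badging output on stdin belongs to a
# base APK, i.e. its `package:` line has no split='...' attribute.
# Assumption: only split APKs carry that attribute, as in the outputs above.
is_base_package_line() {
    local line
    # Take the first `package:` line; fail if there is none.
    line=$(grep -m1 '^package:') || return 1
    case "$line" in
        *" split='"*) return 1 ;;  # split APK
        *) return 0 ;;             # base APK
    esac
}
```

It could be used as `aapt dump badging "$f" | is_base_package_line && echo "$f"` inside the same `for f in *.apk` loop as above. Unlike the `classes.dex` check, this would also classify `extra_icu.303.com.ecosia.android.apk` as a split, at the cost of parsing every APK.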
Not terribly important, but since we now support installing custom APK bundle formats (#64), we should also support parsing metadata for them.