
LibScout: Third-party library detector for Java/Android apps
Apache License 2.0

LibScout

LibScout is a lightweight and effective static analysis tool to detect third-party libraries in Android/Java apps. The detection is resilient against common bytecode obfuscation techniques such as identifier renaming, and against code-based obfuscations such as reflection-based API hiding or control-flow randomization. Further, LibScout is capable of pinpointing exact library versions, including versions that contain severe bugs or security issues.

LibScout requires the original library SDKs (compiled .jar/.aar files) to extract library profiles that can be used for detection on Android apps. Pre-generated library profiles are hosted at the repository LibScout-Profiles.

Unique detection features:

Over time LibScout has been extended to perform additional analyses both on library SDKs and detected libraries in apps:

In addition, there is an Android Studio extension, up2dep, that integrates the API compatibility information into the IDE to help developers keep their dependencies up-to-date (and more).

Library History Scraper (./scripts)

The scripts directory contains a library-scraper Python script to automatically download original library SDKs, including complete version histories, from Maven Central, JCenter and custom Maven repositories. The original library SDKs can be used to generate profiles and to conduct library API compatibility analyses (see modules below). Use the library-profile-generator script to conveniently generate profiles at scale.

The scrapers need to be configured with a JSON config that includes metadata of the libraries to be fetched (name, repo, groupid, artefactid). The scripts/library-specs directory contains config files to retrieve over 100 libraries from Maven Central and a config to download Amazon libraries and Android libraries from Google's Maven repository (350 libraries, including support, gms, ktx, jetpack, etc.).
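As a rough illustration of how such metadata maps to a download location, the sketch below builds an artifact URL using the standard Maven 2 repository layout. The function name `maven_artifact_url` and the shape of the `spec` dictionary are assumptions for illustration; the key names mirror the metadata listed above, but the actual config schema of the scraper may differ.

```python
def maven_artifact_url(repo_url: str, group_id: str, artifact_id: str,
                       version: str, packaging: str = "jar") -> str:
    """Build the URL of one artifact in a repository with Maven 2 layout."""
    # Dots in the group id become path separators in the repository layout.
    group_path = group_id.replace(".", "/")
    base = repo_url.rstrip("/")
    return (f"{base}/{group_path}/{artifact_id}/{version}/"
            f"{artifact_id}-{version}.{packaging}")


# Hypothetical library entry (not copied from scripts/library-specs).
spec = {
    "name": "OkHttp",
    "repo": "https://repo1.maven.org/maven2",
    "groupid": "com.squareup.okhttp3",
    "artefactid": "okhttp",
}

url = maven_artifact_url(spec["repo"], spec["groupid"], spec["artefactid"], "3.2.0")
print(url)
```

Android libraries are typically published as .aar files, in which case `packaging="aar"` would be passed instead.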

NEW (07/30/19): Added list of 45 ad/tracking libraries with currently 1182 versions (trackers.json).

Detecting (vulnerable) library versions

Ready-to-use library profiles and library meta-data can be found in the repository LibScout-Profiles.

LibScout has built-in functionality to report library versions with the following security vulnerabilities.
The pre-generated profiles for vulnerable versions are tagged with [SECURITY], patches with [SECURITY-FIX].
This information is encoded in the library.xml files that have been used to generate the profiles. We try to update the list/profiles whenever we encounter new security issues. If you can share information, please let us know.

| Library | Version(s) | Fix Version | Vulnerability | Link |
|---|---|---|---|---|
| Airpush | < 8.1 | > 8.1 | Unsanitized default WebView settings | Link |
| Apache CC | 3.2.1 / 4.0 | 3.2.2 / 4.1 | Deserialization vulnerability | Link |
| Dropbox | 1.5.4 - 1.6.1 | 1.6.2 | DroppedIn vulnerability | Link |
| Facebook | 3.15 | 3.16 | Account hijacking vulnerability | Link |
| MoPub | < 4.4.0 | 4.4.0 | Unsanitized default WebView settings | Link |
| OkHttp | 2.1 - 2.7.4 / 3.0.0 - 3.1.2 | 2.7.5 / 3.2.0 | Certificate pinning bypass | Link |
| Plexus Archiver | < 3.6.0 | 3.6.0 | Zip Slip vulnerability | Link |
| SuperSonic | < 6.3.5 | 6.3.5 | Unsafe functionality exposure via JS | Link |
| Vungle | < 3.3.0 | 3.3.0 | MitM attack vulnerability | Link |
| ZeroTurnaround | < 1.13 | 1.13 | Zip Slip vulnerability | Link |
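To make the version ranges concrete, here is a minimal sketch that checks whether a detected version falls into one of the OkHttp ranges listed above. This is not how LibScout flags versions (it relies on the [SECURITY] tags in the profiles); the helper names are hypothetical, and the code assumes purely numeric, dot-separated version strings.

```python
def parse_version(version: str, width: int = 3) -> tuple:
    """Turn '2.7' into (2, 7, 0) so that versions compare component-wise."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def is_vulnerable(version: str, ranges) -> bool:
    """Return True if version lies inside any inclusive (low, high) range."""
    v = parse_version(version)
    return any(parse_version(low) <= v <= parse_version(high)
               for low, high in ranges)


# Inclusive ranges taken from the OkHttp entry.
OKHTTP_VULNERABLE = [("2.1", "2.7.4"), ("3.0.0", "3.1.2")]

print(is_vulnerable("2.7.4", OKHTTP_VULNERABLE))  # True
print(is_vulnerable("2.7.5", OKHTTP_VULNERABLE))  # False
```

Versions with qualifiers (for example release candidates) would need a more careful parser than this one.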

Identified Issues

On our last scan of free apps on Google Play (05/25/2017), LibScout detected >20k apps still containing one of these vulnerable lib versions. The findings have been reported to Google's ASI program. Unfortunately, the report seemed to be ignored. In consequence, we manually notified many app developers.

Among others, McAfee published a Security Advisory for one of their apps.

LibScout 101

Library Profiling (-o profile)

This module generates unique library fingerprints from original library SDKs (.jar and .aar files are supported). These profiles can subsequently be used to test whether the respective library versions are included in apps. Each library file additionally requires a library.xml that contains metadata (e.g. name, version). A template can be found in the assets directory. For your convenience, you can use the library scraper (./scripts) to download full library histories from Maven repositories. By default, LibScout generates hashtree-based profiles with package and class information (omitting methods).

java -jar LibScout.jar -o profile [-a android_sdk_jar] -x path_to_library_xml path_to_library_file
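The general idea behind a hash tree over packages and classes can be sketched as follows. This is a simplified illustration, not LibScout's actual profile format: the choice of SHA-256, the function names, and the use of name-free method descriptors as class input are all assumptions. Because no identifiers enter the hashes and children are sorted before hashing, renaming or reordering does not change the result.

```python
import hashlib
from typing import Iterable, List


def _digest(parts: Iterable[str]) -> str:
    """Hash a collection of strings independently of their order."""
    hasher = hashlib.sha256()
    for part in sorted(parts):
        hasher.update(part.encode("utf-8"))
        hasher.update(b"\x00")  # separator so adjacent parts cannot merge
    return hasher.hexdigest()


def class_hash(method_descriptors: List[str]) -> str:
    # Assumed input: descriptors without method names, e.g. "(int)void".
    return _digest(method_descriptors)


def package_hash(classes: List[List[str]]) -> str:
    return _digest(class_hash(c) for c in classes)


def library_hash(packages: List[List[List[str]]]) -> str:
    return _digest(package_hash(p) for p in packages)


original = [[["(int)void", "()java.lang.String"], ["()void"]],
            [["(long,long)boolean"]]]
reordered = [[["(long,long)boolean"]],
             [["()void"], ["()java.lang.String", "(int)void"]]]

print(library_hash(original) == library_hash(reordered))  # True
```

Keeping the intermediate package and class hashes, rather than only the root, is what would allow partial matches when an app contains only parts of a library.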

Library Detection (-o match)

Detects libraries in apps using pre-generated profiles. Optionally, LibScout also conducts an API usage analysis for detected libraries, i.e. which library APIs are used by the app or by other libraries (-u switch).
Analysis results can be written in different formats.

  1. The JSON format (-j switch) creates subfolders in the specified directory that follow the app package name, i.e. *com.foo* creates *com/foo* subfolders. This is useful when dealing with a large number of apps. For details about the stored information, please refer to the JSON output specification.
  2. The serialization option (-s switch) writes stat files per app to disk (deprecated).

java -jar LibScout.jar -o match -p path_to_profiles [-a android_sdk_jar] [-u] [-j json_dir] [-m] [-d log_dir] path_to_app(s)
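When post-processing results, the package-to-subfolder mapping of the JSON output can be reproduced with a small helper. The function name `json_subdir` is hypothetical, and the names of the files inside each subfolder are not specified here, so the sketch only computes the directory.

```python
from pathlib import PurePosixPath


def json_subdir(json_dir: str, app_package: str) -> PurePosixPath:
    """Map an app package such as 'com.foo' to '<json_dir>/com/foo'."""
    return PurePosixPath(json_dir).joinpath(*app_package.split("."))


print(json_subdir("results", "com.foo"))  # results/com/foo
```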

Library API compatibility analysis (-o lib_api_analysis)

Analyzes changes in the documented (public) API sets of library versions.
The analysis results currently include the following information:

Compliance with Semantic Versioning (SemVer), i.e. whether the change in the version string between consecutive versions (expected SemVer) matches the changes in the respective public API sets (actual SemVer). Results further include statistics about changes in API sets (additions/removals/modifications). For removed APIs, LibScout additionally tries to infer alternative APIs (based on different features).
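The comparison of expected and actual SemVer can be sketched as below. This is a simplified model under stated assumptions, not LibScout's implementation: APIs are represented as plain signature strings, a modified API shows up as one removal plus one addition, and version strings are numeric and dot-separated. Following the SemVer rules, any removal counts as a major change, additions alone as a minor change, and no API change as a patch.

```python
def _parts(version: str) -> list:
    """Split '2.7' into [2, 7, 0]; missing components default to zero."""
    nums = [int(p) for p in version.split(".")]
    return (nums + [0, 0, 0])[:3]


def expected_semver(old: str, new: str) -> str:
    """Change level announced by the version strings."""
    o, n = _parts(old), _parts(new)
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"


def actual_semver(old_api: set, new_api: set) -> str:
    """Change level implied by the difference of the public API sets."""
    if old_api - new_api:      # something was removed or modified
        return "major"
    if new_api - old_api:      # only additions
        return "minor"
    return "patch"


old_api = {"Client.get(String)", "Client.close()"}
new_api = {"Client.get(String)"}  # close() was removed

expected = expected_semver("2.7.4", "2.7.5")
actual = actual_semver(old_api, new_api)
print(expected, actual, expected == actual)  # patch major False
```

In this example the version string announces a patch release although an API was removed, which is the kind of mismatch the analysis reports.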

For the analysis, you have to provide a path to the original library SDKs. LibScout recursively searches for library jars|aars (leaf directories are expected to have at most one jar|aar file and one library.xml file). For your convenience use the library scraper. Analysis results are written to disk in JSON format (-j switch).

java -jar LibScout.jar -o lib_api_analysis [-a android_sdk_jar] [-j json_dir] path_to_lib_sdks
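Before starting an analysis run, it can help to check that the SDK directory follows the expected layout. The sketch below is a hypothetical helper (not part of LibScout) that walks a directory tree and reports leaf directories containing exactly one .jar or .aar file together with a library.xml.

```python
import os


def find_library_dirs(root: str) -> list:
    """Return sorted (sdk_file, library_xml) pairs found in leaf directories."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirnames:
            continue  # not a leaf directory
        sdks = [f for f in filenames if f.endswith((".jar", ".aar"))]
        if len(sdks) == 1 and "library.xml" in filenames:
            found.append((os.path.join(dirpath, sdks[0]),
                          os.path.join(dirpath, "library.xml")))
    return sorted(found)
```

Leaf directories that are skipped (for example because they contain two jars or lack a library.xml) would be worth logging in a real check.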

Library Updatability analysis (-o updatability)

This mode is an extension to the match mode. It first detects library versions in the provided apps and conducts a library usage analysis (-u is implied). In addition, it requires library API compatibility data (via the -l switch) as generated in the lib_api_analysis mode. Based on the library API usage in the app and the compatibility information, LibScout determines the highest version that is still compatible with the set of used library APIs.
Note: The new implementation still lacks some features, e.g. the results are currently logged but not yet written to JSON. See the code comments for more information.

java -jar LibScout.jar -o updatability [-a android_sdk_jar] [-j json_dir] -l lib_api_data_dir path_to_app(s)
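The core decision of this mode can be illustrated with a small sketch: given the APIs an app uses and the public API set of each library version, pick the highest version that still offers all used APIs. The function name and the data shapes are assumptions for illustration; the actual compatibility data format generated by lib_api_analysis is not shown here.

```python
from typing import Dict, Optional, Set


def highest_compatible_version(used_apis: Set[str],
                               api_sets: Dict[str, Set[str]]) -> Optional[str]:
    """Return the highest version whose API set contains all used APIs."""
    def sort_key(version: str) -> tuple:
        return tuple(int(p) for p in version.split("."))

    compatible = [v for v, apis in api_sets.items() if used_apis <= apis]
    return max(compatible, key=sort_key) if compatible else None


# Hypothetical API sets of three versions of one library.
api_sets = {
    "1.0": {"a()", "b()"},
    "1.1": {"a()", "b()", "c()"},
    "2.0": {"a()", "c()"},  # b() was removed in 2.0
}

print(highest_compatible_version({"a()", "b()"}, api_sets))  # 1.1
print(highest_compatible_version({"a()"}, api_sets))         # 2.0
```

An app that only calls `a()` could move to 2.0, while an app that also calls `b()` can update no further than 1.1 without code changes.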

Scientific Publications

For technical details and large-scale evaluation results, please refer to our publications:

If you use LibScout in a scientific publication, we would appreciate citations using these Bibtex entries: [bib-ccs16] [bib-ccs17]