privacy-tech-lab / gpc-android

Code and dynamic analysis scripts for GPC on Android
https://privacytechlab.org/
MIT License

Does the AdId deletion work? #56

Closed n-aggarwal closed 1 year ago

n-aggarwal commented 1 year ago

Deleting the AdId means that developers cannot query that value anymore; however, it is possible that apps keep a copy of the AdId in their own storage and continue using it. To check whether this happens, I will download 10 apps from different categories, apply apk-mitm to them, and then observe the traffic.

The process will be as follows: Initially, I will leave the AdId enabled and browse through the apps. Then, after sufficient time has been spent, I will delete the AdId and continue using the apps, observing the network traffic with PCAPdroid to see whether each app still transmits the AdId.

Given below is the list of 10 apps I plan to install for this experiment:

Potential apps to download:

| App Name | Category | Number of Users |
| --- | --- | --- |
| Idle Miner Tycoon | Game | 100 Million + |
| NewsBreak | News | 50 Million + |
| Spotify (been using from before) | Music | 1 Billion + |
| Picsart AI Photo Editor | Photo | 1 Billion + |
| Instagram (been using from before) | Social | 1 Billion + |
| Amazon Shopping | Shopping | 500 Million + |
| Yahoo Sports | Sports | 10 Million + |
| The Weather Channel | Weather | 100 Million |
| Tubi | Entertainment | 100 Million + |

If you have any suggestions let me know.

kasnder commented 1 year ago

I would make sure to 'force quit' the app after deleting the AdId. Otherwise, this looks good!

n-aggarwal commented 1 year ago

Unfortunately I wasn't able to use the list of apps given above for this experiment because of various issues regarding installing the apps including the files being xapks, and mitm-proxy errors. So here is a list of updated apps that I now have:

| App Name | Category | Number of Users |
| --- | --- | --- |
| Cut the Rope | Game | 100 Million + |
| NewsBreak | News | 50 Million + |
| iHeartRadio | Music | 50 Million + |
| PowerDirector - Video Editor | Photo | 100 Million + |
| Amazon Shopping | Shopping | 500 Million + |
| Fox Sports | Sports | 10 Million + |
| The Weather Channel | Weather | 100 Million + |
| Twitch | Entertainment | 100 Million + |

I wasn't able to find an app in the social category that would work. Most of them threw errors when running apk-mitm. The only one I could successfully patch with apk-mitm was Facebook Lite, but even after installing it I was unable to access its network data. From what I read, one possible reason for this is certificate pinning. Is there any easy way around that?

n-aggarwal commented 1 year ago

I have the AdId turned on right now and I can see the network data. But there are a lot of different packets being sent to and from the app. So far, I am unable to see where and when the app is transmitting the AdId. I have attached screenshots of what it looks like on my phone:

[Screenshots: front page; All App Activity; one of the network conversations; another one of the network conversations]

The network requests shown above are from the time I started the app till I watched through an Ad (I also clicked on it). I am unsure of which network requests I need to look at and where exactly the AdId would be in the requests.

kasnder commented 1 year ago

This is a great start! I'm glad you've been able to look into some requests. @n-aggarwal

As far as I recall, there's an option in that app to export network traffic. This could then be analysed on a computer.

Alternatively, it's probably good to set up a running version of mitmproxy. There are plenty of tutorials online to get you started on this one. However, make sure you don't use the integrated proxy setting of Android since apps can choose to ignore it.

For example, you can use my TC Slim app for that purpose (but not the full app since that blocks tracking), and set a SOCKS5 proxy in its settings (and likewise configure mitmproxy to run on a SOCKS5 proxy). There do exist more solid apps out there, too.

n-aggarwal commented 1 year ago

I was able to get the PCAPdroid data into Wireshark and decrypt it.

To get the data into Wireshark, I used the command: curl -NLs http://172.21.77.100:8080 | wireshark -k -i - where 172.21.77.100:8080 is the address and port provided by PCAPdroid. Additionally, I have to be on the same local network as the device running PCAPdroid.

This should let you see the packets in Wireshark as they are captured. Once you stop the capture, PCAPdroid provides an sslkeylog.txt file that can be used to decrypt the data. To do so, first find a packet with TLS encryption, right-click on it > Protocol Preferences > Transport Layer Security > (Pre)-Master-Secret log filename, and then choose the sslkeylog.txt file.

Finally, to see only the HTTP requests, we can apply the following display filter in Wireshark: http or http2. This shows all the decrypted HTTP packets that the app sent or received.
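The decryption and filtering steps above can also be scripted with tshark, Wireshark's command-line companion. The sketch below only builds the command line and does not run it. It assumes tshark 3.0 or later is installed (where the TLS key log preference is named tls.keylog_file), and the file names capture.pcapng and sslkeylog.txt are placeholders rather than paths from this experiment.

```python
import shlex


def build_tshark_command(capture_path, keylog_path, display_filter="http or http2"):
    """Return the argv list for reading a saved capture with TLS decryption."""
    return [
        "tshark",
        "-r", capture_path,                      # read packets from a saved capture file
        "-o", f"tls.keylog_file:{keylog_path}",  # same preference as "(Pre)-Master-Secret log filename"
        "-Y", display_filter,                    # display filter, as typed into the Wireshark filter bar
    ]


# Placeholder file names (assumptions for illustration only).
command = build_tshark_command("capture.pcapng", "sslkeylog.txt")
print(shlex.join(command))
```

The printed command can be pasted into a terminal, or the list can be handed to subprocess.run once the capture and key log files exist.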

Now the problem has reduced to learning to use Wireshark. I will try and get familiar with it over the next week.

n-aggarwal commented 1 year ago

I was able to get the data to the application level in Wireshark and then analyze it. There are a couple of steps involved in doing this:

  1. Enable "PCAPdroid Trailer" in the PCAPdroid settings and use the "HTTP Server" mode to dump the traffic.
  2. Download the pcapdroid.lua plugin for Wireshark.
  3. Create a directory for custom plugins in Wireshark: open Wireshark > About Wireshark > Folders > Personal Lua Plugins. Double-click on Personal Lua Plugins and it will ask whether you want to create a new folder; choose yes, and then navigate to it (double-clicking it again opens the folder).
  4. Install the plugin into that directory.
  5. Go to Analyze > Reload Lua Plugins, or restart Wireshark, to activate the installed plugin.
  6. Import the pcapng file and decrypt it using the sslkeylog.txt file as described above.
  7. Filter to see only http or http2 packets.
  8. Each packet now contains a PCAPdroid section. Expand that section and there will be an App Name field. Drag it to the top of Wireshark to make it a new column.
  9. Now all the packets will have an app associated with them.
  10. To perform the AdId analysis, go to Edit > Find Packet, change the search type from Display filter to String, and search for the AdId.
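The string search in the last step can be approximated in a script once the decrypted payloads have been exported as text. The following is a minimal sketch under several assumptions: the (app name, payload text) records are a hypothetical export format, not something PCAPdroid or Wireshark produces directly, and the hashed variants are included only on the assumption that some SDKs might send a hash of the ID instead of the ID itself.

```python
import hashlib


def adid_variants(adid):
    """Return lowercase forms of an AdId worth searching for, keyed by label."""
    lower = adid.lower()
    variants = {
        "plain": lower,
        "no_hyphens": lower.replace("-", ""),
    }
    # Assumption: an SDK might transmit a hash of the ID rather than the ID.
    for name in ("md5", "sha1", "sha256"):
        variants[name] = hashlib.new(name, lower.encode("ascii")).hexdigest()
    return variants


def find_adid(records, adid):
    """Scan (app_name, payload_text) pairs; return {app_name: set of matched labels}."""
    variants = adid_variants(adid)
    hits = {}
    for app_name, payload in records:
        haystack = payload.lower()  # case-insensitive match
        for label, needle in variants.items():
            if needle in haystack:
                hits.setdefault(app_name, set()).add(label)
    return hits
```

Apps that never appear in the result did not send the ID in any of the searched forms, which is not proof that they did not send it in some other encoding.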

Here are a few screenshots of what it looks like:

Screenshot 2023-05-22 at 10 15 21 PM Screenshot 2023-05-22 at 9 56 55 PM

kasnder commented 1 year ago

Cool stuff. Fantastic work!

SebastianZimmeck commented 1 year ago

@n-aggarwal will provide a summary of what he observes. This question is going in the direction of where we think our paper will go. But we may need to open more specific issues and work in more detail. So, @n-aggarwal will close this one as a preliminary exploration.

n-aggarwal commented 1 year ago

Today, I captured the network data of the apps with the AdId enabled. Most of the results were as expected, but there were a few interesting things. I have attached screenshots below to share my findings:

Key:

For reference my adid is: 4a5c227e-24b6-45fc-8efb-c780ce34af24

Cut the Rope

Screenshot 2023-06-02 at 11 51 43 PM


Nothing special going on here.

Fox Sports

Screenshot 2023-06-02 at 11 57 22 PM

Now that I am looking at this a second time, I see that the app did fingerprint my device. This can be seen in the third header field from the top.

Amazon Shopping

Screenshot 2023-06-03 at 12 00 35 AM

This one is very interesting in my opinion. My AdId is tagged as "idfa", which is the iOS name for the advertising ID, but then there is an additional "adid". This "adid" might be another advertising identifier! Additionally, between those two fields, there is also an "appid". I am not sure what this is used for, but it seems to be another unique identifier.

Newsbreak

Screenshot 2023-06-03 at 12 07 23 AM

Again nothing particularly interesting going on here.

PowerDirector

Screenshot 2023-06-03 at 12 11 04 AM

This app doesn't use the AdId at all! In fact, if you look carefully, it doesn't even use the internet much; less than 1% of its packets were http or http2, compared to 15% to 25% for most other apps! Probably not a good choice to test.

iHeartRadio

Screenshot 2023-06-03 at 12 14 42 AM

This one is rather interesting. Along with the AdId (sent as sessionId) there is a profileId and a deviceId. These may be used for tracking purposes.

Twitch

Screenshot 2023-06-03 at 12 20 04 AM Screenshot 2023-06-03 at 12 21 41 AM

Twitch is also interesting! It doesn't use the AdId but has a bunch of other unique identifiers. It also has a sessionId; it probably isn't something to worry about, but some apps send the AdId as sessionId, so I started wondering whether Twitch uses the sessionId for tracking.

The Weather Channel

Screenshot 2023-06-03 at 12 25 01 AM Screenshot 2023-06-03 at 12 26 07 AM

This app uses the AdId and also collects information about my device that may be used to uniquely identify me.
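Several of the captures above contain identifier-like fields besides the AdId. One rough way to surface them is to list the query parameters whose values are shaped like a UUID and flag which of them equal the known AdId. This is only a sketch: the URL and parameter names in the test data are hypothetical and do not reproduce the screenshots, and a UUID-shaped value is not by itself evidence of tracking.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Matches the 8-4-4-4-12 hexadecimal layout used by the AdId.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def uuid_like_params(url, known_adid=None):
    """Return {param_name: "adid" | "other"} for UUID-shaped query values."""
    found = {}
    for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if UUID_RE.match(value):
            is_adid = known_adid is not None and value.lower() == known_adid.lower()
            found[name] = "adid" if is_adid else "other"
    return found
```

Parameters labelled "other" are candidates for a closer look, for example to see whether their values stay the same after the AdId is deleted.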

Next Steps

This is a preliminary overview of what is going on. I will take another look at the data tomorrow to see if I missed anything. I will then delete my AdId and see what happens.

n-aggarwal commented 1 year ago

I have now deleted my AdId and run a new PCAPdroid capture for 6 of the apps:

  1. Cut the Rope
  2. Fox Sports
  3. The Weather Channel
  4. Newsbreak
  5. Amazon Shopping
  6. iHeartRadio

NOTE: I haven't force quit the apps yet.

I didn't capture data for Twitch or PowerDirector because even before I deleted the AdId, they weren't using it.

Cut the Rope

Screenshot 2023-06-05 at 4 16 44 PM

The AdId is now zeroed out as expected. Nothing too interesting going on here.

Fox Sports

Screenshot 2023-06-05 at 4 50 11 PM

The AdId is still being sent!! Interestingly enough, it is being stored as a cookie somehow; I am not sure how that works.

The Weather Channel

Screenshot 2023-06-05 at 7 26 51 PM

Again, the AdId is still being sent, and once again, it is being stored as a cookie somehow.

Newsbreak

Screenshot 2023-06-05 at 6 28 13 PM

Again, the adId is being sent!

Amazon Shopping

Screenshot 2023-06-05 at 4 34 04 PM

Amazon does a good job! Not only does it zero out the "idfa" but it also stops using the "adid"!

iHeartRadio

Screenshot 2023-06-05 at 7 09 52 PM

iHeartRadio does zero out the AdId, but notice that there are a bunch of other IDs that are still being sent! Although we can't know for sure, it is very possible that one of these is being used for ad tracking purposes.

Conclusions

It seems that even after the AdId is deleted, some apps still store it locally as a cookie and keep using it! This makes me wonder what would happen if I reset the AdId and then deleted it. Would that be better for user privacy? Furthermore, is this even permitted by the Google developer agreement?
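The per-app outcomes above fall into three cases: the old AdId is still sent, the AdId is zeroed out, or no AdId appears at all. A small sketch of that classification follows. It assumes each app's decrypted payloads are available as a list of strings (a hypothetical export format) and that a deleted AdId is reported in the all-zero UUID form.

```python
# Assumption: a deleted AdId is reported as the all-zero UUID.
ZEROED_ADID = "00000000-0000-0000-0000-000000000000"


def adid_status(payloads, adid):
    """Classify one app's capture as "still_sent", "zeroed", or "absent"."""
    text = "\n".join(payloads).lower()
    if adid.lower() in text:
        return "still_sent"  # the old ID takes priority over a zeroed field
    if ZEROED_ADID in text:
        return "zeroed"
    return "absent"


def summarize(captures, adid):
    """captures: {app_name: [payload_text, ...]} -> {app_name: status}."""
    return {app: adid_status(payloads, adid) for app, payloads in captures.items()}
```

Running summarize on the captures taken before and after force quitting the apps would show whether an app moves from "still_sent" to "zeroed" or "absent".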

n-aggarwal commented 1 year ago

Usage of Android Advertising ID

Google Play Services version 4.0 introduced new APIs and an ID for use by advertising and analytics providers. Terms for the use of this ID are below.

  • Usage. The Android advertising identifier (AAID) must only be used for advertising and user analytics. The status of the “Opt out of Interest-based Advertising” or “Opt out of Ads Personalization” setting must be verified on each access of the ID.
  • Association with personally-identifiable information or other identifiers.
      • Advertising use: The advertising identifier may not be connected to persistent device identifiers (for example: SSAID, MAC address, IMEI, etc.) for any advertising purpose. The advertising identifier may only be connected to personally-identifiable information with the explicit consent of the user.
      • Analytics use: The advertising identifier may not be connected to personally-identifiable information or associated with any persistent device identifier (for example: SSAID, MAC address, IMEI, etc.) for any analytics purpose. Please read the User Data policy for additional guidelines on persistent device identifiers.
  • Respecting users' selections.
      • If reset, a new advertising identifier must not be connected to a previous advertising identifier or data derived from a previous advertising identifier without the explicit consent of the user.
      • You must abide by a user’s “Opt out of Interest-based Advertising” or “Opt out of Ads Personalization” setting. If a user has enabled this setting, you may not use the advertising identifier for creating user profiles for advertising purposes or for targeting users with personalized advertising. Allowed activities include contextual advertising, frequency capping, conversion tracking, reporting and security and fraud detection.
      • On newer devices, when a user deletes the Android advertising identifier, the identifier will be removed. Any attempts to access the identifier will receive a string of zeros. A device without an advertising identifier must not be connected to data linked to or derived from a previous advertising identifier.
  • Transparency to users. The collection and use of the advertising identifier and commitment to these terms must be disclosed to users in a legally adequate privacy notification. To learn more about our privacy standards, please review our User Data policy.
  • Abiding by the terms of use. The advertising identifier may only be used in accordance with the Google Play Developer Program Policy, including by any party that you may share it with in the course of your business.

All apps uploaded or published to Google Play must use the advertising ID (when available on a device) in lieu of any other device identifiers for any advertising purposes.

Reference: https://support.google.com/googleplay/android-developer/answer/9857753

To me, it seems that these apps are violating at least one of the above conditions:

n-aggarwal commented 1 year ago

So, next I tried to force quit Newsbreak, Fox Sports, and The Weather Channel apps (the ones that were still using the adId), and after I did, the AdId was not in the capture for the Weather Channel, and Fox Sports, but unfortunately was still present in Newsbreak.

I will now "Get new Advertising ID" and see what happens with the Newsbreak app. EDIT- My new AdId is: 8046632e-8a16-4bda-9122-4ef48561f530

n-aggarwal commented 1 year ago

So Newsbreak does something interesting. It turns out that it stopped using my AdId for ad purposes as soon as I deleted it, but it continues using that string for other purposes. Here is a screenshot of the capture after I got the new AdId:

Screenshot 2023-06-05 at 11 07 44 PM

As can be seen above, my previous ID is still being used for something, but now the new AdId is also there as "aaid". If I go back to the previous capture, taken just after I deleted the AdId (no force quit), the "aaid" field is zeroed out.

Screenshot 2023-06-05 at 11 09 03 PM

As such, it seems that the old AdId is not being used for ad tracking purposes but for something else, although we cannot be sure of this.
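The comparison above boils down to asking which field of a request carries which ID. A minimal sketch of that check follows, assuming the request body is a flat JSON object. The field name some_other_field in the test data is hypothetical; only "aaid" comes from the capture, and the all-zero UUID form for a deleted ID is likewise an assumption.

```python
import json

# Assumption: a deleted AdId is reported as the all-zero UUID.
ZEROED_ADID = "00000000-0000-0000-0000-000000000000"


def label_id_fields(body_json, old_adid, new_adid):
    """Map each top-level string field to "old_adid", "new_adid", or "zeroed"."""
    labels = {}
    for key, value in json.loads(body_json).items():
        if not isinstance(value, str):
            continue  # only string fields can hold an AdId
        lowered = value.lower()
        if lowered == old_adid.lower():
            labels[key] = "old_adid"
        elif lowered == new_adid.lower():
            labels[key] = "new_adid"
        elif lowered == ZEROED_ADID:
            labels[key] = "zeroed"
    return labels
```

A result in which "aaid" is labelled "new_adid" or "zeroed" while another field is labelled "old_adid" matches the pattern described for Newsbreak, though it still does not reveal what that other field is used for.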

SebastianZimmeck commented 1 year ago

Great findings, @n-aggarwal!

n-aggarwal commented 1 year ago

The preliminary exploration is now complete; we have an idea of the sort of data we might see, and the changes that may or may not happen from deleting the adId. As such, I am closing this issue. If anything new comes up, I can reopen it if needed.