apple / password-manager-resources

A place for creators and users of password managers to collaborate on resources to make password management better.
MIT License

Using Google's PasswordRequirementsSpec API #427

Open m33x opened 3 years ago

m33x commented 3 years ago

Google's PasswordRequirementsSpec API:

In June 2018, Google started to develop an (in theory privacy-protecting) password requirements specification API for Chrome's built-in password generator.

Essentially, the API returns something similar to the Password Rules (password-rules.json) quirk from this project.

As of February 2021, the API includes the password requirements for 237 websites (Feb. 2023: 246). Two examples can be found at the end of this post.

The API returns the data as Protocol Buffers (protobuf).

Below you can find a step-by-step guide on how to parse the response of the API with Python 3.

Step 1 - Create a Directory

$> mkdir GoogleAPI
$> cd GoogleAPI

Step 2 - Query the API

The API URL can be found here.

We download all entries via:

$> curl https://www.gstatic.com/chrome/autofill/password_generation_specs/1/0000 -o 0000.pb
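If you would rather stay in Python for this step, the download can be sketched with the standard library. The helper names `shard_url` and `download_shard` are made up for this illustration. Only the URL pattern is taken from the curl command above; treating `1` as a fixed version segment and the last path component as exactly four lowercase hex characters are assumptions.

```python
import urllib.request

# URL pattern copied from the curl command; "1" is assumed to be a version segment.
BASE_URL = "https://www.gstatic.com/chrome/autofill/password_generation_specs/1/"


def shard_url(hash_prefix="0000"):
    """Build the URL of one shard from its hex prefix (hypothetical helper)."""
    # Assumption: shard names are four lowercase hex characters, like "0000".
    if len(hash_prefix) != 4 or any(c not in "0123456789abcdef" for c in hash_prefix):
        raise ValueError("expected four lowercase hex characters, e.g. '0000'")
    return BASE_URL + hash_prefix


def download_shard(hash_prefix="0000"):
    """Download one shard and store it as <prefix>.pb (hypothetical helper)."""
    filename = hash_prefix + ".pb"
    with urllib.request.urlopen(shard_url(hash_prefix)) as response:
        data = response.read()
    with open(filename, "wb") as f:
        f.write(data)
    return filename
```

Calling `download_shard()` should leave a `0000.pb` file in the current directory, the same file the curl command produces.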

Note: This part will change in the future. Once Google implements/activates their privacy protection, this step will likely need to be revised. Read more here.

Step 3 - Download protoc Compiler

We download and install the binary protoc from GitHub.

On macOS 13 Ventura it looks similar to this (you will need to adjust this to your OS):

$> curl -L https://github.com/protocolbuffers/protobuf/releases/download/v22.0/protoc-22.0-osx-x86_64.zip -o protoc-22.0-osx-x86_64.zip
$> unzip protoc-22.0-osx-x86_64.zip
$> mv bin/protoc protoc
$> rm -rf bin include readme.txt protoc-22.0-osx-x86_64.zip

Step 4 - Download .proto Schema Files

We download the .proto files which define the protocol format of the 0000.pb file that we downloaded from the API in Step 2.

Download password_requirements.proto and password_requirements_shard.proto
$> curl -s "https://chromium.googlesource.com/chromium/src/+/refs/heads/main/components/autofill/core/browser/proto/password_requirements.proto?format=TEXT" | base64 --decode > password_requirements.proto
$> curl -s "https://chromium.googlesource.com/chromium/src/+/refs/heads/main/components/autofill/core/browser/proto/password_requirements_shard.proto?format=TEXT" | base64 --decode > password_requirements_shard.proto

Step 5 - Compile Schema Files

Next, we compile the downloaded schema files so we can use them in Python 3.

Compile password_requirements.proto to password_requirements_pb2.py and password_requirements_shard.proto to password_requirements_shard_pb2.py
$> ./protoc password_requirements.proto --python_out ./ --proto_path ./
$> ./protoc password_requirements_shard.proto --python_out ./ --proto_path ./

Step 6 - Prepare Python 3

We need to install protobuf for Python 3.

$> python3 -m venv venv3
$> source ./venv3/bin/activate
(venv3) $> pip install protobuf

Step 7 - Parse the Response

Finally, we can parse the content with a small script called parse.py that looks similar to this:

# We need sys to stop with an error message
import sys

# We import the compiled schema module password_requirements_shard_pb2.py
import password_requirements_shard_pb2

# We instantiate the password requirements shard class
shard = password_requirements_shard_pb2.PasswordRequirementsShard()

# We read and parse the downloaded protobuf file
try:
    with open("0000.pb", "rb") as f:
        shard.ParseFromString(f.read())
except IOError:
    # Without the file there is nothing to display, so we stop here
    sys.exit("Could not find file. Wrong directory or filename?")

# We iterate over the entries (keyed by domain) and display some data
for domain in shard.specs:
    print('Entry with domain: "{}"'.format(domain))
    print("Data for this entry:")
    print(shard.specs[domain])
    print("############")

You need to run it like this:

(venv3) $> python3 parse.py

The resulting entries look similar to this:

############
Entry with domain: 'equifax.com'
Data for this entry:
priority: 20
symbols {
  character_set: "!@$*+-"
  min: 1
  max: 15
}

############
Entry with domain: 'bankofthewest.com'
Data for this entry:
priority: 20
symbols {
  character_set: "!@#$%"
  min: 1
  max: 15
}

...

Step 8 - Convert PasswordRequirementsSpec to Password Rules Language (Missing)

Next, we need someone who is able to write a converter from Google's PasswordRequirementsSpec to Apple's Password Rules Language. Anyone?
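As a possible starting point for Step 8, here is a minimal sketch of such a converter. It works on a plain dict instead of the protobuf object, and it rests on several assumptions: only `priority` and `symbols` (with `character_set`, `min`, `max`) appear in the output above, so the field names `min_length`, `max_length`, `lower_case`, `upper_case` and `numeric` should be checked against password_requirements.proto; reading `min >= 1` as "required", `min == 0` as "allowed" and `max == 0` as "forbidden" is my interpretation, not something confirmed here. Any restricted character sets for letters and digits are ignored in favour of the named classes.

```python
# Password Rules Language names for the non-symbol character classes.
# Every field name here except "symbols" is an assumption (see above).
NAMED_CLASSES = {"lower_case": "lower", "upper_case": "upper", "numeric": "digit"}


def custom_class(chars):
    """Format characters as a custom class, e.g. '!@$*+-' becomes '[-!@$*+]'."""
    unique = "".join(dict.fromkeys(chars))  # drop duplicates, keep order
    body = "".join(c for c in unique if c not in "-]")
    # In a custom class, "-" has to come first and "]" last.
    if "-" in unique:
        body = "-" + body
    if "]" in unique:
        body += "]"
    return "[" + body + "]"


def spec_to_password_rules(spec):
    """Convert a dict shaped like a PasswordRequirementsSpec into a rules string."""
    required, allowed = [], []
    for field in ("lower_case", "upper_case", "numeric", "symbols"):
        char_class = spec.get(field)
        if char_class is None or char_class.get("max", 1) == 0:
            continue  # class not mentioned, or assumed to be forbidden
        if field == "symbols":
            if not char_class.get("character_set"):
                continue  # no concrete symbols to list
            name = custom_class(char_class["character_set"])
        else:
            name = NAMED_CLASSES[field]
        # Assumption: min >= 1 means "required", min == 0 means "allowed".
        (required if char_class.get("min", 0) >= 1 else allowed).append(name)

    rules = []
    if "min_length" in spec:
        rules.append("minlength: {};".format(spec["min_length"]))
    if "max_length" in spec:
        rules.append("maxlength: {};".format(spec["max_length"]))
    rules.extend("required: {};".format(name) for name in required)
    if allowed:
        rules.append("allowed: {};".format(", ".join(allowed)))
    return " ".join(rules)


equifax = {"priority": 20, "symbols": {"character_set": "!@$*+-", "min": 1, "max": 15}}
print(spec_to_password_rules(equifax))  # required: [-!@$*+];
```

Note that the per-class `max` values (such as 15 for equifax.com) have no direct counterpart in this sketch and are dropped, so a real converter would need to decide how to handle them.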

peterstory commented 1 year ago

Any thoughts on how to retrieve password rules for more than just the ~200 or so websites included in the 0000.pb shard? After reading the documentation, my understanding is that there are many shards, and that a shard containing password data for a particular domain can be identified based on the MD5 of that domain: https://chromium.googlesource.com/chromium/src/+/refs/heads/master/components/password_manager/core/browser/generation/password_requirements_spec_fetcher_unittest.cc#80

However, when I try to load other shards, I always get a 404. Here's an example, based on those unit tests:

curl \
  -H "Origin: https://www.example.com" \
  https://www.gstatic.com/chrome/autofill/password_generation_specs/1/5aba \
  -o 5aba.pb

Perhaps Google rate-limits access to this API, and an API key needs to be included?

m33x commented 1 year ago

Experienced the same. Imho the fastest solution is to build Chrome yourself. I know, at first it sounds terrifying, but if you do it in a VM (that you can delete afterwards) it should help you debug the problem quickly. Honestly, I think that is the fastest solution to debug what is internally happening.

peterstory commented 1 year ago

Is it still possible to use Google Sync services with a self-compiled version of Chrome? https://blog.chromium.org/2021/01/limiting-private-api-availability-in.html

If I compile my own version of Chrome, what would be the most efficient way to determine how to use the API? Maybe disabling certificate pinning, and intercepting the traffic to see what the HTTP calls look like?

m33x commented 1 year ago

No, you do not involve the network. Instead, you simply use printf or whatever they use as a proxy for that (probably some logger class). Wasn't aware of the Google Sync services limitations, but I would assume they are not involved. Anyway, spending 1-2h compiling Chrome should be a worthwhile experiment going forward.

peterstory commented 1 year ago

I managed to compile Chrome, and have had success adding debug logging. However, I also discovered that the pre-compiled version of Chrome includes related logging. For example, running Chrome with logging enabled on the CLI, and searching for mentions of the relevant URL:

> Google\ Chrome.app/Contents/MacOS/Google\ Chrome --enable-logging=stderr --v=1 2>&1 | grep -C 3 password_generation_specs
[93695:259:0216/114541.208468:VERBOSE1:dispatcher.cc(451)] Num tracked contexts: 5
[93673:259:0216/114541.218932:VERBOSE1:password_requirements_service.cc(72)] PasswordRequirementsService::PrefetchSpec(https://www.rei.com/)
[93673:259:0216/114541.218945:VERBOSE1:password_requirements_spec_fetcher_impl.cc(98)] Fetching password requirements spec for https://www.rei.com/
[93685:12035:0216/114541.219060:VERBOSE1:network_delegate.cc(34)] NetworkDelegate::NotifyBeforeURLRequest: https://www.gstatic.com/chrome/autofill/password_generation_specs/1/0000
[93673:259:0216/114541.219134:VERBOSE1:mutable_profile_oauth2_token_service_delegate.cc(263)] MutablePO2TS::RefreshTokenIsAvailable
[93673:259:0216/114541.219189:VERBOSE1:mutable_profile_oauth2_token_service_delegate.cc(263)] MutablePO2TS::RefreshTokenIsAvailable
[93673:259:0216/114541.219279:VERBOSE1:autofill_manager.cc(119)] Parsed forms:
--
[93673:259:0216/114621.048272:VERBOSE1:password_requirements_spec_fetcher_impl.cc(98)] Fetching password requirements spec for https://www.swagbucks.com/
[93673:259:0216/114621.048347:INFO:CONSOLE(0)] "[DOM] Input elements should have autocomplete attributes (suggested: "new-password"): (More info: https://goo.gl/9p2vKq) %o", source: https://www.swagbucks.com/ (0)
[93673:259:0216/114621.048359:INFO:CONSOLE(0)] "[DOM] Input elements should have autocomplete attributes (suggested: "new-password"): (More info: https://goo.gl/9p2vKq) %o", source: https://www.swagbucks.com/ (0)
[93685:12035:0216/114621.048489:VERBOSE1:network_delegate.cc(34)] NetworkDelegate::NotifyBeforeURLRequest: https://www.gstatic.com/chrome/autofill/password_generation_specs/1/0000
[93673:259:0216/114621.048573:VERBOSE1:mutable_profile_oauth2_token_service_delegate.cc(263)] MutablePO2TS::RefreshTokenIsAvailable
[93673:259:0216/114621.049868:VERBOSE1:field_candidates.cc(39)] type: 9 score: 1.4
[93673:259:0216/114621.049882:VERBOSE1:field_candidates.cc(39)] type: 9 score: 1.4

This suggests that the same URL is being used for all password spec requests! I'll have to do some more digging to figure out why this is happening.
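To check this over a longer browsing session, the log can be tallied with a few lines of Python. This is only a sketch: the two regular expressions are derived from the log lines above, and the function name `summarize` is made up for this example.

```python
import re
from collections import Counter

# Both patterns are modelled on the log lines shown above.
FETCH_RE = re.compile(r"Fetching password requirements spec for (\S+)")
SHARD_RE = re.compile(
    r"NotifyBeforeURLRequest: \S*/password_generation_specs/\d+/([0-9a-f]+)"
)


def summarize(log_lines):
    """Return the origins that triggered a fetch and a count per shard name."""
    origins, shards = [], Counter()
    for line in log_lines:
        fetch = FETCH_RE.search(line)
        if fetch:
            origins.append(fetch.group(1))
            continue
        shard = SHARD_RE.search(line)
        if shard:
            shards[shard.group(1)] += 1
    return origins, shards
```

Piping Chrome's stderr into a script that calls `summarize(sys.stdin)` would show whether any shard other than 0000 is ever requested.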

m33x commented 1 year ago

You are awesome.

This is what I feared. So yes, in theory they have a privacy preserving API, but it is currently not used at all. This also means the API only covers 200-300ish domains worldwide. Sad.

Great work! Thank you!

peterstory commented 1 year ago

The strange thing is, for a research project I've done extensive testing of Chrome on dozens of different websites, and Chrome often customizes its password suggestions to different websites' requirements. So I think there is something more going on. I wonder if Chrome is hiding some details in HTTP headers – I'll add some more logging to try figuring it out!

peterstory commented 1 year ago

I've made an interesting discovery: Chrome can get password specs from (at least) two places. As we found, the domain-level specs seem underutilized. However, specs can also come from an autofill data API.

A password generated on rue21.com includes symbols, yet the https://www.gstatic.com/chrome/autofill/password_generation_specs/1/0000 API is used, and it doesn't contain the spec. Running Chrome with logging gives some clues:

[VERBOSE1:password_requirements_service.cc(155)] PasswordGenerationRequirements parameters: 1, 0, 5000 ms

[VERBOSE1:password_requirements_service.cc(72)] PasswordRequirementsService::PrefetchSpec(https://www.rue21.com/)
[VERBOSE1:password_requirements_spec_fetcher_impl.cc(98)] Fetching password requirements spec for https://www.rue21.com/
[VERBOSE1:network_delegate.cc(34)] NetworkDelegate::NotifyBeforeURLRequest: https://content-autofill.googleapis.com/v1/pages/ChVDaHJvbWUvMTEwLjAuNTQ4MS4xMDASLAlaI8PZ-GDRlhIFDWtomm4SBQ1Pnif4EgUNg6hbPRIFDc5BTHoSBQ1z0P09?alt=proto
[INFO:CONSOLE(0)] "[DOM] Input elements should have autocomplete attributes (suggested: "username"): (More info: https://goo.gl/9p2vKq) %o", source: https://www.rue21.com/store/ (0)
[VERBOSE1:network_delegate.cc(34)] NetworkDelegate::NotifyBeforeURLRequest: https://www.gstatic.com/chrome/autofill/password_generation_specs/1/0000
[VERBOSE1:form_structure.cc(527)] Autofill query response from API was successfully parsed: 

[VERBOSE1:password_requirements_service.cc(108)] PasswordRequirementsService::AddSpec(10867573997743317850, 2051817934, {priority: 10, symbols: {character_set: "!@$#.*_-?", min: 1, max: 4294967295, }, })

[VERBOSE1:password_requirements_spec_fetcher_impl.cc(257)] Found no entry for rue21.com

[VERBOSE1:password_requirements_service.cc(64)] PasswordRequirementsService::GetSpec(https://www.rue21.com/, 10867573997743317850, 2051817934) = {priority: 10, symbols: {character_set: "!@$#.*_-?", min: 1, max: 4294967295, }, }

Note the call to the https://content-autofill.googleapis.com/v1 API, and use of the AddSpec method.

Searching for calls to AddSpec reveals this method: https://chromium.googlesource.com/chromium/src/+/refs/heads/main/components/password_manager/core/browser/password_generation_frame_helper.cc#58

Searching for the API URL shows how autofill data is requested: https://chromium.googlesource.com/chromium/src/+/refs/heads/main/components/autofill/core/browser/autofill_download_manager.cc

This protobuf definition describes the format of the autofill data, which can include password specs: https://chromium.googlesource.com/chromium/src/+/refs/heads/main/components/autofill/core/browser/proto/api_v1.proto

My intuition is that having to access the password specs through the autofill API may make scraping the data more challenging, but it definitely warrants further investigation.
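Until the autofill API itself is understood, one low-tech way to collect specs is to read them back out of the AddSpec/GetSpec log lines. The sketch below assumes that every character class is logged in the same shape as `symbols` in the lines above, and that character sets never contain a double quote; `parse_logged_spec` is a made-up name.

```python
import re

# Patterns modelled on the AddSpec/GetSpec log lines shown above.
PRIORITY_RE = re.compile(r"\{priority: (\d+)")
CLASS_RE = re.compile(r'(\w+): \{character_set: "([^"]*)", min: (\d+), max: (\d+)')


def parse_logged_spec(line):
    """Extract the priority and character classes from one log line."""
    spec = {}
    priority = PRIORITY_RE.search(line)
    if priority:
        spec["priority"] = int(priority.group(1))
    for name, chars, lowest, highest in CLASS_RE.findall(line):
        spec[name] = {
            "character_set": chars,
            "min": int(lowest),
            "max": int(highest),
        }
    return spec
```

The `max` of 4294967295 in the rue21.com entry equals 2**32 - 1, the largest unsigned 32-bit value, which presumably stands for "no upper limit".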

m33x commented 1 year ago

I did what I suggested to do. Compiling Chrome took 4h on my spare machine, but anyway. :-) You can printf anything like this:

VLOG(1) << "- MAXTESTING - origin: " << origin;

Then I started Chrome and captured the log like this:

out/Default/chrome --enable-logging=stderr --v=1 2>&1 | tee /media/xubuntu/DATA/log.txt

The problem is the prefix_length, which is configured by the Chrome engineers to be always 0.

prefix_length is set to 0 here: https://chromium.googlesource.com/chromium/src/+/refs/heads/main/components/password_manager/core/browser/password_requirements_service.cc#131

It has the chance to be overwritten by so-called "FieldTrialParams" here: https://chromium.googlesource.com/chromium/src/+/refs/heads/main/components/password_manager/core/browser/password_requirements_service.cc#148

A new PasswordRequirementsSpecFetcherImpl is then instantiated with these parameters (logged as "password_requirements_service.cc(155)] PasswordGenerationRequirements parameters: 1, 0, 5000 ms") here: https://chromium.googlesource.com/chromium/src/+/refs/heads/main/components/password_manager/core/browser/password_requirements_service.cc#155

GetHashPrefix() always returns 0000 if prefix_length is 0: https://chromium.googlesource.com/chromium/src/+/refs/heads/main/components/password_manager/core/browser/generation/password_requirements_spec_fetcher_impl.cc#60

password_requirements_service.cc(155)] PasswordGenerationRequirements parameters: 1, 0, 5000 ms
password_requirements_spec_fetcher_impl.cc(37)]  - MAXTESTING - prefix_length: 0
password_requirements_spec_fetcher_impl.cc(38)]  - MAXTESTING - prefix_length_: 0

password_requirements_service.cc(72)] PasswordRequirementsService::PrefetchSpec(https://www.rue21.com/)

password_requirements_spec_fetcher_impl.cc(109)] Fetching password requirements spec for https://www.rue21.com/
password_requirements_spec_fetcher_impl.cc(136)] - MAXTESTING - origin: https://www.rue21.com/
password_requirements_spec_fetcher_impl.cc(137)] - MAXTESTING - prefix_length_: 0

password_requirements_spec_fetcher_impl.cc(65)]  - MAXTESTING - domain_and_registry: rue21.com
password_requirements_spec_fetcher_impl.cc(67)]  - MAXTESTING - domain_and_registry.data(): rue21.com
password_requirements_spec_fetcher_impl.cc(68)]  - MAXTESTING - domain_and_registry.size(): 9
password_requirements_spec_fetcher_impl.cc(70)]  - MAXTESTING - origin: https://www.rue21.com/
password_requirements_spec_fetcher_impl.cc(71)]  - MAXTESTING - prefix_length: 0
password_requirements_spec_fetcher_impl.cc(72)]  - MAXTESTING - domain_and_registry.data(): rue21.com
password_requirements_spec_fetcher_impl.cc(73)]  - MAXTESTING - domain_and_registry.size(): 9

# Loop
password_requirements_spec_fetcher_impl.cc(75)]  - MAXTESTING - digest.a: P?-?̴?0x14T1@X?Y?0x10D?o0x020x7f
password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: P
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 00d52d9912ccb43f14543140588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: ?
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 00002d9912ccb43f14543140588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: -
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 0000009912ccb43f14543140588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: ?
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 0000000012ccb43f14543140588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: 0x12
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 0000000000ccb43f14543140588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: ?
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 000000000000b43f14543140588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: ?
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 000000000000003f14543140588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: ?
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 000000000000000014543140588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: 0x14
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 000000000000000000543140588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: T
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 000000000000000000003140588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: 1
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 000000000000000000000040588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: @
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 000000000000000000000000588b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: X
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 000000000000000000000000008b59b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: ?
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 000000000000000000000000000059b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: Y
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 000000000000000000000000000000b6

password_requirements_spec_fetcher_impl.cc(76)]  - MAXTESTING - byte-before: ?
password_requirements_spec_fetcher_impl.cc(86)]  - MAXTESTING - byte-after: 0x00
password_requirements_spec_fetcher_impl.cc(88)]  - MAXTESTING - MD5DigestToBase16(digest)-temp: 00000000000000000000000000000000
password_requirements_spec_fetcher_impl.cc(91)]  - MAXTESTING - MD5DigestToBase16(digest): 00000000000000000000000000000000
password_requirements_spec_fetcher_impl.cc(92)]  - MAXTESTING - MD5DigestToBase16(digest).substr(): 0000

password_requirements_spec_fetcher_impl.cc(139)] - MAXTESTING - hash_prefix: 0000
network_delegate.cc(35)] NetworkDelegate::NotifyBeforeURLRequest: https://www.gstatic.com/chrome/autofill/password_generation_specs/1/0000

In contrast, if you set prefix_length to 32, the hash_prefix is as intended:

password_requirements_spec_fetcher_impl.cc(141)] - MAXTESTING -  hash_prefix: 50d5
network_delegate.cc(35)] NetworkDelegate::NotifyBeforeURLRequest: https://www.gstatic.com/chrome/autofill/password_generation_specs/1/50d5
password_requirements_spec_fetcher_impl.cc(216)] Fetch for 50d5: failed to fetch. Net Error: net::ERR_HTTP_RESPONSE_CODE_FAILURE
password_requirements_service.cc(98)] PasswordRequirementsService::OnFetchedRequirements(https://www.rue21.com/, {})

So to summarize, the privacy-preserving aspect of the API is there, but currently (on purpose) not activated. Thus, by default all websites end up in the default bucket 0000, which currently (as of Feb. 2023) contains 246 domains.
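The masking seen in the log can be approximated in a few lines of Python. This is a sketch reconstructed from the log output, not a copy of Chrome's code: it assumes the input is already the registrable domain (the domain_and_registry value), that the leading prefix_length bits of the MD5 digest are kept and the rest zeroed, and that the shard name is the first four hex characters. The function name `hash_prefix` is my own.

```python
import hashlib


def hash_prefix(domain, prefix_length):
    """Keep the first prefix_length bits of MD5(domain), return 4 hex chars."""
    digest = bytearray(hashlib.md5(domain.encode("utf-8")).digest())
    for i in range(len(digest)):
        # Number of leading bits of byte i that fall inside the prefix.
        bits_to_keep = min(8, max(0, prefix_length - 8 * i))
        mask = (0xFF << (8 - bits_to_keep)) & 0xFF
        digest[i] &= mask
    return digest.hex()[:4]


print(hash_prefix("rue21.com", 0))  # 0000
```

With a prefix_length of 0 every byte is zeroed, so every domain maps to 0000. If the sketch matches Chrome's behaviour, `hash_prefix("rue21.com", 32)` would give the 50d5 seen in the log.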

peterstory commented 1 year ago

Yes, all websites get the same bucket from that API: https://www.gstatic.com/chrome/autofill/password_generation_specs/1/0000

"...but they were all of them deceived, for another [API] was made:" https://content-autofill.googleapis.com/v1/...

And that second API does return customized password generation rules for many websites. For example, for rue21.com:

{priority: 10, symbols: {character_set: "!@$#.*_-?", min: 1, max: 4294967295, }, }

A further mystery: when testing using Chromium, there were only calls to the first API. When testing using Chrome, there were calls to both APIs.