Closed mnmkng closed 2 years ago
I guess this one is still valid and should be prioritized, given we want to enable fingerprints by default? Or has something changed? cc @petrpatek
Yes, it is. Maybe you guys can do it the same as in got-scraping?
Here's the cache: https://github.com/apify/browser-pool/blob/308c4ef0a7615b5ffdb4a019bd195774cb78ee59/src/fingerprinting/hooks.ts#L25
Since the fingerprint should be per-page, I think we need to merge createFingerprintPreLaunchHook into createPrePageCreateHook first.
Then, we should create a WeakMap here. The logic needs to be replaced with something like this:
const defaultToken = {}; // WeakMap doesn't accept Symbols yet
...
const token = pageOptions.sessionToken ?? defaultToken;
if (!weakCache.has(token)) {
    weakCache.set(token, fingerprintGenerator.getFingerprint...);
}
const fingerprint = weakCache.get(token);
I don't think there's a better way to pass the sessionToken other than via pageOptions. However, I'm open to other ideas, two heads are better than one :) We would need to update the pageOptions typings accordingly.
A Session needs to be passed to sessionToken.
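To make the token-based cache above concrete, here is a minimal runnable sketch. It is not the browser-pool implementation: the Fingerprint shape, generateFingerprint (a stand-in for the real generator call, which is elided above), getFingerprintFor and the session objects are all hypothetical names used for illustration.

```typescript
// Hypothetical fingerprint shape; the real type comes from the fingerprint generator.
type Fingerprint = { userAgent: string };

// Stand-in for the real generator call (assumption): returns a fresh object each time.
let generated = 0;
const generateFingerprint = (): Fingerprint => ({ userAgent: `agent-${++generated}` });

// Pages created without a sessionToken all share this one token.
const defaultToken: object = {};
const weakCache = new WeakMap<object, Fingerprint>();

// Returns the cached fingerprint for the page's token, generating it on first use.
function getFingerprintFor(pageOptions: { sessionToken?: object }): Fingerprint {
    const token = pageOptions.sessionToken ?? defaultToken;
    let fingerprint = weakCache.get(token);
    if (fingerprint === undefined) {
        fingerprint = generateFingerprint();
        weakCache.set(token, fingerprint);
    }
    return fingerprint;
}

// Usage: any object can act as the token, for example a Session instance.
const sessionA = { id: 'session-a' };
const sessionB = { id: 'session-b' };
const first = getFingerprintFor({ sessionToken: sessionA });
const again = getFingerprintFor({ sessionToken: sessionA });
const other = getFingerprintFor({ sessionToken: sessionB });
```

Because the WeakMap holds its keys weakly, a cached fingerprint becomes collectable once nothing else references the session object, so no explicit cleanup is needed in this sketch.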
This sounds like a rather simple fix to me (I'm probably missing something). createFingerprintPreLaunchHook works with launchContext, which - when enhanced - contains the current Session -> let's just use session.id as the fingerprintCache key?
This fixes @mnmkng's dynamic proxy server problem and lets the user manage the fingerprint usage via sessionPoolOptions.
...what is it that I am missing? :)
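For comparison, the session.id idea could look roughly like the sketch below. Only session.id and launchContext come from the comment above; SessionLike, LaunchContextLike, FingerprintStub, makeFingerprint, fingerprintForLaunch and forgetSession are hypothetical, and generating an uncached fingerprint when no session is present is an assumption of this sketch, not confirmed behaviour.

```typescript
// Hypothetical minimal shapes; only `id` is taken from the discussion.
interface SessionLike { id: string }
interface LaunchContextLike { session?: SessionLike }
interface FingerprintStub { userAgent: string }

// Stand-in for the real generator call (assumption).
let counter = 0;
const makeFingerprint = (): FingerprintStub => ({ userAgent: `agent-${++counter}` });

// Cache keyed by the session id string.
const fingerprintCache = new Map<string, FingerprintStub>();

function fingerprintForLaunch(launchContext: LaunchContextLike): FingerprintStub {
    const sessionId = launchContext.session?.id;
    // Assumption: without a session there is no key, so nothing is cached.
    if (sessionId === undefined) return makeFingerprint();
    let fingerprint = fingerprintCache.get(sessionId);
    if (fingerprint === undefined) {
        fingerprint = makeFingerprint();
        fingerprintCache.set(sessionId, fingerprint);
    }
    return fingerprint;
}

// A string-keyed Map is not cleaned up with the session object,
// so retired session ids need explicit eviction.
function forgetSession(sessionId: string): boolean {
    return fingerprintCache.delete(sessionId);
}

// Usage: launches with the same session id share a fingerprint.
const launchOne = fingerprintForLaunch({ session: { id: 's1' } });
const launchTwo = fingerprintForLaunch({ session: { id: 's1' } });
const launchThree = fingerprintForLaunch({ session: { id: 's2' } });
```

One trade-off visible here: unlike a WeakMap keyed by the session object, entries in a string-keyed Map stay alive until they are removed explicitly.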
I'm not sure if it's the best solution. Right now, when I use the default generic Apify proxy URL, it will always give me the same fingerprint, but the IPs will auto-rotate. It also gives the user no control over the use of the fingerprints. I like the got-scraping sessionToken solution better and I think we should do it this way before we make fingerprints default in the SDK.