Closed henbos closed 1 year ago
@pes10k @alvestrand @vr000m @xingri @Diego-Perez-Botero
MC is currently up to the UA which in practise appears to be implemented as "always expose"
@drkron please comment if this was a fair characterization of MediaCapabilities HW exposure checks or if we do in fact take more steps to prevent leaking HW.
That seems like a fair characterization. From what I remember, the powerEfficient bit alone has not been considered to have privacy implications. The browser may in certain cases choose to set powerEfficient=true even though there's no HW support (e.g., low resolutions may be considered powerEfficient without HW support).
See https://w3c.github.io/webrtc-nv-use-cases/#game-streaming for the description of the use-case.
Note that attacks requiring a live connection, such as heuristics based on decode time, can be used, as Diego pointed out. So far we haven't seen such attacks, though, so I do not think we are gaining much by gating decoderFallback.
For cloud gaming the gamepad API could be taken into account as well but I assume this is the same single-digit percentage solution as microphone.
We should still take fullscreen (removed in #712) into account since it will prevent some fingerprinting, but maybe we need different levels of granularity?
Moving forward I'd rather go the opposite direction and have more information in encoderImplementation/decoderImplementation if getUserMedia et al give a strong signal.
For cloud gaming the gamepad API could be taken into account as well but I assume this is the same single-digit percentage solution as microphone.
That was suggested before and it would improve the situation, but I feel like it has the same problem as the GUM check: it restricts to certain devices... but you could have cloud gaming with a keyboard. It would seem rather arbitrary if you get good HW usage when you plug in a controller but not when you use a keyboard.
IsFullscreen is also problematic
I think going with a variation of option 2 in the original thread makes sense, i.e., if MediaCapabilities reveals the HW capabilities, we can piggyback on that information
Suggestion: If the MediaCapabilities already shows a defined value of PowerEfficient, then all the fingerprint value of PowerEfficientEncoder / Decoder is already out there. The PowerEfficientEncoder is only adding information about later events on the system, which is not that useful for fingerprinting. (The check should be against what MC would return if asked, not about whether or not it has been exposed.)
So instead of guarding this with HardwareAllowed, guard it with MediaCapabilities.PowerEfficient.
Makes sense to just reference MC.powerEfficient, the intent should be clear
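As a rough illustration of the guard suggested above, the stats could be filtered against what MC would return if asked. This is only a sketch under stated assumptions: `McAnswer`, `DecoderStats`, and `filterDecoderStats` are hypothetical names, and `null` is used here to model a UA that declines to give a meaningful powerEfficient answer, which is not something either spec defines.

```typescript
// Hypothetical shape of what MediaCapabilities would return if asked
// about the configuration currently being decoded.
interface McAnswer {
  supported: boolean;
  smooth: boolean;
  powerEfficient: boolean;
}

// Hypothetical subset of decoder-related stats discussed in this thread.
interface DecoderStats {
  framesDecoded: number;
  powerEfficientDecoder?: boolean;
}

// Assumption: `mcAnswer` is null when the UA would not reveal a
// meaningful powerEfficient value through MediaCapabilities.
// The check is against what MC *would* return, not against whether
// the page has actually queried MC.
function filterDecoderStats(
  raw: DecoderStats,
  mcAnswer: McAnswer | null,
): DecoderStats {
  const out: DecoderStats = { framesDecoded: raw.framesDecoded };
  if (mcAnswer !== null && raw.powerEfficientDecoder !== undefined) {
    out.powerEfficientDecoder = raw.powerEfficientDecoder;
  }
  return out;
}
```

With this shape, a UA that already answers MC queries passes `powerEfficientDecoder` through unchanged, and a UA that withholds the MC answer also withholds the stat, so the two APIs stay consistent.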
Hi all, I don't know how much the repos/specs overlap, so I wanted to point to my comment in the webrtc-stats spec that covers the same concerns as this issue
Am proposing to expose this info in WebCodecs via an event: https://github.com/w3c/webcodecs/pull/645
I see there are two parallel but related conversations going on. I've been mostly commenting on the PING/privacy review side of things at https://github.com/w3c/webrtc-stats/pull/732, but if it'd be helpful I'm also happy to discuss here too / instead
@pes10k Thank you for the response. I agree this might be a more appropriate place to discuss the alternatives. Regarding option 1, we previously had a discussion on https://github.com/w3c/webrtc-stats/pull/725.
Thank you for the link @xingri. Just to make sure I understand correctly, option 1 would leave powerEfficientDecoder as is (i.e. hardware check to access), but would add another property called decoderFallback that would not need a hardware check?
If decoderFallback is (always or almost always) just the opposite of powerEfficientDecoder, I'm not sure that's a privacy improvement.
I think the approach in the spec of "group the fingerprinting relevant inputs together and require some high-touch event before the page can load them" is probably the right way to go. If cloud gaming folks need a way to get access to these stats that isn't getUserMedia() related, figuring out how to define that high-touch event might be the best way forward (for my 2c).
Thank you for the link @xingri. Just to make sure I understand correctly, option 1 would leave powerEfficientDecoder as is (i.e. hardware check to access), but would add another property called decoderFallback that would not need a hardware check?
That is correct.
If decoderFallback is (always or almost always) just the opposite of powerEfficientDecoder, I'm not sure that's a privacy improvement.
Basically, I agree that the two can be considered equivalent. However, my understanding is that fingerprinting protection exists to prevent identification of the user agent. If that is correct, decoder fallback could not be used to identify the user agent, because fallback is only triggered by an abnormal condition on the system and can be revoked when the system recovers. Because of that characteristic, I do not think it can be used for fingerprinting, which gives it an advantage over the power efficiency metric.
I think the approach in the spec of "group the fingerprinting relevant inputs together and require some high-touch event before the page can load them" is probably the right way to go. If cloud gaming folks need a way to get access to these stats that isn't getUserMedia() related, figuring out how to define that high-touch event might be the best way forward (for my 2c).
Will investigate it.
@pes10k Could you please review my feedback on the decoderFallback approach from the FP protection perspective once again?
Also could you share a little more detail about how to define the "high-touch event"? I have been looking for the information but I couldn't find the exact details so far.
@pes10k Could you please review my feedback on the decoderFallback approach from the FP protection perspective once again?
Ah I see, can you say more about the kinds of cases that would cause decoderFallback to be true? I thought decoderFallback just meant "software renderer", which could be true for all sorts of persistent reasons (the device doesn't have relevant hardware, or the OS has been upgraded/changed and so the hardware isn't supported, etc.). Could you give some other examples of when decoderFallback would be true, and which would be temporary vs semi-persistent?
Also could you share a little more detail about how to define the "high-touch event"?
Sorry, I didn't mean to mislead you, this isn't a well defined term. I just mean "some high level of interaction or user intent". The most common examples are dialogs and permission prompts, but those aren't the only ones
@pes10k Thank you for the response. The transitions of this metric may look like the following:
FYI, in the case of Chromium, the fallback is triggered for the following reasons:
enum class RTCVideoDecoderFallbackReason {
  kSpatialLayers = 0,
  kConsecutivePendingBufferOverflow = 1,
  kReinitializationFailed = 2,
  kPreviousErrorOnDecode = 3,
  kPreviousErrorOnRegisterCallback = 4,
  kConsecutivePendingBufferOverflowDuringInit = 5,
  kMaxValue = kConsecutivePendingBufferOverflowDuringInit,
};
@pes10k Could you share your opinion on the decoderFallback metric approach? Can we move forward with option 1 from the FP perspective?
@pes10k This is a gentle reminder that I am waiting for your feedback. Please let me know if you need more clarification on the decoderFallback metric.
@xingri I'm not familiar enough with this part of the code to know what all of those enum options mean, but if this means that decoderFallback would only depend on things that happened on the current site (such as the decoder failing previously during the session), then I think that's a fine, A+ approach.
If, though, the decoderFallback value is shared across sites or across browser sessions, then I think the fingerprinting risk is possibly still here.
If it's the second case, then I'll have to read up more on the kinds of things that could cause decoderFallback to be true and get back to you (which I could do by Monday at the latest).
@pes10k Thank you so much for the feedback. Fortunately, decoderFallback only applies to the current site during the session and is not shared throughout the browser session.
@henbos Considering the feedback from @pes10k, could we move forward with option 1? FYI, I have rebased the change on https://github.com/w3c/webrtc-stats/pull/725.
If, though, the decoderFallback value is shared across sites or across browser sessions, then I think the fingerprinting risk is possibly still here.
decoderFallback only reflects the current site's decoder usage and as such does not expose whether other websites are experiencing fallback, only whether the current site is.
However, because one of the reasons that your decoder could fall back is that the HW is not available, if another website or application acquired the same HW before you did, this could result in decoderFallback when you try to instantiate your decoder (depending on HW limitations on the number of concurrent instances).
How concerning is that to you @pes10k ?
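The concern above can be illustrated with a toy model. This is a hedged sketch only: `HardwareDecoderPool` is a hypothetical name, and real platforms manage hardware decoder instances in ways that are not described in this thread. It merely shows how a limit on concurrent instances lets one site observe that someone else got there first.

```typescript
type DecoderKind = "hardware" | "software";

// Hypothetical model: a device with a fixed number of hardware decoder
// instances shared by every site and application on the machine.
class HardwareDecoderPool {
  private inUse = 0;
  private readonly maxInstances: number;

  constructor(maxInstances: number) {
    this.maxInstances = maxInstances;
  }

  // Returns the kind of decoder the caller ends up with.
  acquire(): DecoderKind {
    if (this.inUse < this.maxInstances) {
      this.inUse += 1;
      return "hardware";
    }
    return "software";
  }

  // Releases one previously acquired hardware instance.
  release(): void {
    if (this.inUse > 0) this.inUse -= 1;
  }
}

const pool = new HardwareDecoderPool(1);
const siteA = pool.acquire(); // "hardware"
const siteB = pool.acquire(); // "software": site A already holds the only instance
```

In this model, site B learns something about activity outside its own origin purely from which decoder it was given.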
However, because one of the reasons that your decoder could fall back is that the HW is not available, if another website or application acquired the same HW before you did, this could result in decoderFallback when you try to instantiate your decoder (depending on HW limitations on the number of concurrent instances).
@henbos Thank you for the feedback. In my proposal, decoder fallback is only triggered when the decoder starts with hardware acceleration. Unless the session starts with hardware decoding, decoder fallback will not be triggered.
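The rule described above could be sketched as a small state tracker. This is an illustration under assumptions, not the Chromium implementation: `DecoderFallbackTracker` and `onDecoderInitialized` are hypothetical names, and clearing the flag on a return to hardware reflects the earlier remark that fallback "can be revoked" when the system recovers.

```typescript
type DecoderKind = "hardware" | "software";

// Hypothetical per-session tracker for the decoderFallback metric.
class DecoderFallbackTracker {
  // null until the first decoder of the session has been initialized.
  private startedOnHardware: boolean | null = null;
  private fallback = false;

  // Call each time the session (re)initializes a decoder.
  onDecoderInitialized(kind: DecoderKind): void {
    if (this.startedOnHardware === null) {
      // First decoder: remember how the session started, never a fallback.
      this.startedOnHardware = kind === "hardware";
      return;
    }
    if (kind === "hardware") {
      // Recovery: back on hardware, so the fallback is revoked.
      this.fallback = false;
    } else if (this.startedOnHardware) {
      // Only a session that began on hardware can report a fallback.
      this.fallback = true;
    }
  }

  get decoderFallback(): boolean {
    return this.fallback;
  }
}
```

Under this rule, a session that never obtained hardware in the first place always reports `decoderFallback` as false, so the metric does not reveal whether the hardware was missing or already taken at startup.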
Clever!
This issue was discussed in WebRTC Feb 2023 Meeting (Issue #730)
Based on WG meetings where the desire to expose an event instead has been discussed, this is no longer being considered by the stats spec
After some back and forth, the current HW exposure check only considers contexts that have capture (getUserMedia). This obviously does not work for Cloud Gaming.
What's even more frustrating is that MediaCapabilities already exposes the HW capability information, but it does not expose what is currently used (e.g. powerEfficientDecoder or decoderFallback). The specs are inconsistent in how much they care about exposing this information. The MC privacy considerations section says that very little information is exposed, so it is not a big problem, but that a user agent might want to throttle or give vague answers, which is very much open to interpretation. Why would getStats() be more restrictive than MC?
Possible paths forward:
In reality though, because MC is currently up to the UA which in practise appears to be implemented as "always expose", doing 2) would basically turn off all fingerprinting protection from getStats(). But at least the two specs would be consistent and we would have a placeholder algorithm (in the MC spec) to update when necessary.
Let's discuss