Closed wareid closed 2 years ago
The obfuscation section contains no requirements on reading systems.
Reading system requirements are specified in the Reading Systems specification. In this case, they are required to reverse the process to deobfuscate: https://w3c.github.io/epub-specs/epub33/rs/#sec-container-res-obfus
Yes, I did eventually figure that out! Thanks. I'm not clear on whether the user agent/reading system is supposed to not provide the de-obfuscated file to the user, or if that's just a requirement that comes externally from the vendor or the DRM provider and not part of the spec.
Why does the creation of the obfuscation key based on the SHA-1 hash function include a SHOULD requirement rather than a MUST? This relaxation seems primarily to decrease interoperability.
This looks like a bad porting of the original algorithm specified in: http://idpf.org/epub/20/spec/FontManglingSpec_2.0.1_draft.htm
That document is contradictory on this point (sigh), but it says in the Obfuscation Algorithm section:
The key for the algorithm must be a 20 byte (160 bit) SHA-1 digest[3] of the publication's unique identifier.
The "should" comes later but it doesn't make sense how it couldn't also be a must.
It appears that when it was integrated in the original 3.0 revision, the "must" was dropped (it's no longer limited to 20 bytes) and the later "should" was retained. But I agree that makes no sense, since if you can't know how to create the key, you can't know how to deobfuscate.
Assuming we keep the section, it definitely needs correcting.
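For reference, the key derivation under discussion can be sketched in a few lines of Java. This is a minimal sketch, not the normative algorithm: the whitespace stripping (space, tab, CR, LF) and the UTF-8 encoding are taken from my reading of the EPUB 3 obfuscation section rather than from this thread, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the names are illustrative and not from any spec or library.
final class ObfuscationKey {

    // Derives a 20-byte key from the publication's unique identifier.
    // Assumption: whitespace (space, tab, CR, LF) is removed before hashing,
    // and the result is hashed as UTF-8.
    static byte[] derive(String uniqueIdentifier) {
        String stripped = uniqueIdentifier.replaceAll("[ \\t\\r\\n]", "");
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            return sha1.digest(stripped.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a mandatory algorithm on the Java platform.
            throw new IllegalStateException("SHA-1 unavailable", e);
        }
    }

    // Lowercase hex rendering, only for inspecting the key.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

If the digest step is only a SHOULD, two implementations could disagree at the `MessageDigest` line and produce keys that cannot deobfuscate each other's output, which is the interoperability concern raised above.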
I'm not clear on whether the user agent/reading system is supposed to not provide the de-obfuscated file to the user
The reading system will deobfuscate and use the font, but I'd assume most apps, at least, do that in memory and don't write it out to disk, if that's what you're asking. (I don't write reading systems, so that's just my understanding based on how other resources have been stated to be handled; maybe someone else will correct me.) The user typically isn't going to have any way from within the reading system to access the deobfuscated source regardless of how it's done, though, just as they can't access any other resources.
If the reading system is running in a browser, my guess would be you might be able to access the font (but maybe only as a blob url?). I don't think preventing access completely is realistic in this situation.
But obfuscation was always meant to be trivial, so the requirements have been pretty thin. It's tacitly understood that if the user wants at the font, and they have access to the epub file, they'll be able to get it.
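To illustrate how thin the mechanism is, here is a minimal in-memory sketch in Java. It assumes, from my reading of the EPUB 3 algorithm and not from this thread, that only the first 1040 bytes of the resource are XORed with the repeating key; the class and method names are hypothetical. Because XOR is its own inverse, the same method obfuscates and deobfuscates, and nothing is written to disk.

```java
import java.util.Arrays;

// Hypothetical helper; the names are illustrative and not from any spec or library.
final class FontObfuscation {

    // Assumption: only this many leading bytes are affected.
    static final int OBFUSCATED_PREFIX = 1040;

    // Returns a copy of data with the leading bytes XORed against the repeating key.
    // Applying the method twice with the same key yields the original bytes.
    static byte[] apply(byte[] data, byte[] key) {
        if (key.length == 0) {
            throw new IllegalArgumentException("key must not be empty");
        }
        byte[] out = Arrays.copyOf(data, data.length);
        int limit = Math.min(OBFUSCATED_PREFIX, out.length);
        for (int i = 0; i < limit; i++) {
            out[i] ^= key[i % key.length];
        }
        return out;
    }
}
```

A reading system following the in-memory approach described above would pass the resulting byte array straight to its font loader instead of saving it.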
The issue was discussed in a meeting on 2021-10-26
The issue was discussed in a meeting on 2021-10-28
If the obfuscation is trivial, then I'm not sure what value it's providing. If it's there for historical or backcompat only, it could be deprecated and warnings put in place to minimize any future harm.
Perhaps the purpose (as was mentioned at the joint group call) is to make it easier to sue or threaten criminal consequences for anyone who writes a simple script to de-obfuscate or any vendor that implements a Reading System that happens to save the file to disk. If so, that seems inconsistent with ethical Web principles.
Also, if the purpose is to enhance legal threats, we should probably document that risk somewhere: I don't want someone getting sued because they implemented -- or wrote tests for! -- the Reading System specification.
If the obfuscation is trivial, then I'm not sure what value it's providing
Trivial is still going to block most non-technical self publishers from being able to take the font and drop it in their own book, for example. (I'd hope professional publishers would know better.) It provides a measure of defence for the font vendor.
I don't think it matters much legally whether you took an unobfuscated font directly or you figured out how to reverse the obfuscation, but IANAL. You're violating the font licensing agreement by reusing it without paying for it.
Whether the user agent assumes any risk by allowing access to the unobfuscated version isn't something I can answer, either. The theft isn't in deobfuscating but in reusing without a license, so I would expect not.
There's been some discussion about the origins of this and whether it's still needed in the group's email list starting here: https://lists.w3.org/Archives/Public/public-epub-wg/2021Oct/0025.html
As I noted in the meeting minutes above, I think it's still "needed" because most (commercial) EPUBs are exported from InDesign and Adobe cares a lot about font copyright. I would guess that Adobe would not agree to removing obfuscation, and practically speaking we would need them to.
I don't think it matters much legally whether you took an unobfuscated font directly or you figured out how to reverse the obfuscation, but IANAL. You're violating the font licensing agreement by reusing it without paying for it.
Whether the user agent assumes any risk by allowing access to the unobfuscated version isn't something I can answer, either. The theft isn't in deobfuscating but in reusing without a license, so I would expect not.
This isn't theft, but potential copyright infringement, to be clear. Obfuscation doesn't only make it a little more difficult for someone to copy a font into another publication that they would sell without permission (a clear case of copyright infringement), but also often breaks epub files when they're edited on a user's device.
I am also not a lawyer, but anti-circumvention provisions in the DMCA and other laws around the world do make it particularly risky to produce (or distribute or market) de-obfuscation tools, even if you never use it or intend it for copyright infringement.
More background here: https://en.wikipedia.org/wiki/Anti-circumvention
There's been some discussion about the origins of this and whether it's still needed in the group's email list starting here: https://lists.w3.org/Archives/Public/public-epub-wg/2021Oct/0025.html
This is super useful context, thank you! It also recommends a clear way forward, that WOFF or some subsetting proposals could make obfuscation (and the legal risks of de-obfuscation) unnecessary.
Also, if obfuscation is only ever used for font files, that would be a useful limitation to note. Many of the effects (for privacy, accessibility, etc.) would be less severe if the only obfuscated files are ones that don't include contents of the text, active scripts or references to external resources.
We will propose to restrict obfuscation to only fonts, and we can enforce this via EPUBCheck. We can't remove obfuscation entirely as the feature is widely used. Adobe InDesign does this. Forbidding this would break thousands and thousands of existing books.
The situation is quite different in Japan. Japanese fonts have strict license conditions, so in many cases they are not embedded in EPUB. EPUB is rarely output from Adobe InDesign there. MORISAWA (a DTP vendor and font vendor) provides the only tool for embedding fonts in EPUB. However, to use a font embedded that way in a reader that runs in a web browser, the font has to be deobfuscated and written out as a file, and since it is unclear whether that is allowed, our (Voyager's) RS does not use embedded fonts.
We can't remove obfuscation entirely as the feature is widely used. Adobe InDesign does this. Forbidding this would break thousands and thousands of existing books.
I'm not against Dave's comments above. I was just explaining the situation in Japan.
I would certainly recommend requiring limiting this obfuscation technique to only where it's already being used.
Can it also be marked as a deprecated technique, with clear alternatives (to WOFF or something else) to move to something better? If it's known that this feature is generally bad for users and authors and reading systems but is included for backwards compatibility, then we should be able to note it as deprecated and provide better methods going forward.
Can it also be marked as a deprecated technique
We could add a caution note that obfuscation should be avoided, but our hands are kind of tied when it comes to formally deprecating practices that have adoption. Deprecating leads to warnings in validation, which leads to content being rejected by vendors, which leads to angry publishers. It's formally in our charter that we not deprecate features that are relied on by publishers.
Agreed with Matt, but can someone clarify for me (FYI, Nick, I am pretty new here) whether we think that anyone is using obfuscation for anything other than fonts? Do RSes support obfuscation for anything other than fonts? I have not seen it anywhere other than fonts, but I am pretty new.
I would be in favor of saying "obfuscation is used for fonts, but you could and maybe should (?) use WOFF etc instead, and although obfuscation could theoretically be used for other resources, in practice no reading systems support it for anything other than fonts" ... or whatever is actually true.
In short, no need to formally deprecate, but we should document the practical state of the world and encourage WOFF for people who can use it.
Agreed with Matt, but can someone clarify for me (FYI, Nick, I am pretty new here) whether we think that anyone is using obfuscation for anything other than fonts? Do RSes support obfuscation for anything other than fonts? I have not seen it anywhere other than fonts, but I am pretty new.
I am not aware of usage outside of fonts. I think we should forbid usage outside of fonts.
+1 to Dave - this used to be called "Font Obfuscation", which pretty clearly tied it to fonts. I think (though am not certain) that loosening this to other content types was an oversight, not an intentional feature.
It's formally in our charter that we not deprecate features that are relied on by publishers.
I didn't realize this! I suppose it depends whether privacy or compatibility issues qualify as "serious issues (such as a security bug)".
It would be useful for future requests for reviews if you could let the reviewers know whether the charter prohibits making any changes to address issues the reviewers might raise.
I think (though am not certain) that loosening this to other content types was an oversight, not an intentional feature
I think it was something in-between. I remember us discussing the change, but I can't find much about why. It appears it was done in 3.0.1 at the same time we defined the compression order, as I did find this in some old minutes:
Obfuscation: Adobe is moving to option “B” The text to be drafted will be less font-specific and tied instead to identification in encryption.xml
I'm pretty sure it wasn't done to enable obfuscation for a specific other case, though. I believe it was only because there was nothing in the section that required it to be used for fonts, so we were only making the section reflect that it could be used for other things.
It would be useful for future requests for reviews if you could let the reviewers know whether the charter prohibits making any changes to address issues the reviewers might raise.
Ya, sorry, we've just come to accept this limitation. We tried some radical changes to EPUB in the 3.1 revision, and then had to undo a lot of the work in 3.2 when publishers balked at implementing the specification. That's how it ended up in our charter.
We'd probably have to reduce the use of obfuscation to near zero before we could deprecate, otherwise a similar cycle will play out where the specification is ignored, or certainly that part.
A caution note could say that we intend to deprecate the feature in the future, which would at least give the community fair warning to look at the alternatives.
The other option would be to look at making a note out of obfuscation, encryption.xml, and rights.xml. Obfuscation began life as a note in IDPF, after all. It wouldn't change anything as far as publishers being able to implement obfuscation and drm, but perhaps helps avoid enshrining details in a standard.
The issue was discussed in a meeting on 2021-11-11
List of resolutions:
To follow up from the meeting last night, I dug into epubcheck and there is a list of pattern matches for fonts that covers the CMTs for remote fonts:
    public static boolean isFontType(String type)
    {
      return type.startsWith("font/")
          || type.startsWith("application/font-")
          || type.equals("application/vnd.ms-opentype");
    }
There's a similar check for EPUB 2, so it's probably safe to assume that using the CMT list as a basis for restricting obfuscation will cover the vast majority of what's out there. If other font formats are in use, then I'd imagine those folks aren't bothering with epubcheck and whatever restrictions we place here aren't going to matter to them anyway.
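As a rough illustration of how that pattern match could back a fonts-only restriction, here is a minimal Java sketch. Only `isFontType` is taken from the snippet above; the class name, the `nonFontObfuscated` method, and the idea of passing in a map from resource path to declared media type are hypothetical and do not reflect how epubcheck is actually structured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical checker; not part of epubcheck.
final class ObfuscationCheck {

    // Same pattern match as the epubcheck snippet quoted above.
    static boolean isFontType(String type) {
        return type.startsWith("font/")
            || type.startsWith("application/font-")
            || type.equals("application/vnd.ms-opentype");
    }

    // Given obfuscated resources (path -> declared media type), returns the
    // paths whose media type does not look like a font, in iteration order.
    static List<String> nonFontObfuscated(Map<String, String> obfuscatedResources) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, String> entry : obfuscatedResources.entrySet()) {
            if (!isFontType(entry.getValue())) {
                offenders.add(entry.getKey());
            }
        }
        return offenders;
    }
}
```

A validator could report each returned path as an error or a warning, depending on how strictly the restriction ends up being worded.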
I believe the original issues here have been covered as fully as we can:
@npdoty is it o.k. to close this issue now?
I think it would help to explain the harms of the font obfuscation technique, in addition to the pointers to better alternatives. (Obfuscation breaks compatibility and interoperability of EPUB files, creates opacity for end users inspecting the files they're reading and introduces complexity and potential legal liability for reading system developers.)
We might also include a warning (in the RS spec) to reading system developers of the potential legal threats if they provide de-obfuscation or access to fonts.
I wonder if we should better explain the limitations of font obfuscation on the authoring side so that it fully removes any expectation that reading systems have any obligation to keep the obfuscated font secure.
The key sentence in the introduction is this:
The hope is that this algorithm will meet the requirements of most vendors who require some assurance that their fonts cannot simply be extracted by unzipping the Container.
The only expectation is that it will help prevent trivial copying out of one container and into another, but this may be something we take for granted. Perhaps we can list ways that obfuscation does not protect the content from copying to better remove any misunderstanding (e.g., that users may be able to gain access to the unobfuscated font through their reading system).
There shouldn't be a threat to reading system developers from using obfuscated fonts. The primary point of concern is between the author and the font vendor -- namely, that the vendor agrees that obfuscation is sufficient protection if that vendor isn't the one protecting the resource.
My understanding is that the DMCA has been used as a legal threat against those distributing open source software that allows for de-obfuscation and saving of font files, and that that threat could also be levied against any reading system that saves the de-obfuscated font file.
Going back to @toshiakikoike's comment, should deobfuscation support be a recommendation and not a requirement? If there are already reading systems ignoring obfuscated fonts, it would be contradictory to compel reading systems to support deobfuscation. You must deobfuscate the font even if you don't use it?
My understanding is that the DMCA has been used as a legal threat against those distributing open source software that allows for de-obfuscation and saving of font files, and that that threat could also be levied against any reading system that saves the de-obfuscated font file.
Obfuscation in EPUB has been around since 2008 or so. I'm not aware of litigation around this, or threats of litigation.
By "saving the de-obfuscated file" do you mean making the font easily available to the end user in its original form?
Implementations of both the IDPF and Adobe font (de)obfuscation methods have been available "in the open" for some time now. I'm not sure about the legal implications, but practically speaking font de-obfuscation seems to be a relatively simple hurdle to bypass. A few examples:
I agree that the obfuscation algorithm is not challenging for a programmer to bypass and that the code is publicly available (as is the algorithm). I believe the DMCA doesn't require protections to be especially strong for it to be illegal to provide circumvention tools.
My understanding is that the DMCA has been used as a legal threat against those distributing open source software that allows for de-obfuscation and saving of font files, and that that threat could also be levied against any reading system that saves the de-obfuscated font file.
I should be more precise here. I don't know for certain that a particular DMCA complaint has been filed, I've just observed someone posting a link to a github repo for a tool that does de-obfuscation and then the link being broken / code not being available. (My recollection was that there was a reference to DMCA or to a legal issue, but if there was, I don't have a link handy any more.)
By "saving the de-obfuscated file" do you mean making the font easily available to the end user in its original form?
Yes, that's what I mean.
I should be more precise here. I don't know for certain that a particular DMCA complaint has been filed, I've just observed someone posting a link to a github repo for a tool that does de-obfuscation and then the link being broken / code not being available. (My recollection was that there was a reference to DMCA or to a legal issue, but if there was, I don't have a link handy any more.)
There's a very recent case where a GitHub repo that had code to completely remove DRM from an ebook was taken down via a DMCA notice.
The issue was discussed in a meeting on 2022-02-03
List of resolutions:
@npdoty, in view of the recent updates on the spec (#1980) is it o.k. to close this issue? Thx.
The issue was discussed in a meeting on 2022-04-08
List of resolutions:
From the PING review: