mimetools module privacy leak #36882

Closed 91e69f45-91d9-4b12-87db-a02908296c81 closed 22 years ago

91e69f45-91d9-4b12-87db-a02908296c81 commented 22 years ago
BPO 580495
Nosy @warsaw

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:
```python
assignee = 'https://github.com/warsaw'
closed_at =
created_at =
labels = ['library']
title = 'mimetools module privacy leak'
updated_at =
user = 'https://bugs.python.org/phr'
```

bugs.python.org fields:
```python
activity =
actor = 'phr'
assignee = 'barry'
closed = True
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'phr'
dependencies = []
files = []
hgrepos = []
issue_num = 580495
keywords = []
message_count = 3.0
messages = ['11538', '11539', '11540']
nosy_count = 2.0
nosy_names = ['barry', 'phr']
pr_nums = []
priority = 'normal'
resolution = 'wont fix'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue580495'
versions = []
```

91e69f45-91d9-4b12-87db-a02908296c81 commented 22 years ago

The mimetools "choose_boundary" function, according to its documentation, returns a string of the form 'hostipaddr.uid.pid.timestamp.random'. If this separator is actually used in a message, it reveals the host ID and UID of the sender. This is a privacy breach similar to the discovery that Microsoft Word files contained user GUIDs revealing the MAC address of the Ethernet card in the user's PC (since fixed, after the story was published on the front page of the New York Times about 2 years ago). Some info is at

http://www.junkbusters.com/microsoft.html#advisory

The fix for choose_boundary is to make the boundary string completely random and not have it reveal personal information about the user.
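A minimal sketch of what such a fix could look like, written in modern Python 3 rather than against the Python 2 mimetools source. The name `random_boundary` and the default length are assumptions chosen for illustration; this is not a patch that was applied to the module.

```python
import secrets


def random_boundary(nbytes: int = 16) -> str:
    """Return a MIME boundary made only of random hex digits.

    Hypothetical helper: nothing here depends on the host address,
    UID, PID, or clock, so the result says nothing about the sender.
    """
    # token_hex(n) returns 2 * n hex characters from the OS random source.
    return secrets.token_hex(nbytes)


if __name__ == "__main__":
    print(random_boundary())
```

A caller would still need to confirm that the chosen string does not occur in the message body, as with any boundary.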

warsaw commented 22 years ago

Is this a serious concern for most applications? In most email messages, some identifying information will always leak. Since it takes work to anonymize messages anyway, an application with these concerns can simply implement its own choose_boundary() algorithm, or lop off the hostid part of the generated one.
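As a rough sketch of the "lop off" workaround, assuming the documented layout 'hostipaddr.uid.pid.timestamp.random': the helper below is hypothetical and goes a step further than dropping only the host part, since it also discards the UID and PID fields. It operates on a made-up literal, because mimetools does not exist in Python 3.

```python
def strip_identifying_fields(boundary: str) -> str:
    """Keep only the trailing 'timestamp.random' part of a boundary.

    Assumes the layout 'hostipaddr.uid.pid.timestamp.random'. The host
    address itself contains dots, so the split is done from the right.
    """
    parts = boundary.rsplit(".", 2)
    if len(parts) != 3:
        raise ValueError("boundary does not have the expected layout")
    return ".".join(parts[1:])


# Made-up example value, not output captured from mimetools.
example = "192.0.2.7.1000.4242.1026000000.31337"
print(strip_identifying_fields(example))  # 1026000000.31337
```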

Besides, mimetools.py should be considered obsolete, in favor of the email package. When the email package generates a boundary, it doesn't include any identifying information (but has a moderately higher possibility of collision in the source text).
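For comparison, a short sketch of building a multipart message with the email package as it exists in current Python 3. The part contents are placeholders, and the exact shape of the generated boundary is an implementation detail that may differ between versions.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

msg = MIMEMultipart()
msg.attach(MIMEText("first part"))
msg.attach(MIMEText("second part"))

# No boundary was passed in, so one is chosen when the message is flattened.
text = msg.as_string()
boundary = msg.get_boundary()
print(boundary)
```

An application that wants full control can instead pass its own string through the `boundary` argument of `MIMEMultipart`.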

91e69f45-91d9-4b12-87db-a02908296c81 commented 22 years ago

On the occasions where the leak matters, the consequences can be serious.

Think of an AOL user with a screen name that she uses for work-related email, and a separate screen name she uses to post to a mailing list for sufferers of sexually transmitted diseases. If she sends a file attachment to a co-worker from the work screen name, and a different attachment to the STD list from the personal screen name, and her mail client uses mimetools.py, a co-worker looking at the STD mailing list's web archive can see that both attachments came from the same person.

Former US Navy Senior Chief Petty Officer Tim McVeigh (not related to the OKC bomber with the same name) had his Navy career destroyed over something sort of like this (he had an anonymous AOL profile revealing that he was gay, and the Navy connected it to him). McVeigh stayed out of jail because a Federal judge ruled that the Navy had violated the DoD "don't ask, don't tell, don't pursue" policy by contacting AOL to find his identity. But if he had used mimetools.py to send file attachments like the hypothetical person above, the Navy might have gotten the two MIME separators without having to specially contact anyone, and so McVeigh could possibly be in the slammer now.

Anyway, if mimetools.py is deprecated, the manual should be updated to say so. It wouldn't have occurred to me to not use it if I wanted to send a MIME message. The docs should also mention this privacy leak. But I think it's better to just fix it.