When parsing an mbox file, the Python mailbox library is confused by lines in the message body that start with 'From'. It creates a new, fragmentary message item at each such line, which is wrong. The following sample code and input demonstrate this. Replacing 'From' in the message body with, say, ' From' results in correct parsing.
This defect prevents correct import of mbox files into HyperKitty for GNU Mailman 3, to name one case where it is an impediment, because the message items become corrupt.
-- Python code
import sys
import mailbox

def main():
    print('mailbox read test')
    mbox = mailbox.mbox(sys.argv[1])
    for msg in mbox:
        print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
        print(msg)
        print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')

if __name__ == "__main__":
    main()
--- sample mbox with one message
From Fred Nurk <fred.nurks@nowhere.org> Wed, 8 Dec 1999 14:45:02 -0400
Date: Wed, 8 Dec 1999 14:45:02 -0400
From: Fred Nurk <fred.nurk@inowhere.org>
Subject: Testing mbox in Python

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce semper tempus augue at consectetur. Morbi eu nunc magna. Nulla placerat, eros in mollis finibus, dui risus ultrices tortor, non tincidunt nibh odio at augue. Quisque quis mauris neque. Curabitur ac accumsan neque. Maecenas sed mauris non justo sagittis finibus vel vel ex. Maecenas quis rutrum libero. Curabitur ex ante, tincidunt in velit at, egestas lobortis quam. Praesent tempus at dui ut volutpat. Nullam in rhoncus massa, id malesuada tortor. Suspendisse at cursus ex. Phasellus vitae pulvinar eros. Ut euismod dapibus libero, ultricies tempor leo accumsan ac. Etiam vestibulum, urna eget interdum eleifend, nulla nulla eleifend lacus, at lacinia neque nisi non velit.
From sed vehicula venenatis dui at ultricies. Pellentesque vehicula vulputate nibh nec aliquet. Vestibulum pretium velit id libero porttitor, sed facilisis metus fermentum. Donec vestibulum, sapien non convallis sodales, justo libero volutpat dui, ut luctus odio nisi eget sapien. In viverra libero gravida arcu euismod, non sollicitudin massa auctor. Pellentesque vitae laoreet nisi. In eros massa, pretium at condimentum eu, molestie ut tortor. Suspendisse faucibus felis sem, et fringilla urna consectetur molestie. Integer suscipit, orci sed convallis maximus, velit purus tempus dui, id egestas tortor erat auctor dui. Nulla fermentum tellus ut odio elementum, vel bibendum mi imperdiet. Proin sed auctor purus. Orci varius natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Nullam non arcu ex. Duis dapibus nunc in urna dapibus, sit amet interdum lectus tincidunt.
Fred
--
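The behavior reported above can be checked without an external file. The following is a minimal sketch, assuming a shortened stand-in for the sample message; the helper name `count_messages` and the `RAW`/`ESCAPED` constants are hypothetical and not part of the original report. It counts how many messages `mailbox.mbox` finds before and after the body line is escaped as `>From `.

```python
import mailbox
import os
import tempfile

# Shortened stand-in for the sample mbox above (an assumption, not the
# reporter's exact input): one message whose body contains a line that
# starts with "From ".
RAW = (
    b"From fred.nurk@nowhere.org Wed Dec  8 14:45:02 1999\n"
    b"Date: Wed, 8 Dec 1999 14:45:02 -0400\n"
    b"From: Fred Nurk <fred.nurk@nowhere.org>\n"
    b"Subject: Testing mbox in Python\n"
    b"\n"
    b"Lorem ipsum dolor sit amet.\n"
    b"\n"
    b"From sed vehicula venenatis dui at ultricies.\n"
    b"\n"
    b"Fred\n"
)


def count_messages(data: bytes) -> int:
    """Write data to a temporary mbox file and count the parsed messages."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.mbox")
        with open(path, "wb") as fh:
            fh.write(data)
        box = mailbox.mbox(path, create=False)
        try:
            return len(box)
        finally:
            box.close()


# Escape body lines by prefixing ">". The leading separator line is left
# alone because it is not preceded by a newline. This naive replace is
# only safe here because the file is known to hold a single message.
ESCAPED = RAW.replace(b"\nFrom ", b"\n>From ")

print("unescaped:", count_messages(RAW))
print("escaped:  ", count_messages(ESCAPED))
```

On a file with several messages the same replace would also rewrite the real separator lines, so it is shown here only to illustrate the effect, not as a general repair tool.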
Not really a bug. This results from problems with the loose mbox format and the lack of standards. There is nothing Python can do about it.
This problem is the whole reason "mangle_from_" exists in the email library...
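As an illustration of the comment above, here is a small sketch of how the `mangle_from_` argument of `email.generator.Generator` affects a body line that starts with `From `. The message content and the helper name `flatten` are made up for this example; only `Generator`, `Message`, and `mangle_from_` come from the standard library.

```python
import io
from email.generator import Generator
from email.message import Message

# Illustrative message (not the reporter's data) with a body line that
# begins with "From ".
msg = Message()
msg["From"] = "Fred Nurk <fred.nurk@nowhere.org>"
msg["Subject"] = "Testing mbox in Python"
msg.set_payload("Lorem ipsum.\n\nFrom sed vehicula venenatis.\n\nFred\n")


def flatten(message: Message, mangle: bool) -> str:
    """Serialize message to text, optionally escaping body 'From ' lines."""
    buf = io.StringIO()
    Generator(buf, mangle_from_=mangle).flatten(message)
    return buf.getvalue()


mangled = flatten(msg, mangle=True)   # body line is written as ">From sed ..."
plain = flatten(msg, mangle=False)    # body line is written unchanged

print(mangled)
```

Text produced with `mangle_from_=True` can be appended to an mbox file without its body being mistaken for a message separator. The escaping has to happen when the mbox is written; a reader cannot reliably tell an unescaped body line from a separator afterwards.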
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at =
created_at =
labels = ['3.7', 'invalid', 'type-bug', 'library', 'expert-email']
title = 'mbox From line wrongly detected'
updated_at =
user = 'https://bugs.python.org/Andro'
```
bugs.python.org fields:
```python
activity =
actor = 'r.david.murray'
assignee = 'none'
closed = True
closed_date =
closer = 'eric.smith'
components = ['Library (Lib)', 'email']
creation =
creator = 'Andro'
dependencies = []
files = []
hgrepos = []
issue_num = 37357
keywords = []
message_count = 3.0
messages = ['346192', '346201', '347591']
nosy_count = 4.0
nosy_names = ['barry', 'r.david.murray', 'maxking', 'Andro']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue37357'
versions = ['Python 3.7']
```