deanmalmgren / textract

extract text from any document. no muss. no fuss.
http://textract.readthedocs.io
MIT License

unsupported operand type(s) for +: 'NoneType' and 'bytes' #342

Open tanguy-a opened 4 years ago

tanguy-a commented 4 years ago

Describe the bug: textract raises this error in /textract/parsers/msg_parser.py, line 27, when m.subject is empty.

    def extract(self, filename, **kwargs):
        m = extract_msg.Message(filename)
        return ensure_bytes(m.subject) + six.b('\n\n') + ensure_bytes(m.body)

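The return statement tries to concatenate the missing subject with bytes. Judging by the error message, the subject comes back as None rather than as an empty string, so the + fails.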
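The error in the title can be reproduced without a .msg file at all. This is only a minimal sketch: it assumes that a message without a subject yields None and that the value reaches the + unchanged, which is what the error message suggests. The variable names are illustrative.

```python
# Stand-in for a message subject that is missing (assumed to be None).
subject = None

try:
    # Same shape as the return expression in msg_parser.py.
    subject + b"\n\n"
except TypeError as exc:
    message = str(exc)

print(message)
```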

To Reproduce: call textract.process on a .msg file that has no subject.

Expected behavior: I'd still like to get the text processed by textract even though the subject is empty.
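One way to get that behavior might be to treat a missing subject or body as empty bytes before joining them. The sketch below is not textract's code: `to_bytes` and `join_subject_and_body` are hypothetical names, and it assumes the subject and body are each None, str, or bytes.

```python
def to_bytes(value, encoding="utf-8"):
    """Hypothetical helper: coerce None, str, or bytes to bytes."""
    if value is None:
        # A missing field contributes nothing instead of raising.
        return b""
    if isinstance(value, bytes):
        return value
    return str(value).encode(encoding)


def join_subject_and_body(subject, body):
    """Join subject and body like the parser's return line, but None-safe."""
    return to_bytes(subject) + b"\n\n" + to_bytes(body)
```

With this, `join_subject_and_body(None, "hello")` returns the body preceded by the two newlines rather than raising, so the output layout stays the same whether or not a subject is present.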


Thank you for your help.