hypermail-project / hypermail

Hypermail is a free (GPL) program to convert email from Unix mbox format to html.
http://www.hypermail-project.org/
GNU General Public License v2.0
147 stars 71 forks source link

"base64 decode assumes input lines are in multiples of 4 #96" #99

Closed vandys closed 1 year ago

vandys commented 1 year ago

Convert base64 implementation of stateless to stateful, so you can feed in successive lines and get back the original binary. The previous stateless implementation assumes that the input has no residual after processing a single line, which is not a semantic guaranteed by the base64-in-email standards (and, in fact, was found while processing exports from Tutanota).

jkbzh commented 1 year ago

@vandys I'm reviewing your code, you don't need to update anything.

According to man, bzero has been deprecated in favor of memset, so I reverted that specific change and removed the strings.h include.

Seeing that base64 is so common in messages and headers and that hypermail is not processing message parts or headers in parallel, I wonder if it's worth it to allocate and free the state structure each time we get a new part and if it wouldn't be better to just use a variable and fixed memory instead of a pointer and allocated memory.

What do you think?

vandys commented 1 year ago

Yes, bcopy/bzero are bad old habits from my old BSD days! If you keep a single state, I'd still recommend being sure to reset it at the points where its use starts and ends. I'm sure at some point a malformed base64 will try to bleed from one attachment to another. OTOH, it's hard to imagine that the average processing of a base64 attachment not entirely swamping the small overhead of allocation and free. Either seems fine to me!

jkbzh commented 1 year ago

@vandys Thank you for your excellent patch. All tests I wrote for it worked well.

I edited it a bit (function and state structure names, fields, ...) and manually merged it to the 3.0 branch. https://github.com/hypermail-project/hypermail/commit/2bd8ed57721849554b054e8948d11722add3c7a0

I did a small fix. The place where you were freeing the state structure only worked in some cases. I added that release code in the two places where it should have gone.

Other than those small changes, I didn't need to change anything.