Open danrubenstein opened 5 years ago
Thanks for spotting this! It seems like making it configurable could be some work. Perhaps an alternative would be to make a new function, something like parseHeadersLimit(int)
?
Btw, recently, there has been some work made in to creating a better headers parser (and mime scanner) that works directly on the stream, instead of using the textproto lib. The only reason why there was a limit was because textproto uses another buffer, and wanted to avoid re-buffering... Most of it done, should post that soon...
Oh cool! I didn't know about the multiple buffer thing, that sounds like ultimately the better solution here.
In the interim, I got around this by basically ripping off the Header Parse Processor and envelope's ParseHeaders()
method, but I think that it would be great if this were configurable at the package level.
Let me think for a bit about if this can be configurable in the processor without breaking any public contracts. I think maybe with an optional configuration method there on the processor, and an additional public method on mail.Envelope
, this would be possible, but I'd have to get into the guts a bit.
Thanks for the reply.
A suggestion: The Processors have a configuration feature. For a simple example, see in p_debugger.go, where Svc.AddInitializer
is used to set a function that extracts two new settings from the config, in combination with the debuggerConfig
struct that defines the two new settings that are in the json file.
Next, adjust the parseHeadersLimit
function to add the limit int
argument, and change HeadersParser
processor to use the new function. Also be sure to use a default value if none is present in the config (the config setting should be optional, using the omitempty
json hint).
Actually, this is what probably should have been done in the first place...
Btw, what is your suggested default value?
Hi there :wave: ,
what's new with parseHeadersLimit(int)
? I would like to be able to tweak the headers size limit. Maybe I could propose an implementation ?
parseHeadersLimit
sets the maximum amount of bytes that will be scanned &
buffered to determine if the message contains an email header. The headers
will be parsed if a header is detected (first sequence of \n\n). A limit is
set just in case someone misbehaves and sends a long message without a
header.
with the introduction of the new stream based backend, the
parseHeadersLimit
func will get deprecated. Instead, the message headers
will be scanned while the stream of bytes comes in, in a single pass. Here
you can track progress the PR
https://github.com/flashmob/go-guerrilla/pull/135
On Sat, 19 Oct 2019 at 03:00, Mathieu Tortuyaux notifications@github.com wrote:
Hi there 👋 ,
what's new with parseHeadersLimit(int) ? I would like to be able to tweak the headers size limit. Maybe I could propose an implementation ?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/flashmob/go-guerrilla/issues/146?email_source=notifications&email_token=AAE6MP4VAPLT4BZTNML7GPDQPHMRLA5CNFSM4HDYFD72YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBU6GZA#issuecomment-543810404, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAE6MP3YGMVX7HHSFJTIGA3QPHMRLANCNFSM4HDYFD7Q .
While the stream-based headers parser does seem like it will be a huge improvement it might be worth aligning the header buffer to match the specified 30kb for the time being. I had issues with messages from Microsoft having more than 4kb of headers and thus failing to parse. Of course, if the stream-based headers parser is around the corner it doesn't make sense, but the 4kb vs 30kb mismatch mentioned here was first mentioned over a year ago. I can attest that limiting the buffer size to 4kb does cause issues with popular email services.
In the envelope method to parse headers, the function comment says:
And the associated comment a few lines later seems to suggest that a header can be up to 30kb.
However, the defined constant at the top of the same file is set to 4KB.
My question is for clarification -- I am right to understand that there is some confusion between these two allocations?
If there is, one possible solution is that it could be configurable (via the ParseHeaders processor), although I'd have to think about how that could be implemented without breaking the public contract of
e.ParseHeaders
.Thanks!