mxk / go-imap

IMAP4rev1 Client for Go
BSD 3-Clause "New" or "Revised" License

handle long response lines from server #7

Closed jprobinson closed 9 years ago

jprobinson commented 9 years ago

I have a scenario where a UID Search against Gmail might return a very large response. With the current implementation of transport.ReadLine(), the client errors and exits and I'm unable to get the search results.

To deal with longer lines without having to increase our buffer size, I've changed transport.ReadLine() to read the response in chunks until it encounters a non-bufio.ErrBufferFull error or an end of line.

mxk commented 9 years ago

Why don't you want to increase the buffer size? That's why I exported BufferSize in the first place. How long is the actual response? Second, can you break a single command into multiple ones to reduce the response size (e.g. limit the UID range to be searched)? If the command is too long (RFC 2683 says the approximate limit is 1,000 octets), then it should be broken up anyway. If the command is short, it's still better to limit the amount of work that the server has to do.

jprobinson commented 9 years ago

I think the problem of resetting the buffer size is that we may not always be able to easily predict the length of a UID Search response. To get around this, I think I'd have to preface every UID Search with a Search call to get a count first, then make several UID Searches with ranges. Do you have a better solution? This implementation seems like it'd be very noisy especially if most scenarios would work fine with just a UID Search.

As for RFC 2683, to me the language sounds like they are talking about the length of the line a client generates. What we're dealing with here is a line the server is generating. Gmail seems to happily return a line much longer than 1,000 octets and I don't see anywhere in the RFC about the client rejecting a response from the server.

To me, simply chunking the response makes for a much nicer experience with the library but if we decide not to merge this PR, I might look into highlighting the BufferSize more in the docs and the potential 'gotchas' with response sizes. It looks like at least one other person has run into this.

mxk commented 9 years ago

Yes, the 1,000 octet recommendation is for the command. One thing you should consider is that the server may be locking the mailbox while performing the search. You don't want this operation taking a very long time and possibly interfering with other clients. Also, in network protocols in general it's not a good idea to allow unbounded responses. There should be some limit to deal with misbehaving servers and protocol errors.

The solution that you proposed is one that I've used in the past and is similar to the recommendation in RFC 2683 section 3.2.1.2. You should only issue commands that are guaranteed to result in responses of no more than X bytes. For example, get the UID of every thousandth message and then issue separate UID search commands for each range. This will ensure that even if all 1,000 messages match the search, the response will not exceed the buffer size. It also reduces the amount of work that the server has to do in one iteration and allows it to perform other tasks in between the searches. Finally, if this command is tied to some GUI, it allows you to begin showing results or issue additional commands before the entire search is completed.
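As a rough sketch of that approach: assume the boundary UIDs have already been obtained (for instance with something like `FETCH 1,1001,2001 (UID)`), and turn them into one UID set per range. The helper `uidRanges`, the sample UIDs, and the `UNSEEN` criterion are all made up for illustration; nothing here calls go-imap itself.

```go
package main

import "fmt"

// uidRanges is a hypothetical helper. bounds holds the UIDs of messages
// 1, 1001, 2001, ... in ascending order; each returned set therefore
// covers at most 1,000 messages that existed when bounds was fetched.
func uidRanges(bounds []uint32) []string {
	sets := make([]string, 0, len(bounds))
	for i, lo := range bounds {
		if i+1 < len(bounds) {
			// Stop just before the next boundary UID.
			sets = append(sets, fmt.Sprintf("%d:%d", lo, bounds[i+1]-1))
		} else {
			// Last range runs to the end of the mailbox.
			sets = append(sets, fmt.Sprintf("%d:*", lo))
		}
	}
	return sets
}

func main() {
	bounds := []uint32{1, 1523, 3100} // sample boundary UIDs (made up)
	for _, set := range uidRanges(bounds) {
		// Each command would be sent separately, one response per range.
		fmt.Printf("UID SEARCH UID %s UNSEEN\n", set)
	}
}
```

Because the final set is open-ended, messages that arrive after the boundaries were fetched would also fall into it, so a careful client might cap the last range at a known highest UID instead.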

This is basically in the same category as running LIST "" *. It's quick, easy, and works in most cases, but it's not recommended and will cause problems eventually. Breaking up the search is a bit more effort, but it results in much more robust software.

jprobinson commented 9 years ago

Fair enough. Thanks for your time and explanation (and this library!).