danzuep / MailKitSimplified

Send and receive emails easily, fluently, with one line of code for each operation.
MIT License

Arithmetic Operation Overflow #32

Closed faisalbwn closed 1 year ago

faisalbwn commented 1 year ago

Dear Daniel,

I am facing an arithmetic overflow issue while fetching the message summaries. The details are as follows:

Exception: System.OverflowException: 'Arithmetic operation resulted in an overflow.'

Code:

// Build UID range to fetch
UniqueIdRange? range = new UniqueIdRange(new UniqueId((uint)startUID), UniqueId.MaxValue);

// Search inbox for message UIDs
IList<IMessageSummary>? messageSummaries = await _mailFolderReader.GetMessageSummariesAsync(range, MessageSummaryItems.UniqueId | MessageSummaryItems.GMailThreadId | MessageSummaryItems.InternalDate | MessageSummaryItems.PreviewText | MessageSummaryItems.Envelope);

danzuep commented 1 year ago

I think this line was causing the issue, but I'm not sure why: var ascendingIds = uniqueIds.OrderBy(u => u.Id).ToList();.

faisalbwn commented 1 year ago

Dear Daniel, thanks for the prompt response. Could you kindly fix it in the next release?

danzuep commented 1 year ago

Please check the latest preview release. I haven't had time to test it, but I figured you'd probably prefer to try something rather than nothing. Let me know how it goes.

danzuep commented 1 year ago

UniqueIdRange.ToList() was the cause of the error, so an issue should be posted on the MailKit page.

danzuep commented 1 year ago

I've submitted a bug report here: https://github.com/jstedfast/MailKit/issues/1631

danzuep commented 1 year ago

Please check that the latest pre-release still works for you.

danzuep commented 1 year ago

The memory overflow exception is to be expected. The UniqueId data type uses 8 bytes (64 bits) of memory, so a list holding every ID up to UniqueId.MaxValue would need 8 * 4,294,967,295 bytes ≈ 34 GB.
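
The arithmetic above can be checked with a few lines of plain C#. This is only a sketch: the class and method names are made up for illustration, the 8-byte size is taken from the comment above, and list overhead is ignored. It also shows that the number of IDs in such a range is larger than int.MaxValue, which is the largest count a .NET List<T> can report.

```csharp
using System;

public static class UniqueIdListEstimate
{
    // Assumption taken from the comment above: 8 bytes per UniqueId.
    public const long BytesPerUniqueId = 8;

    // Bytes needed to hold `count` IDs back to back (ignores list overhead).
    public static long BytesFor(long count) => count * BytesPerUniqueId;

    public static void Main()
    {
        long count = uint.MaxValue;              // 4,294,967,295 possible IDs
        long bytes = BytesFor(count);            // 34,359,738,360 bytes
        Console.WriteLine(bytes / 1e9);          // size in decimal gigabytes
        Console.WriteLine(count > int.MaxValue); // True: the count does not fit in an int
    }
}
```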

danzuep commented 1 year ago

To avoid using 34GB of memory at a time, it's better to use batch processing which can easily be done like so:

int count;
var metadata = MessageSummaryItems.UniqueId | MessageSummaryItems.GMailThreadId | MessageSummaryItems.InternalDate | MessageSummaryItems.PreviewText | MessageSummaryItems.Envelope;
do
{
    var messageSummaries = await _imapReceiver.ReadMail.Take(250, continuous: true)
        .GetMessageSummariesAsync(metadata, cancellationToken);
    count = messageSummaries.Count;
    // Process messages here
}
while (count > 0);

Even better, use the folder monitor:

var metadata = MessageSummaryItems.UniqueId | MessageSummaryItems.GMailThreadId | MessageSummaryItems.InternalDate | MessageSummaryItems.PreviewText | MessageSummaryItems.Envelope;
await imapReceiver.MonitorFolder
    .SetMessageSummaryItems(metadata)
    .SetIgnoreExistingMailOnConnect(false)
    .OnMessageArrival(OnArrivalAsync)
    .IdleAsync(cancellationToken);

See the MailKitSimplified.Receiver wiki for more information.

faisalbwn commented 1 year ago

Dear Daniel,

Thanks for the suggestion. Actually, my implementation is different: I have the lower-bound UID, and my logic is to start fetching emails from the last fetched email UID. The example above fetches batches of 250 items starting from the first email UID.

Can you guide me on how I can use the following example and also pass the starting email UID?

int count;
var metadata = MessageSummaryItems.UniqueId | MessageSummaryItems.GMailThreadId | MessageSummaryItems.InternalDate | MessageSummaryItems.PreviewText | MessageSummaryItems.Envelope;
do
{
    var messageSummaries = await _imapReceiver.ReadMail.Take(250, continuous: true)
        .GetMessageSummariesAsync(metadata, cancellationToken);
    count = messageSummaries.Count;
    // Process messages here
}
while (count > 0);

danzuep commented 1 year ago

Here's how to use batch processing with a start offset:

// Build the reader once; it is not awaited, only the fetch below is
var reader = _imapReceiver.ReadMail
    .Skip(startUid).Take(250, continuous: true);
IList<IMessageSummary> messageSummaries;
do
{
    // Each call fetches the next batch of up to 250 summaries
    messageSummaries = await reader
        .GetMessageSummariesAsync(metadata, cancellationToken);
    // Process messages here
}
while (messageSummaries.Count > 0);

faisalbwn commented 1 year ago

Hi Daniel,

I implemented the following logic:

// Get batch size
uint batchSize = _config.GetValue<uint>("ServiceConfig:EmailFetching:BatchSize");

// Get total messages count
long? startUID = await _db.SW_EmailMessage.MaxAsync(sm => (long?)sm.EmailUID);
startUID = startUID ?? 1;

// Init summaries meta and count
int messageSummariesCount;
var metadata = MessageSummaryItems.UniqueId | MessageSummaryItems.GMailThreadId | MessageSummaryItems.InternalDate | MessageSummaryItems.PreviewText | MessageSummaryItems.Envelope;

do
{
    // Fetch message summaries within given range
    IList<IMessageSummary>? messageSummaries = await _mailFolderReader
        .Skip((uint)startUID)
        .Take(batchSize, continuous: true)
        .GetMessageSummariesAsync(metadata, cancellationToken);

    // Assign message count
    messageSummariesCount = messageSummaries.Count;

    // Proceed if message summaries been fetched
    if (messageSummariesCount > 0)
    {
        // Assign last fetched uid as start for next batch
        startUID = messageSummaries.Max(m => m.UniqueId.Id);

        // Iterate and process message summaries
        foreach (IMessageSummary messageSummary in messageSummaries)
        {
            // Create message summary
            await ProcessEmailMessage(messageSummary, false, cancellationToken);
        }
    }
} while (messageSummariesCount > 0);


But on the last batch it throws the following exception:

System.ArgumentOutOfRangeException: 'Non-negative number required. (Parameter 'capacity')'

danzuep commented 1 year ago

If you roll back to rc2 it should work. Also start from index 0 not 1 if it's null.

// Build the reader once; it is not awaited, only the fetch below is
var reader = _imapReceiver.ReadMail
    .Skip(startUID).Take(batchSize, continuous: true);
IList<IMessageSummary> messageSummaries;
do
{
    messageSummaries = await reader
        .GetMessageSummariesAsync(metadata, cancellationToken);
    foreach (var messageSummary in messageSummaries)
    {
        await ProcessEmailMessage(messageSummary, false, cancellationToken);
    }
}
while (messageSummaries.Count > 0);

danzuep commented 1 year ago

Actually MailKit only allows ints for IMailFolder. I'll have a look.
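
As a general illustration of why a uint UID and an int-based API do not mix well, here is a small, self-contained sketch in plain C#. It does not reproduce the library's actual code path, and the class and method names are hypothetical. It shows that a uint above int.MaxValue either wraps to a negative number (unchecked) or throws an OverflowException (checked), and that a negative capacity is rejected by List<T>. Those are the two exception types reported in this thread.

```csharp
using System;
using System.Collections.Generic;

public static class UintToIntDemo
{
    // Unchecked conversion: values above int.MaxValue wrap to a negative int.
    public static int WrapToInt(uint value) => unchecked((int)value);

    // Checked conversion: values above int.MaxValue throw OverflowException.
    public static int CheckedToInt(uint value) => checked((int)value);

    public static void Main()
    {
        uint uid = 3_000_000_000; // larger than int.MaxValue (2,147,483,647)

        int wrapped = WrapToInt(uid);
        Console.WriteLine(wrapped); // -1294967296

        try { _ = CheckedToInt(uid); }
        catch (OverflowException ex) { Console.WriteLine(ex.GetType().Name); }

        // A negative capacity is rejected by the List<T> constructor.
        try { _ = new List<uint>(wrapped); }
        catch (ArgumentOutOfRangeException ex) { Console.WriteLine(ex.ParamName); }
    }
}
```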

danzuep commented 1 year ago

MailKit's IMailFolder.Count is an int, and LINQ's Skip and Take both take ints as well, so for now do something like this:

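// Assumes startUID and batchSize are both uint
IList<IMessageSummary> messageSummaries;
do
{
    // Inclusive UID window of batchSize IDs: [startUID, endUID]
    uint endUID = startUID + batchSize - 1;
    var range = new UniqueIdRange(new UniqueId(startUID), new UniqueId(endUID));
    messageSummaries = await _mailFolderReader.GetMessageSummariesAsync(range, metadata, cancellationToken);
    foreach (var messageSummary in messageSummaries)
    {
        await ProcessEmailMessage(messageSummary, false, cancellationToken);
    }
    // Move the window forward for the next batch
    startUID += batchSize;
}
// Note: the loop stops at the first UID window that returns no messages
while (messageSummaries.Count > 0);
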
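
The window arithmetic in the loop above can also be pulled out into a small helper that never overflows uint. The following is a minimal sketch in plain C# with no MailKit dependency. UidBatches, Split and lastUid are hypothetical names, and it assumes the caller already knows the highest UID it wants to reach. Each yielded pair could then be turned into a UniqueIdRange as in the loop above.

```csharp
using System;
using System.Collections.Generic;

public static class UidBatches
{
    // Yields inclusive (Start, End) UID bounds covering startUid..lastUid,
    // each spanning at most batchSize IDs. The sums are done in ulong so
    // that they cannot overflow near uint.MaxValue.
    public static IEnumerable<(uint Start, uint End)> Split(uint startUid, uint lastUid, uint batchSize)
    {
        if (batchSize == 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        ulong start = startUid;
        while (start <= lastUid)
        {
            ulong end = Math.Min(start + batchSize - 1, (ulong)lastUid);
            yield return ((uint)start, (uint)end);
            start = end + 1;
        }
    }

    public static void Main()
    {
        // Prints 1-250, 251-500 and 501-600 on separate lines.
        foreach (var (s, e) in Split(1, 600, 250))
            Console.WriteLine($"{s}-{e}");
    }
}
```

Because the bounds are fixed up front, this variant does not stop early when one window happens to contain no messages.
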
faisalbwn commented 1 year ago

Thanks for assistance Daniel. The above code is working fine for me.

danzuep commented 1 year ago

I've deployed a new version, please take note of the new usage below and log an issue if you find one.

var reader = _imapReceiver.ReadMail.Range(UniqueId.MinValue, batchSize);
IList<IMessageSummary> messageSummaries;
do
{
    messageSummaries = await reader.GetMessageSummariesAsync(filter, cancellationToken);
    foreach (var messageSummary in messageSummaries)
    {
        await ProcessMessages(messageSummary, cancellationToken);
    }
}
while (messageSummaries.Count > 0);

faisalbwn commented 1 year ago

Thanks Daniel,

I tested the batching method and it is working as expected. But the GetMessageSummariesAsync method has suddenly started taking more than 1 minute. This may be due to Gmail connection throttling, even though I am not fetching a lot of emails too frequently. Also, there is no "Throttling" information in the ImapClient log.