pixelcog / gmail-to-pdf

A Google Apps Script library for converting Gmail messages to PDFs for easy archival.
http://bit.ly/1CsBl8U

[GmailUtils.gs] Processing Certain Messages Consistently Causes A "Regular expression operation exceeded execution time limit." Exception To Be Thrown #1

Open PaulWGraham opened 9 years ago

PaulWGraham commented 9 years ago

Processing certain messages consistently throws an "InternalError: Regular expression operation exceeded execution time limit." exception. Specifically, when messages are processed with the call messageToPdf(messages,{embedAvatar:false}), certain messages reliably cause the line of code below, from the _embedHtmlImages__ function, to throw the exception.

The line of code throwing the exception:

// process all style attributes
html = html.replace(/(<[^>]+style=)(["'])((?:(?!\2)[^]|.))\2/gi, function(m, tag, q, style)

Note:

The emails in question are Kickstarter Project Updates. Not all Kickstarter Project Updates trigger the exception, but one that does is Project Update #49: Double Fine's MASSIVE CHALICE by Double Fine and 2 Player Productions.
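One way to narrow down which messages trigger the exception is to convert them one at a time and record the ones that throw. The sketch below is a generic helper; `collectFailures` and its `convert` callback are hypothetical names for illustration and are not part of GmailUtils.

```javascript
// Hypothetical helper, not part of GmailUtils: runs `convert` on each item
// and records the items that throw instead of aborting the whole batch.
function collectFailures(items, convert) {
  var failures = [];
  items.forEach(function (item, index) {
    try {
      convert(item);
    } catch (err) {
      // Keep the position, the item itself, and the error text for later review.
      failures.push({ index: index, item: item, error: String(err) });
    }
  });
  return failures;
}
```

In Apps Script, `convert` could be a function that passes a single message to messageToPdf with {embedAvatar:false}. That assumes the call accepts one message at a time, which this report does not confirm.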

mikegreiling commented 9 years ago

Interesting... I'm not sure how to get around that one other than perhaps breaking the regex down into smaller expressions without the lookahead rules. But that feels like a hack.
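A rough sketch of what "smaller expressions without the lookahead rules" could look like: one simple pattern per quote character, so neither needs the \2 backreference or the negative lookahead. The name `replaceStyleAttributes` and the `transform` callback are hypothetical, and this version deliberately drops any handling of backslash-escaped quotes inside the attribute value. Whether it actually stays under the execution time limit on the failing emails is untested.

```javascript
// Hypothetical helper: rewrites the value of every style attribute using two
// simple patterns (one per quote character) instead of a single pattern with
// a backreference and a negative lookahead.
// Assumption: style values never contain their own quote character.
function replaceStyleAttributes(html, transform) {
  var patterns = [
    /(<[^<>]+\sstyle=)(")([^"]*)"/gi, // double-quoted values
    /(<[^<>]+\sstyle=)(')([^']*)'/gi  // single-quoted values
  ];
  return patterns.reduce(function (result, pattern) {
    return result.replace(pattern, function (match, tag, quote, style) {
      // Reassemble the attribute around the transformed style text.
      return tag + quote + transform(style) + quote;
    });
  }, html);
}
```

The `transform` callback would do whatever the existing replace callback does with the style text; here it is left to the caller so the sketch stays self-contained.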

PaulWGraham commented 9 years ago

Well, considering this piece of code is running up against limits imposed server-side, I wouldn't be surprised if the solution ends up being some sort of kludge. In that spirit, I tried rewriting that particular section of code to use regex.exec() instead of string.replace(), hoping the run time/timing constraints would be different, but it was no good.
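For reference, an exec()-based rewrite of a replace call generally has the shape below. This is only an illustrative sketch, not the code that was actually tried; `replaceWithExec` is a hypothetical name. Since the same pattern still runs against the same input, it is not surprising that the time limit is hit either way.

```javascript
// Hypothetical helper: mimics string.replace(pattern, replacer) with an
// explicit regex.exec() loop. `pattern` must carry the g flag.
function replaceWithExec(input, pattern, replacer) {
  if (!pattern.global) {
    throw new Error('pattern must use the g flag');
  }
  var output = '';
  var lastIndex = 0;
  var match;
  pattern.lastIndex = 0; // start from the beginning of the input
  while ((match = pattern.exec(input)) !== null) {
    // Copy the text between matches, then the replacement for this match.
    output += input.slice(lastIndex, match.index) + replacer.apply(null, match);
    lastIndex = match.index + match[0].length;
    if (match[0].length === 0) {
      pattern.lastIndex++; // step past empty matches to avoid an endless loop
    }
  }
  return output + input.slice(lastIndex);
}
```

The replacer receives the full match followed by the capture groups, in the same order as the first arguments of a string.replace() callback.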

BTW, if you would like a message to test changes with just let me know what email address I should forward the offending message to and I'll pass it along.

mikegreiling commented 9 years ago

It appears that others have been running into this issue as well. I don't have time at the moment to find a solution, but I am putting this on my backlog of tasks, and I have collected several example emails from reporters.

As always, if anyone else feels like giving it a go, I would be happy to accept pull requests too.

omkar9999 commented 6 years ago

@PaulWGraham I ran into the same problem. Upon analysis, I found that it is caused by the GAS execution limit of 30 seconds per function execution: the gmail-to-pdf script parses the email body with regex and renders URIs three times inside a single function, which is time-consuming. I've split it into different methods. The fix is in a pull request, along with support for CC and BCC addresses. It is not a foolproof solution, but the chances of hitting this scenario are drastically reduced.

@mikegreiling Please review and accept if you have no comments.

kheniparth commented 6 years ago

Hi @mikegreiling,

I don't know how it works, but would you be able to release omkar9999's update as a new version? I am getting the same error, and the latest version (4) of GmailUtils doesn't have his updates.

Thank you


UPDATE

I tried omkar9999's code updates by copying them into a file in my project, but it still fails to convert the email to PDF. Can someone please find another solution? I will post the solution if I find one.

Thank you