clonemeagain / plugin-fwd-rewriter

An osTicket plugin to rewrite incoming emails
GNU General Public License v2.0

Struggling with Regex Rewriter #4

Open AdamDempsey opened 6 years ago

AdamDempsey commented 6 years ago

I'm trying to use the plugin to strip out our users' signatures, and while I've got a regexp that works in testing, I can't get it to work in the plugin.

In the example below, I'm trying to remove a table that contains an image named "sig_image.png":

/table.*?sig_image.png.*?\/table/i:<br/><strong>**Sig Removed**</strong><br />

I've tried various permutations of formatting, e.g. with and without speech marks, without tag brackets, etc., but all with no luck.

I've tested with a basic word replacement (/dempsey/i:flempsy) and that works, so the plugin is enabled, etc.

Any suggestions welcome :)

clonemeagain commented 6 years ago

Hmm.. You're trying to match HTML with regular expressions.

[image]

https://stackoverflow.com/a/1732454/276663

Having said that, people do it all the time. I notice you aren't matching the < & > of the "table" tags, which might cause some corruption in the output, but without some source to play with, I can't really be sure.

Also of note, the plugin uses text from the MessageBody object: (string) $val->body etc., which may or may not be read after osTicket runs this:
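To illustrate the point about the missing < and >, here is a minimal sketch. The sample body is made up (the real email was never posted in this thread), and the second pattern is just one possible tightening, not something the plugin requires.

```php
<?php
// Hypothetical sample body; the real email HTML was not posted in the thread.
$body = '<p>Hello</p><table><tr><td><img src="sig_image.png"></td></tr></table>';
$replacement = '<br/><strong>**Sig Removed**</strong><br />';

// Pattern as posted: no angle brackets around "table", so the leading "<"
// and trailing ">" of the tags are left behind in the output.
$loose = preg_replace('/table.*?sig_image.png.*?\/table/i', $replacement, $body);

// Same idea, but matching the full tags and escaping the dot in the file name.
$strict = preg_replace('/<table.*?sig_image\.png.*?<\/table>/i', $replacement, $body);

echo $loose, "\n";  // <p>Hello</p><<br/><strong>**Sig Removed**</strong><br />>
echo $strict, "\n"; // <p>Hello</p><br/><strong>**Sig Removed**</strong><br />
```

The stray "<" and ">" in the first result are the kind of corruption mentioned above.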

        // Capture a list of inline images
        $images_before = $images_after = array();
        preg_match_all('/src=("|\'|\b)cid:(\S+)\1/', $this->body, $images_before,
            PREG_PATTERN_ORDER);

So, it's possible that the image has been replaced with a cid inline-attachment code. (Looks like mailfetch & mailparse both call that.)

If you're comfortable forwarding me an email that is failing to match, I'll have a look: clonemeagain@gmail.com. Or, and this might be the ticket, find a ticket with that attached image, see what cid code it gets, then match that! ;-)
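For anyone wanting to see what a cid reference looks like to that pattern, here is a small sketch. It runs the same regex as the osTicket excerpt above over a made-up body; the content ID "abc123" and the example.com URL are placeholders, not values from a real ticket.

```php
<?php
// Hypothetical message body: one inline (cid) image and one hosted image.
$body = '<img src="cid:abc123" alt="sig"> '
      . '<img src="https://example.com/sig_image.png">';

// Same pattern as the osTicket excerpt quoted above.
$images = array();
preg_match_all('/src=("|\'|\b)cid:(\S+)\1/', $body, $images, PREG_PATTERN_ORDER);

// $images[2] holds the captured content IDs; the hosted image is not listed.
print_r($images[2]);
```

Only the inline image is captured, so a hosted image like the second one would still appear in the body under its original file name.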

AdamDempsey commented 6 years ago

The images are hosted, so they weren't affected by that. Turns out my regex skills were just a bit rusty: when I was testing, the email body was a single line. I needed to add the s (PCRE_DOTALL) modifier, then it worked.

It's now picking up too much, but that's a regex issue (your image is right, I'm already regretting starting this! ha). The plugin itself is working perfectly, so thanks :)
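A minimal sketch of the single-line versus multi-line difference, using a made-up body and a "[sig]" placeholder as the replacement. Without the s modifier the dot does not match newlines, so a table spread over several lines never matches.

```php
<?php
// Hypothetical multi-line body, as a real HTML email would usually be.
$body = "<p>Hi</p>\n<table>\n<tr><td><img src=\"sig_image.png\"></td></tr>\n</table>";

// Without "s": "." stops at each newline, so nothing is replaced.
$without = preg_replace('/<table.*?sig_image\.png.*?<\/table>/i', '[sig]', $body);

// With "s" (PCRE_DOTALL): "." also matches newlines, so the table is found.
$with = preg_replace('/<table.*?sig_image\.png.*?<\/table>/is', '[sig]', $body);

var_dump($without === $body); // bool(true)
echo $with, "\n";             // "<p>Hi</p>" then "[sig]" on the next line
```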
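The thread doesn't say why the pattern matched too much. One common cause with this kind of pattern, offered here only as a guess, is that a lazy `.*?` still starts at the first `<table` in the body, so any earlier table gets swallowed too. The sketch below uses a made-up body and shows one possible workaround: a negative lookahead that stops the match from running across another `<table`.

```php
<?php
// Hypothetical body: an unrelated table, then the signature table.
$body = '<table><tr><td>Order details</td></tr></table><p>Thanks</p>'
      . '<table><tr><td><img src="sig_image.png"></td></tr></table>';

// Lazy quantifier alone: the match begins at the FIRST <table.
$greedy_start = preg_replace('/<table.*?sig_image\.png.*?<\/table>/is', '[sig]', $body);

// (?:(?!<table).)*? refuses to cross another <table, so the match can only
// begin at the table that directly contains the image.
$tempered = preg_replace(
    '/<table(?:(?!<table).)*?sig_image\.png.*?<\/table>/is',
    '[sig]',
    $body
);

echo $greedy_start, "\n"; // [sig]
echo $tempered, "\n";     // <table><tr><td>Order details</td></tr></table><p>Thanks</p>[sig]
```

Nested tables inside the signature would still defeat this, which is part of why parsing HTML with regex gets painful.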

Edit: Actually the problem wasn't my regex itself; the pattern contained a colon, which is your separator between $pattern and $replacement. I just updated both class.RewriterPlugin.php and config.php to use another separator character and now all is working for me, thanks again :)
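A sketch of how a colon inside the pattern can break a rule. The split_rule() helper is hypothetical and assumes the rule is cut at the first separator; the plugin's actual parsing code isn't shown in this thread and may differ. The URL pattern is also made up.

```php
<?php
// Hypothetical helper: split a rule at the FIRST occurrence of the separator.
// This is an assumption for illustration, not the plugin's actual code.
function split_rule(string $rule, string $separator): array
{
    return explode($separator, $rule, 2);
}

// The simple rule from earlier in the thread splits cleanly.
[$p1, $r1] = split_rule('/dempsey/i:flempsy', ':');

// A pattern that itself contains a colon (here, a URL) is cut mid-pattern.
[$p2, $r2] = split_rule('/https?:\/\/example\.com\/sig_image\.png/i:[sig]', ':');

echo $p1, ' => ', $r1, "\n"; // /dempsey/i => flempsy
echo $p2, ' => ', $r2, "\n"; // /https? => \/\/example\.com\/sig_image\.png/i:[sig]
```

In the second case "/https?" is no longer a valid pattern, so the rule fails even though the regex itself was fine.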

clonemeagain commented 6 years ago

Oh cool, what separator worked? Or just make a pull with the change mate.

AdamDempsey commented 6 years ago

I'm using the pipe character, although I just looked it up and that is also reserved in regex, so it still needs to be something else.
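Since almost any single character can turn up in a pattern, one alternative is to avoid hunting for a "safe" separator and instead find where the pattern ends. The sketch below is hypothetical and not the plugin's code: it assumes every rule starts with a "/"-delimited pattern, followed by optional modifier letters, a colon, and the replacement.

```php
<?php
// Hypothetical parser (not the plugin's code). Assumes "/" is the regex
// delimiter and ":" follows the modifiers; other delimiters are not handled.
function parse_rule(string $rule): ?array
{
    // Pattern body: either an escaped character (backslash + any char) or
    // any character that is neither a backslash nor the "/" delimiter.
    $ok = preg_match('~^(/(?:\\\\.|[^\\\\/])*/[a-zA-Z]*):(.*)$~s', $rule, $m);
    return $ok === 1 ? [$m[1], $m[2]] : null;
}

// Colons and pipes inside the pattern (or the replacement) no longer matter.
$parsed = parse_rule('/a|b:c\/d/i:x:y');
print_r($parsed); // pattern "/a|b:c\/d/i", replacement "x:y"

// Rules that don't start with a delimited pattern are rejected.
var_dump(parse_rule('no delimiters')); // NULL
```

The trade-off is a stricter rule format in exchange for not reserving any character inside patterns.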