SimpleMachines / SMF

Simple Machines Forum — SMF in short — is free and open-source community forum software, delivering professional grade features in a package that allows you to set up your own online community within minutes!
https://www.simplemachines.org/
Other
601 stars 255 forks source link

Security edits on .jpgs #3928

Closed sbulen closed 7 years ago

sbulen commented 7 years ago

In 2.1, I have received invalid security warnings uploading .jpgs. Someone also reported this on the support site for 2.0 the other day: http://www.simplemachines.org/community/index.php?topic=552034.0

I traced my 2.1 issue to the checkImageContents call in subs-graphics. It's finding a hit on the preg_match, even the lighter one without extensive security checks selected. The 2.0 issue reported that option was de-selected as well.

My jpgs were taken on a nikon, and edited in the latest version of lightroom.

sbulen commented 7 years ago

I've been doing some reading on this, & I think that checkImageContents may need a new set of edits. In my case, the preg_matches are returning false positives, and there are probably better ways to do this today.

A helpful read - especially the response that says all uploaded images are evil: http://stackoverflow.com/questions/6484307/how-to-check-if-an-uploaded-file-is-an-image-without-mime-type

My first thought is to change our minimal validation to use getImageSize(), and the extensive validation to use imageCreateFromString().

This one may be important, as it involves security and it appears to be impacting 2.0.

I'd appreciate feedback before diving in too deep here.

tinoest commented 7 years ago

I've tended to use the following, if its any use at all. mime_content_type is just to catch when getimagesize parses an file as a image when it shouldn't do so. You could ensure that the first imageType matches the second from getimagesize if you wanted to be really cautious.

   function check_image($filename) {

    $imageType      = mime_content_type($filename);
    if(!in_array($imageType, array('image/gif', 'image/jpeg', 'image/tiff', 'image/bmp', 'image/png'))) {
            return false;
    }               

    $imageData      = getimagesize($filename);
    if(!is_array($imageData)) {
            return false;
    }

    if( $imageData[0] == 0 ) {
            return false;
    }

    if( $imageData[1] == 0 ) {
            return false;
    }

    $imageType      = $imageData[2];     
    if(!in_array($imageType , array(IMAGETYPE_GIF , IMAGETYPE_JPEG ,IMAGETYPE_PNG , IMAGETYPE_BMP))) {
            return false;
    }

    return true;
   }
sbulen commented 7 years ago

Thanks, that's very helpful. And consistent with the other things I've read.

sbulen commented 7 years ago

There is another current thread out there in 2.0 support with the same issue: http://www.simplemachines.org/community/index.php?topic=544243.0

sbulen commented 7 years ago

More reading on sanitizing images: http://stackoverflow.com/questions/3644138/secure-user-image-upload-capabilities-in-php

Everything I've read suggests that the best solution is to make a copy of the image, in a way that will not copy all the tags, metadata, etc.

tinoest commented 7 years ago

@sbulen Can you attach a image which is failing to this issue?

I'll have a look and see if I can work out where its falling over.

sbulen commented 7 years ago

Will do. I'm out of town at the moment & don't have access to those pics, so this will take a day or two.

FYI, there are images as well as analysis from the developer nend at the 2.0 thread referenced above: http://www.simplemachines.org/community/index.php?topic=544243.msg3915301#msg3915301

Multiple parts of the preg_match are failing. I really suspect that we need to think more along the lines of the getImageSize technique described above, and possibly re-encoding where further security is needed.

sbulen commented 7 years ago

More helpful info here on intrusion detection here: https://www.trustwave.com/Resources/SpiderLabs-Blog/Hiding-Webshell-Backdoor-Code-in-Image-Files/

sbulen commented 7 years ago

I've been reading thru the routines in subs-graphics, and they appear to be doing EVERYTHING I've read as best practice already. Re-encoding when desired (an option under Admin | Attachment Settings), and checking for bad strings. It already checks getImageSize. It already renames the uploaded files. (I've always been impressed how security conscious SMF is & this just reaffirms that...)

So... I'm now wondering if it is simply checking a bit too much in checkImageContents. For example, where I've read it should check for certain binary values, they normally say only to check the header in the first 100 bytes, where checkImageContents checks the entire file. And it is common to have a valid URL in your EXIF info, so the http string check might be a bit much - especially for those who have not requested extensive checks.

I'm now thinking there are only minor tweaks needed to checkImageContents, along the lines of:

I'm going to read a bit more before proposing changes.

Input welcome.

sbulen commented 7 years ago

Some research... Each file that fails seems to fail on a different edit, even on the 'normal' edits. I can copy excerpts of the files that fail below.

The 'normal' term is: ~(iframe|(?<!cellTextIs)html|eval|body|script\W|[CF]WS[\x01-\x0C])~i

Nothing fails for: iframe, eval, script\W

(?<!cellTextIs)html: s$«5c#ªÌV%¾—¸–ƒb5ÒRE¬f¿bÖ´ ®ý`‹éÓ·üË]r˜áqy›q/¢ÖN¡wI´ž±Z‘Û?>ªÚ¬§N›DoÀájEÇLú-¬)ºyê#†’x[r·ô3sF†³%¡” ;ïý³öqx\ŒæÓL—1ÄO@abÏÖõÑeôðY3KÌ,kfMµm}æZœâ]V%©‰ Âaê$ú]È OļìhTmL;]V¥C¯g>ôÇÓWÝSoÒL˜†Ž*…MyõÑ{[Iÿ êM]ß󪏩‚u ãÿ À#sês£ñiSìjÇžçöU¦2Ò@ ègÔå.bÈâ³Z­~2¯–eÐP>÷ù—‡ºõð‰–ÞË“¬GWp¬J±L,ŠY" @°‚,Vz^Všx+ èˆq

body: HGÞA(1W¨£a‹\:£Ë'’ƒ¾¤A±ýԁ{x!DFáÕ=’¨©o“ÙD½ç•HûNÈç×;¨EÈ:#{û bOdyƒ²[ßÔ#Ì=“FÇšÞ¡Øz$dö@“9 £bÙÙÎÉ—ŽÈÞ;+¡ÿÓù×–“›Ñ]T•.ZtU°#`Víî&©ÙҐYjݽµ4mHgp™ŒvÛKodЧÊöH°uZ*ÒÚSB,#Êj¾»#jhڏ(vG”ûrŠ²šyB¸KÉ´õFÔÒíŸÉƒ V¨¤Ñ¶o%§¢<–ô FÛL7ºhÛ/<†­{'·©M ß«´ éDzӶÐ[X)¡›õfž‰~¬ÎËVÔmMdýY½RýU½–ͨÛÙ4méÙ/ÔÛÙlÛhÛI£lª7²©·²ÜÑJhÛêmì‘Ñ7²ÞZ•&†Ô›Ù¨·²ß¶ÐZšÿ Ô€èÔZº;p‘bqз²¨it¶ ±7õ¤ttöå¨9¨³ý—P5-QÊÿ g„¿ÙôºÛi0-AÈÿ g‚—û=uöR6Ò9ðô¿Ùë³±"ÀƒþÏGû=v¶¤iäðâþ z%þÏ=kfSòÂyC )³Êîyv‚Äò8¨;„~ WpÆ/„y}SÈáúƒ—wb[òxpÿ P(ýA˹å„l äðáþ¢RýEË»±XO'‡õ¿RweßòÀAŒ<?ú“‚GFîËÐyMè%½“Èóߪ;²_ª8/Cä·²F›£ÏþªîÊ'Jå輆”y

[CF]WS[\x01-\x0C] (the binary character is an ACK, displayed here as an ): ¸ T–v* ÁC¿Ü¡¢ÓIÛnQQvmEßr‡·Ø)V”@B–þ—ö¶šµ’^à7JvyLa/4¦ §TSlÀÚÍÏàáèß.:“O‰84X]È †Ø#-I1**Fws**AMSÀû€ç¥À|”7›à»ÀCØ¢ªiD&ìP5`)>@n0Q.öǺ InPÒì é¾ )I4M¤ªx i¼ ÿ ˆ¨'Ü

sbulen commented 7 years ago

@Illori - Here is one of the complete files I couldn't post on SM.org due to size. This sample file has the Fws{ACK}

I can provide more sample shots if needed. I'm starting to think the preg_match doesn't work well anymore.

(Please forgive the soft focus - handheld shot from the bleachers + the lighting was poor...)

fonzo-20170208-00142

sbulen commented 7 years ago

@tinoest - image copied above.

sbulen commented 7 years ago

One minor issue with the 'extended' search string: |\\<\\?|\\<%| is equivalent to |\\<|

The reasoning: The ? means zero or one. At the end of this expression, it basically means you don't have to have the last \ to get a hit. In other words, you're really just looking for \<. Further, since \< is a match, then \<% is redundant. This is all easily confirmed in any regex tester, as well as with real data.

\<, by the way, gets MANY hits in almost all of my .jpgs. It's just too short a string.

We're losing a statistical battle with the contents of these .jpgs...

tinoest commented 7 years ago

Thanks for the image I'm yet to get a chance to check it.

If you double escape the ? Does it solve the issue. I think \<\ \ ? Would mean it looks for a <?

Ignore the space between the \ \ markup is removing a \

Actually the current regex only finds a <? Or <% when I try it

sbulen commented 7 years ago

Sorry the markup deleted all of the double \s for me. I tried to clean it up a bit & failed. What I tried to say is: |\\<\\?|\\<%| is equivalent to |\\<| and will return hits on all instances of \<

I do not know which strings they are trying to get a hit on there. I believe it is a form of inline code.

My current thinking is that we should not even do the preg_matches if the image is getting re-encoded. I believe the re-encoding is the proper safety measure.

We're losing a statistical battle. In my mind, the image encoding is like the thousand monkeys typing away & accidentally writing Shakespeare's works. Only they just have to get a hit on something like 'body' or 'html'... In a big enough .jpg, odds are pretty good that happens.

This is also an issue with 2.0, and is causing frustration for folks with photography forums.

jdarwood007 commented 7 years ago

When the code to originally do this was added in a security release for 1.1 and 2.0, an idea was pitched that we could re-encode all images uploaded, but it was thrown out since PHP didn't have handling for all the different image types we could encounter and feasibly support with the PHP versions we had to maintain and memory requirements. Since we have PHP 5.3 now as the min, we could take a look at it again and see what we could do. Ideally I think we need to cover jpg, gif, and png. Others would be a plus, especially if we could support gifv formats.

The false positives have been known since then and improved in a later security release, but none the less we know we will still hit false positives. Re-encoding seemed like the only way out of this to clear out any bad stuff and possibly strip metadata that could contain bad stuff.

sbulen commented 7 years ago

Reading the notes from the frustrated photographers on the support site, they do not want re-encoding, and they do not want these random rejections either.

Unfortunately, they can't have it both ways... To NOT re-encode, and to NOT do the string searches leaves the forums wide open to image-based security attacks.

In my mind, after reading up on this the last two weeks, I believe the solution is to only do the preg_matches if the images are not being re-encoded. You have to pick some degree of protection. I don't think the preg_matches are necessary when the image is being re-encoded.

sbulen commented 7 years ago

@tinoest - you are correct, the regex does in fact properly catch <? and <%.

tinoest commented 7 years ago

@sbulen - The image you posted, which contains alot of meta data, would fail on many of those checks in SMF. I think that you're right in that the checks are outdated and need to be reviewed.

I would think that moving the attachments to a different directory outside of the document root is something that should be done anyway, and then have a image proxy which can only read the data and never execute it. So even if it is comprimised it would never get run.

Obviously checks should be run where applicable, maybe enabling different levels of security would be the way forward, so that the forum owner can choose how secure they want to make their site.

sbulen commented 7 years ago

These preg_matches are failing in the image encoding itself, not in the exit data. I haven't seen a failure on the exit data yet. (Because I'm not trying to stuff code in there...)

I picture the image encoding, which is basically producing random bytes of data, as the proverbial thousand monkeys at typewriters. Sooner or later, they will hit words.

A four letter word will come up pretty frequently.

I have about 1 in 5-10 of my images fail these security checks. Large, 2mb+ images are losing a statistical battle.

The problem is that this renders SMF extremely problematic for those trying to share photos. Almost useless. Hence the frustration on the support site.

sbulen commented 7 years ago

Other smf forks share this code and have the same issues. I have posted this issue elsewhere and triggered discussion and planned changes. I'll monitor that for ideas.

(As a noob here, I must say that I feel a little like a child of divorced parents...)

tinoest commented 7 years ago

@sbulen it is like that quite a bit on here sometimes.

I personally think a separate directory and a proxy is the best way, however most people probably don't have that level of control on their hosting.

So second option is re-encode the image.

Third is the preg_match checks.

Maybe the code gets changed to offer the admin the choice of those three options?

sbulen commented 7 years ago

Savvy users can already place the attachment directory wherever they want, right?

I'm having an issue with the preg_matches for the reason supplied above.

tinoest commented 7 years ago

I get that, they should be a last resort option imo as they are outdated now. But not my decision

albertlast commented 7 years ago

I don't understand why this problem existing, why is php code execute in picture?

another point from database side is very very very bad idea to use the picture function from board like smf, because they safe the data as binary in the database, what is very very bad behavior in any form.

sbulen commented 7 years ago

Albert,

Historically, this was a form of hacking a site.

The way I think of it, is that all traffic in and out of your website is basically serial in nature. Uploading a photo is basically uploading a huge string. If you put some in-line php or javascript in that string, it will in fact execute. (In a way, it is similar to sql injection...)

A malicious user can do a lot of damage that way.

This also means that historically, allowing the upload of images to any website was a MAJOR vulnerability.

So this code exists. But it prevents photographers from uploading a lot of their photos.

Most of these edits should not exist anymore.

The difficult question is how safe do we need to make it, and which vulnerabilities are in fact current and valid today.

Shawn

albertlast commented 7 years ago

And why is the "string" is execute? And why is a good idea to safe the photos in smf?

sbulen commented 7 years ago

I'm still figuring this out myself...

My understanding is that the code gets executed a few ways. First, inline when images are displayed. Second, once planted, hackers can invoke it indirectly. (Follow the links above...)

Lots of folks want to use SMF to share photos.

I learned thru another forum that most of these attack vectors are obsolete. Admins have to go out of their way to enable them. So the question is: what stays?

sbulen commented 7 years ago

All attachments are stored in the files system, not the db. The smf attachments table just keeps track of their location & basic attributes

sbulen commented 7 years ago

Here's another explanation: https://de.slideshare.net/mobile/saumilshah/hacking-with-pictures-hacklu-2014

sbulen commented 7 years ago

So... After lots of research, I recommend:

I think these changes should be reflected on the 2.0 site as well. This should make some of our photography oriented users a bit more happy.

I'll test out some changes over the next day or two & submit a PR.

This, I think, will be the short-to-midterm solution. Longer term, I will just keep doing research. If I run into a better library or technique, I'll share.

sbulen commented 7 years ago

More info on parallel thread here: https://github.com/elkarte/Elkarte/issues/2874