contao / core

Contao 3 → see contao/contao for Contao 4
GNU Lesser General Public License v3.0
490 stars 214 forks source link

Images named *.JPG aren't displayed in the frontend #4364

Closed cmyk closed 12 years ago

cmyk commented 12 years ago

Images named .JPG (not .jpg) aren't displayed in the frontend, namely when clicked upon and opened in a new window/lightbox. Users often upload images directly from the camera where they are named *.JPG.

danielritter commented 12 years ago

Kann ich bestätigen.

leofeyer commented 12 years ago

Welche Contao-Version betrifft das?

cmyk commented 12 years ago

Hab's gerade noch im 2.11.3 reproduziert.

leofeyer commented 12 years ago

I cannot reproduce the issue, though. I renamed music_academy/james-wilson.jpg to … wilson.JPG and updated the news article accordingly. I then opened the image in the mediabox and it was displayed correctly.

How exactly did you reproduce the issue?

cmyk commented 12 years ago

Maybe it has to do with case sensitive file systems? The problem occured live (redhat) and locally (OS X 10.6). Steps to repro: upload image named *.JPG and check if it's displayed in the mediabox.

leofeyer commented 12 years ago

What does your .htaccess file look like?

xchs commented 12 years ago

Da stimmt aber einiges nicht:

Nachtrag: Möglicherweise wurden aber auch nur einige Tags (<...>) hier von GitHub ausgefiltert.

cmyk commented 12 years ago

Nein, stimmt schon: http://pastebin.com/ybhpJSaW GFM ist schuld.

cmyk commented 12 years ago

Here's my .htaccess: http://pastebin.com/1Px0jCsA Note Line 228. Without that line the folderurl wouldn't work. Anyways, even if I disable rewriting URLs and the folderurl extension, the problem with JPG vs jpg persists. This is the exact same file: http://www.alpsteinsport.ch/tl_files/alpsteinsport/Fotoalben/Fussball_Freizeitschuhe/DSC01463.JPG http://www.alpsteinsport.ch/tl_files/alpsteinsport/Fotoalben/Fussball_Freizeitschuhe/DSC01463-1.jpg If I disable my .htaccess completely, both images are displayed.

cmyk commented 12 years ago

Ha! Just found the problem: Lines 209–211: <FilesMatch "\.(htm|php|js|css|htc|png|gif|jpe?g|ico|xml|csv|txt|swf|flv|eot|woff|svg|ttf|pdf|gz)$"> RewriteEngine Off </FilesMatch>

Add JPG to that list and it works. Maybe make that whole thing case-insensitive?

xchs commented 12 years ago

Huh, do we really need to write something like this here?

<FilesMatch "\.([Hh][Tt][Mm][Ll]|[Pp][Hh][Pp]|[Jj][Ss]|[Cc][Ss][Ss]|[Hh][Tt][Cc]|[Pp][Nn][Gg]|[Gg][Ii][Ff]|[Jj][Pp][Ee]?[Gg]|[Ii][Cc][Oo]|[Xx][Mm][Ll]|[Cc][Ss][Vv]|[Tt][Xx][Tt]|[Ss][Ww][Ff]|[Ff][Ll][Vv]|[Ee][Oo][Tt]|[Ww][Oo][Ff][Ff]|[Ss][Vv][Gg]|[Tt][Tt][Ff]|[Pp][Dd][Ff]|[Gg][Zz])$">
...

Quite weird!

leofeyer commented 12 years ago

Hm, please try the following:

<FilesMatch "\.(?i:htm|php|js|css|htc|png|gif|jpe?g|ico|xml|csv|txt|swf|flv|eot|woff|svg|ttf|pdf|gz)$">
RewriteEngine Off
</FilesMatch>

(The difference is the ?i: at the beginning of the regular expression.)

leofeyer commented 12 years ago

BTW, which Apache version are you on?

xchs commented 12 years ago

BTW, which Apache version are you on?

In fact, this is the point with the "?i" parameter...

cmyk commented 12 years ago

Yep, make the whole regex case-insensitive is the solution. Thanks, Leo! Apache 2.2.22

leofeyer commented 12 years ago

Strange that the HTML5Boilerplate does not use case-insensitive expressions. Is this maybe configurable in the Apache settings?

xchs commented 12 years ago

Does this mean that we have to add the mode modifier ?i: to all the FilesMatch directives in the .htaccess file? Is there some performance impact when we apply this modifier too often in the server configuration file?

leofeyer commented 12 years ago

No, this cannot be the solution. It did work on my system without the modifier, so it is definitely not required. Maybe a configurable Apache feature?

leofeyer commented 12 years ago

I am getting closer to the core of the problem. It has to do with the file system it seems:

https://trac.macports.org/ticket/7277 http://support.apple.com/kb/TA22750?viewlocale=en_US

So, which file system are you using?

cmyk commented 12 years ago

It's a problem, both on HFS+ and on Linux ext4.

leofeyer commented 12 years ago

Then it is what @xchs asked above: Is there some performance impact when we apply this modifier too often in the server configuration file?

xchs commented 12 years ago

Searched awhile and found some interesting remarks in the book "Mastering Regular Expressions" by Jeffrey Friedl regarding optimizing regular expressions for efficient performance in general, and case-insensitive pattern modifiers in particular.

Since Apache 2 bundles a PCRE library I guess that many of the statements apply here as well:

The Efficiency Penalty of the /i Modifier

If you ask Perl do match in a case-insensitive manner, common sense tells you that you are asking for more work to be done. You might be surprised, though, to find out just how much extra work that really is. Before a match or substitution operator applies an /i-governed regex, Perl first makes a temporary copy of the entire target string. This copy is in addition to any copy in support of $& and friends. The latter is done only after a successful match, while the one to support a case-insensitive match is done before the attempt. After the copy is made, the engine then makes a second pass over the entire string, converting any uppercase characters to lowercase. The result might happen to be :he same as the original, but in any case, all letters are lowercase. This goes hand in hand with a bit of extra work done during the compilation of the regex to an internal form. At that time, uppercase letters in the regex are converted to lowercase as well. The result of these two steps is a string and a regex that then matches normally—nothing special or extra needs to be done within the actual matching portion of the regex engine. It all appears to be a very tidy arrangement, but this has got to be one of the most gratuitous inefficiencies in all of Perl.

Methods to implement case-insensitive matching

There are (at least) two schools of thought on how to implement case-insensitive matching. We've just seen one, which I call string oriented (in addition to "gratuitously inefficient"). The other, which I consider to be far superior, is what I would call regex oriented. It works with the original mixed-case target string, having the engine itself make allowances for case-insensitive matching as the need arises.

A few /i benchmarks

[...] Simply adding the /i modifier [...] slowed the program by four orders of magnitude [...]

Final words about the /i penalty

Foremost: don't use /i unless you really have to. Blindly adding it to a regex that doesn't require it invites many wasted CPU cycles. In particular, when working with long strings, it can be a huge benefit to rewrite a regex to mimic the regex-oriented approach to case insensitivity, as I did with the last two benchmarks.

However, If someone can do some further "ab" testing concerning this issue it would be great, too.

leofeyer commented 12 years ago

As a first consequence, I have removed the /i modifier where applicable in 0538a8541c0e244bff2a9787608da02bf118b887. (Although this has nothing to do with the .htaccess file.)

leofeyer commented 12 years ago

I have added a note in 5bb272a9b9b9054db4d6255b4e6cff296c43e552.

Ruudt commented 11 years ago

Just asking: aren't most servers on case sensitive filesystems? If so, shouldn't the default be insensitive? I ran into this problem today and wondered why the case insensitive wasn't the default with a note on how to make it more efficient instead.

cmyk commented 11 years ago

@Ruudt Absolutely! This is totally backwards.

benunternaehrer commented 10 years ago

I had the same problem today... it works for me by adding JPG to the .htaccess... Is it possible to include JPG in the default .htaccess?