changcheng / wro4j

Automatically exported from code.google.com/p/wro4j

Content length is not computed correctly #485

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. I'm using curl to access the generated JavaScript.
2. I'm using CssUrlRewritingProcessor, GoogleClosureCompressorProcessor(CompilationLevel.SIMPLE_OPTIMIZATIONS) and YUICssCompressorProcessor; resources are gzipped.

What is the expected output? What do you see instead?

The content length is wrong. The expected content length is 491373, but the header says only 491371, so the last 2 bytes of the script are missing.

What version of the product are you using? On what operating system?

This has been happening since 1.4.7.
1.4.6 does not have this problem, but I guess that is only because Issue 453 is present there.

The wrong content length stops Firefox 13.0.1 and the current Safari from downloading the rest of the script.

Please provide any additional information below.

Original issue reported on code.google.com by m...@planet-punk.de on 9 Jul 2012 at 7:04

GoogleCodeExporter commented 9 years ago
Please compare

wget -q -O- --header="Accept-Encoding: gzip" http://dailyfratze.de/owr/dailyfratze-base-de.js | gunzip > out.html

and 

wget  http://dailyfratze.de/owr/dailyfratze-base-de.js 

Original comment by m...@planet-punk.de on 9 Jul 2012 at 7:20

GoogleCodeExporter commented 9 years ago

Original comment by alex.obj...@gmail.com on 9 Jul 2012 at 7:22

GoogleCodeExporter commented 9 years ago
{{{
import java.io.*;
import java.nio.charset.Charset;
import org.apache.commons.io.IOUtils;

public class LengthCheck {
    public static void main(String... a) throws IOException {
        final File file = new File("/Users/msimons/tmp/dailyfratze-base-de.js.2");
        final Charset utf8 = Charset.forName("UTF-8");

        // Decode through a Reader so a multi-byte character split across two reads is not corrupted.
        final Reader in = new InputStreamReader(new BufferedInputStream(new FileInputStream(file)), utf8);
        final char[] c = new char[1024];
        int len;
        final StringBuilder sb = new StringBuilder();
        while ((len = in.read(c, 0, c.length)) > 0) {
            sb.append(c, 0, len);
        }
        in.close();
        // File.length() counts bytes; String.length() counts chars.
        System.out.println("File length: " + file.length());
        System.out.println("String length (buffered inputstream) " + sb.length());

        final String s = IOUtils.toString(new FileInputStream(file), "UTF-8");
        System.out.println("String length (ioutils)" + s.length());
    }
}
}}}

Output is:

File length: 491373
String length (buffered inputstream) 491371
String length (ioutils)491371

dafuq?

Original comment by m...@planet-punk.de on 9 Jul 2012 at 7:48

Attachments:

GoogleCodeExporter commented 9 years ago
Could a BOM character be the reason?

Original comment by alex.obj...@gmail.com on 9 Jul 2012 at 7:50

GoogleCodeExporter commented 9 years ago
Maybe I'm looking with the wrong regex, but this file has no BOM at the start… Maybe one of the scripts it consists of has one?

Original comment by m...@planet-punk.de on 9 Jul 2012 at 7:59

GoogleCodeExporter commented 9 years ago
I don't see a BOM either, and I'm not sure what the reason is. I will do some research to find the possible cause. If you have any suggestions, let me know.

Does this issue break the page rendering?
By the way, do you have a messenger ID?

Original comment by alex.obj...@gmail.com on 9 Jul 2012 at 8:01

GoogleCodeExporter commented 9 years ago
grep -rl $'\xEF\xBB\xBF' dailyfratze-base-de.js.2 doesn't show any…

Original comment by m...@planet-punk.de on 9 Jul 2012 at 8:01
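
The same BOM search can be done on the raw bytes in Java. This is only a sketch: the class and method names (`BomCheck`, `indexOfUtf8Bom`) are made up for illustration, and the sample data is built in memory rather than read from the real file.

```java
import java.nio.charset.StandardCharsets;

public class BomCheck {
    // Returns the index of the first UTF-8 byte order mark (EF BB BF), or -1 if there is none.
    static int indexOfUtf8Bom(byte[] data) {
        for (int i = 0; i + 2 < data.length; i++) {
            if ((data[i] & 0xFF) == 0xEF
                    && (data[i + 1] & 0xFF) == 0xBB
                    && (data[i + 2] & 0xFF) == 0xBF) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; the second one simulates a BOM left over from a merged script.
        byte[] plain = "var a = 1;".getBytes(StandardCharsets.UTF_8);
        byte[] merged = "var a;\uFEFFvar b;".getBytes(StandardCharsets.UTF_8);
        System.out.println("plain:  " + indexOfUtf8Bom(plain));
        System.out.println("merged: " + indexOfUtf8Bom(merged));
    }
}
```

Like the grep call, this looks for the BOM anywhere in the data, not only at the start, which would also catch a BOM carried in by one of the merged scripts.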

GoogleCodeExporter commented 9 years ago
Yes it does… In the end 2 chars are missing… I'm downgrading to 1.4.4 right 
now…

Original comment by m...@planet-punk.de on 9 Jul 2012 at 8:02

GoogleCodeExporter commented 9 years ago
You can extend the filter and set the Content-Length header yourself as a temporary workaround.

Have you noticed if the problem is consistent?

Original comment by alex.obj...@gmail.com on 9 Jul 2012 at 8:05

GoogleCodeExporter commented 9 years ago
Yes, the problem is consistent, and I think it relates to the simple example above, as the ResourceBundleProcessor also uses the raw string length in line 109:

{{{
response.setContentLength(cacheValue.getRawContent().length());
IOUtils.write(cacheValue.getRawContent(), os, configuration.getEncoding());
}}}

I don't know the content length in advance… I'd also go with the string 
length and this seems to be wrong…

Original comment by m...@planet-punk.de on 9 Jul 2012 at 8:12

GoogleCodeExporter commented 9 years ago
I'm still wondering why File.length() is different from String.length().

Original comment by alex.obj...@gmail.com on 9 Jul 2012 at 8:26

GoogleCodeExporter commented 9 years ago
Eureka! :)
A String's length() (the number of chars) is not the same as its getBytes().length (the number of encoded bytes).

Will fix it soon.

Original comment by alex.obj...@gmail.com on 9 Jul 2012 at 8:30

GoogleCodeExporter commented 9 years ago
This line (line 45 in the original file) causes the problems

Column 49 looks like a blank but isn't…

This must be from jQuery; the original code is this:

if ( rnotwhite.test( "\xA0" ) ) {
    trimLeft = /^[\s\xA0]+/;
    trimRight = /[\s\xA0]+$/;
}

jQuery 1.7.1, line 897

I mentioned the processors used above…

Original comment by m...@planet-punk.de on 9 Jul 2012 at 8:32

Attachments:
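
A character that "looks like a blank but isn't" can be located programmatically. The following is a rough sketch, not part of wro4j: `NonAsciiFinder` and `findNonAscii` are hypothetical names, and the sample line is a shortened stand-in for the compressed jQuery output with a literal U+00A0 in it.

```java
import java.util.ArrayList;
import java.util.List;

public class NonAsciiFinder {
    // Lists every char outside 7-bit ASCII as "index: U+XXXX" (index is 0-based).
    static List<String> findNonAscii(String line) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c > 0x7F) {
                hits.add(String.format("%d: U+%04X", i, (int) c));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumed sample: the regex literal with a real no-break space instead of the escape.
        String line = "trimLeft = /^[\\s\u00A0]+/;";
        System.out.println(findNonAscii(line)); // prints [16: U+00A0]
    }
}
```

Every hit above U+007F needs more than one byte in UTF-8, so each one widens the gap between the char count and the byte count.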

GoogleCodeExporter commented 9 years ago
I thought about something like this… One Unicode character can be more than one byte… and length() returns the number of chars, doesn't it?

Original comment by m...@planet-punk.de on 9 Jul 2012 at 8:34

GoogleCodeExporter commented 9 years ago
Probably. I fixed the issue in branch 1.4.x. Could you build the wro4j-core and 
confirm that the problem is fixed? 

Original comment by alex.obj...@gmail.com on 9 Jul 2012 at 8:37

GoogleCodeExporter commented 9 years ago

Original comment by alex.obj...@gmail.com on 9 Jul 2012 at 8:37

GoogleCodeExporter commented 9 years ago
And the explanation is:

There are characters that take 2 bytes when encoded as UTF-8 (example: ä):
"ä".length() == 1
"ä".getBytes("UTF-8").length == 2

Original comment by alex.obj...@gmail.com on 9 Jul 2012 at 8:51
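
The difference is easy to reproduce in isolation. This is a minimal sketch with hypothetical names (`CharsVersusBytes`, `utf8Length`); the characters are written as Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

public class CharsVersusBytes {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "a";
        String umlaut = "\u00E4"; // a with diaeresis
        String nbsp = "\u00A0";   // no-break space, as in the jQuery regex

        // Each string is exactly one char, but only the ASCII one is a single byte in UTF-8.
        System.out.println(ascii.length() + " char, " + utf8Length(ascii) + " byte(s)");
        System.out.println(umlaut.length() + " char, " + utf8Length(umlaut) + " byte(s)");
        System.out.println(nbsp.length() + " char, " + utf8Length(nbsp) + " byte(s)");
    }
}
```

A response whose Content-Length is taken from length() is therefore one byte short for every such character in the body.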

GoogleCodeExporter commented 9 years ago

Original comment by alex.obj...@gmail.com on 9 Jul 2012 at 8:52

GoogleCodeExporter commented 9 years ago
Error is fixed. Thanks!

Original comment by m...@planet-punk.de on 9 Jul 2012 at 9:46

GoogleCodeExporter commented 9 years ago
Your fix in commit 471a424a78 didn't fix this issue for me.

I had to change it to:

response.setContentLength(cacheValue.getRawContent().getBytes(configuration.getEncoding()).length);

to get the correct content length

Original comment by Juh...@gmail.com on 12 Jul 2012 at 10:33
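
One way to keep the header and the body in agreement is to encode the content once and use the same byte array for both. The sketch below only illustrates that idea and is not the actual wro4j code: `ContentLengthSketch` and `encodeBody` are hypothetical names, and a ByteArrayOutputStream stands in for the servlet response stream.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

public class ContentLengthSketch {
    // Encodes the content with the configured encoding; the array length is the Content-Length.
    static byte[] encodeBody(String content, String encoding) {
        return content.getBytes(Charset.forName(encoding));
    }

    public static void main(String[] args) {
        // Assumed sample content: six chars, one of them a no-break space.
        String content = "x=\"\u00A0\";";
        for (String encoding : new String[] {"UTF-8", "ISO-8859-1"}) {
            byte[] body = encodeBody(content, encoding);
            int contentLength = body.length; // the value to pass to response.setContentLength(...)

            ByteArrayOutputStream os = new ByteArrayOutputStream(); // stand-in for the response stream
            os.write(body, 0, body.length);

            System.out.println(encoding + ": chars=" + content.length()
                    + ", Content-Length=" + contentLength + ", written=" + os.size());
        }
    }
}
```

Because the length depends on the encoding, calling getBytes() without an argument only gives the right number when the JVM default charset happens to match the encoding used to write the response.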

GoogleCodeExporter commented 9 years ago
Yep, that fix is probably better… I'm using UTF-8 and that is also the default of my VM, but this is not always the case.

Original comment by m...@planet-punk.de on 12 Jul 2012 at 10:44

GoogleCodeExporter commented 9 years ago
Thanks for noticing. I'll update it.

Original comment by alex.obj...@gmail.com on 12 Jul 2012 at 10:57

GoogleCodeExporter commented 9 years ago
The fix was updated in 1.4.x.

Original comment by alex.obj...@gmail.com on 12 Jul 2012 at 11:07