tliron / prudence

An opinionated lightweight web framework built for scale
https://prudence.threecrickets.com
Apache License 2.0
13 stars 5 forks source link

all.min.js taking 16 seconds to load in stickstick #2

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
1) What's the problem?
http://localhost:8080/stickstick/script/all.min.js taking 16 seconds to load.  
(Using FF4 and Chrome too)

I just found Prudence for the very first time several hours ago and am reading 
as much as I can!  Let me say - this is very very very impressive!  The 
thoughts around caching - fantastic!  

I am thinking perhaps that all.min.js is getting gzipped every time ?  I will 
try and learn more.

2) How can someone else reproduce it?
Install the prudence-javascript-R984.zip and run console

3) What operating system and JVM version are you using?
Windows XP and
java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b07)
Java HotSpot(TM) Client VM (build 17.0-b17, mixed mode, sharing)

[Don't forget to set the release version and flavor below]
R984 and javascript flavor

Original issue reported on code.google.com by tgloc...@gmail.com on 28 Apr 2011 at 7:45

GoogleCodeExporter commented 9 years ago
Thank you, we will take a look soon!

Original comment by tal.liron on 29 Apr 2011 at 5:59

GoogleCodeExporter commented 9 years ago
Been reading the docs.  I see that all.min.js is something that can be created 
on the fly if a request comes for it and it doesn't exist.  Performed two 
tests, altered the stickstick html to use all the existing js and NOT 
all.min.js.  Performance seemed good.  Put the html back to original and put a 
small comment at top of all.min.js.  Again, it took 16 seconds and the comment 
was missing.  I suspect that all.min.js is therefore being created each time 
and the file exists check may be failing.  For what its worth - I have a 
directory on windows containing slashes.  I will continue to look at this.

Original comment by tgloc...@gmail.com on 2 May 2011 at 3:52

GoogleCodeExporter commented 9 years ago
I mean to say a directory containing whitespace in folder names.

Original comment by tgloc...@gmail.com on 2 May 2011 at 3:53

GoogleCodeExporter commented 9 years ago
Thank you very much for looking into it!

Your suspicion seems correct: it's likely a Windows-specific bug with 
JavaScriptUnifyMinifyFilter. We will look into this soon and hopefully release 
a fix.

Original comment by tal.liron on 2 May 2011 at 9:06

GoogleCodeExporter commented 9 years ago
I have a suspicion: are you running Prudence from a network drive? If so, would 
you mind testing from a non-network drive instead?

It seems that Windows' way of getting the lastModified date of a file involves 
copying the whole file over the network drive...

Original comment by tal.liron on 6 May 2011 at 2:02

GoogleCodeExporter commented 9 years ago
No its a local drive.  
The path is: C:\prudence\prudence\v11rev_984\applications\stickstick (no spaces)
I am not running with the daemon; only using 'run console'
When accessing http://localhost:8080/stickstick/script/all.min.js it is taking 
~12 seconds.   (I've defragged my drive since original post which is why it may 
be slightly faster)
All other requests are happening in less than 100ms
The exception being all.min.css which is taking between 150-230ms.  Still very 
fast.

The path is 
C:\prudence\prudence\v11rev_984\applications\stickstick\web\static\script - 
contains these entries:

04/27/2011  07:06 PM           290,532 all.min.js
04/27/2011  07:06 PM            72,174 jquery-1.4.2.min.js
04/27/2011  07:06 PM           213,511 jquery-ui-1.8rc3.custom.min.js
04/27/2011  07:06 PM             6,775 jquery.loading.min.js
04/27/2011  07:06 PM            17,308 json2.js
04/27/2011  07:06 PM             3,522 stickstick.js
04/27/2011  07:06 PM             4,230 stickynote.js
               7 File(s)        608,052 bytes

I will spend some time on this today.  

Original comment by tgloc...@gmail.com on 9 May 2011 at 1:13

GoogleCodeExporter commented 9 years ago
Looking at the routing.js in 
C:\prudence\prudence\v11rev_984\applications\stickstick

I think maybe the problem has to do with the last line ?

// Wrap the static web with unify-minify filters
var cssFilter = new CssUnifyMinifyFilter(null, new File(applicationBasePath + 
staticWebBasePath), minimumTimeBetweenValidityChecks)
var javaScriptFilter = new JavaScriptUnifyMinifyFilter(null, new 
File(applicationBasePath + staticWebBasePath), minimumTimeBetweenValidityChecks)
router.filterBase(staticWebBaseURL, cssFilter, staticWeb)
router.filterBase(staticWebBaseURL, javaScriptFilter, cssFilter)

I think that should be staticWeb ? or staticWeb/script ? (not cssFilter I would 
guess)

I will try changing it.

Original comment by tgloc...@gmail.com on 9 May 2011 at 1:31

GoogleCodeExporter commented 9 years ago
So much for that theory.  Should have read the docs first.  I will keep looking.

Original comment by tgloc...@gmail.com on 9 May 2011 at 1:43

GoogleCodeExporter commented 9 years ago
I have not been able to replicate this here (on Windows XP), but have tried 
reading up on what *could* be an issue in Windows, and it's lastModified that 
makes me suspicious.

Here's another thing to try: remove all except *one* JavaScript file in that 
directory. Is the unify-minify any faster? If true, we will no that it 
definitely has something to do with accessing these files.

Original comment by tal.liron on 9 May 2011 at 6:42

GoogleCodeExporter commented 9 years ago
i did something like that - i renamed jquery-ui-1.8rc3.custom.min so it had a 
different extension and hoped it would not get included.

I was thinking perhaps that that single file, might cause the minifier filter 
to behave differently because the file is already minified.

jquery-ui-1.8rc3.custom.min.js is also large at ~200k.

The time is now consistently around 5 seconds.  

So while the time is much smaller, all.min.js is still being regenerated on 
each request and the problem for me remains.

Original comment by tgloc...@gmail.com on 10 May 2011 at 1:05

GoogleCodeExporter commented 9 years ago
Also, i grabbed the latest 996 release and completely reinstalled in a new 
directory and ran stickstick untouched from the new bin directory.  

The issue for me still remains.

Original comment by tgloc...@gmail.com on 10 May 2011 at 1:33

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Darn, still haven't be able to reproduce here.

Are you using Windows XP with NTFS or with FAT32?

Original comment by tal.liron on 10 May 2011 at 2:03

GoogleCodeExporter commented 9 years ago
XP with NTFS.

Original comment by tgloc...@gmail.com on 10 May 2011 at 2:44

GoogleCodeExporter commented 9 years ago
XP with NTFS.  

The js files for stickstick are not compressed.  

But Archive and Index attributes are selected and the stickstick js files are 
'ready for archiving' and Indexing Service is able to index the file.  The last 
modified date of jquery-ui-1.8rc3.custom.min.js. is Friday, May 06, 2011, 
3:04:06 PM.

Original comment by tgloc...@gmail.com on 10 May 2011 at 2:54

GoogleCodeExporter commented 9 years ago
I upped the debugging in the log conf and started the console new so a new 
web.log and prudence.log would be created.  Attached them.  Nothing i see jumps 
out however.

Original comment by tgloc...@gmail.com on 10 May 2011 at 3:11

Attachments:

GoogleCodeExporter commented 9 years ago
Oh and I hit http://localhost:8080/stickstick/ once.

Original comment by tgloc...@gmail.com on 10 May 2011 at 3:12

GoogleCodeExporter commented 9 years ago
I went a step further to see if content-length in headers matched what is on my 
drive.  

Everything appears correct.

I don't know how to interpret the log, because unlike images, the js and css 
are chunked and content-length is not set.

/stickstick/style/all.min.css = 28,458 bytes
/stickstick/script/all.min.js = 290,532 bytes

2011-05-09  23:06:22    127.0.0.1   -   127.0.0.1   8080    GET /stickstick/style/all.min.css
    -   200 -   0   453 http://localhost:8080   Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) 
Gecko/20100101 Firefox/4.0.1    http://localhost:8080/stickstick/

2011-05-09  23:06:39    127.0.0.1   -   127.0.0.1   8080    GET /stickstick/script/all.min.js
    -   200 -   0   16860   http://localhost:8080   Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) 
Gecko/20100101 Firefox/4.0.1    http://localhost:8080/stickstick/

but I went a step further and ran fiddler while hitting stickstick from firefox 
4.

I can now see the content-length.

all.min.css on my drive matches fiddler at Content-Length: 28458
all.min.js on my drive also matches fiddler at Content-Length: 290532 

Original comment by tgloc...@gmail.com on 10 May 2011 at 3:40

GoogleCodeExporter commented 9 years ago
Thank you! This is all very helpful. Tomorrow we will try many experiments with 
our XP machine and try to reproduce this annoying bug...

Original comment by tal.liron on 10 May 2011 at 5:12

GoogleCodeExporter commented 9 years ago
I bumped the logging to trace what I could.  

Now seeing some errors.  

If you want to supply a logging.conf to capture everything I would be happy to 
use it.

Attached are my web.log, prudence.log and logging.conf

Original comment by tgloc...@gmail.com on 10 May 2011 at 3:15

Attachments:

GoogleCodeExporter commented 9 years ago
Quick question: is access to all.min.css slow on this machine, too?

You can simulate more files easily by just throwing in lots of css files into 
the directory. :)

Original comment by tal.liron on 11 May 2011 at 2:21

GoogleCodeExporter commented 9 years ago
all.min.css has been a bit higher this entire time (around 250ms) and i had 
been thinking to do such a test as well.

So yes, I added many css files and the time rises and stays there no matter how 
often i hit the page or open the page in a new tab.

The issue is likely in a baseclass or lower and not in the jsminifier or 
cssminifier code.

I was looking at source yesterday to see what debugging messages might be 
echoed.

Wishing my java skills were better but i've only ever written a handful of 
servlet code and a couple filters.

Original comment by tgloc...@gmail.com on 11 May 2011 at 12:12

GoogleCodeExporter commented 9 years ago
I have found a fix!

I noticed the errors in my attached prudence log coming from 
com.threecrickets.prudence.internal.attributes.ContextualAttributes.validateDocu
mentName().

That code is looking for a trailing slash and not finding it, throws the error.

 if( isTrailingSlashRequired() )
                        if( ( documentName != null ) && ( documentName.length() != 0 ) && !documentName.endsWith( "/" ) )
                                throw new ResourceException( Status.CLIENT_ERROR_NOT_FOUND );

I altered the index.html for stickstick to append a trailing slash to both the 
all.min.css and all.min.js files:

    <link rel="stylesheet" media="screen" type="text/css" href="style/all.min.css/"/>
    <script type="text/javascript" src="script/all.min.js/"></script>

The all.min files now return in ~150 to ~170 milliseconds.

So in effect, the all.min files were never found and the code decided to 
recreate them each time.

I would like your input on whether the trailing slash should be necessary for 
css or js files or would a config setup be preferable.  

I am still reading about REST and very much a newbie.

Original comment by tgloc...@gmail.com on 11 May 2011 at 1:40

GoogleCodeExporter commented 9 years ago
As I think about it, I suppose there is other code more specific to the all.min 
files that has some issue surrounding the trailing slash.  I say this because 
for other applications that are not using the all.min functionality, there is 
no problem and slash is not required.

At any rate, i will let you think on this.  If there is anything more i can do 
to help please don't hesitate.

Original comment by tgloc...@gmail.com on 11 May 2011 at 1:52

GoogleCodeExporter commented 9 years ago
Excellent piece of debugging, and this tells me exactly where the problem is.

It's a bit deeper in the Prudence code to get to this, but I will find it and 
fix it. Good catch! This would likely effect other similar kinds of filters.

Original comment by tal.liron on 11 May 2011 at 5:38

GoogleCodeExporter commented 9 years ago
OK. I was being stupid. (Didn't have my morning coffee yet!)

The bottom line is that this is not a Windows-specific bug, it's only 
exacerbated in Windows. The code for UnifyMinifyFilter had a regression: the 
file was being deleted *before* checking for the lastModified date, which of 
course would then return -1, causing the unification/minifcation process to 
happen *every time*. This affects *all* Prudence users, no matter the OS, but 
somehow in Windows it is especially slow and noticeable.

I put a fix on Subversion, and tonight will release an RC of Prudence (which 
will include fixes to a few other bugs).

As to the exception you were seeing in the logs: it was only because you 
enabled DEBUG level logging. The exception is actually correct, and emitted by 
the dynamic web resource, which indeed requires trailing slashes and could not 
handle the URI. But that's OK: the fallback router goes to the static web 
resource next, which is wrapped in this filter, which does handle the request. 
(Except that until now it handled the request poorly...)

Original comment by tal.liron on 11 May 2011 at 6:02

GoogleCodeExporter commented 9 years ago
RC10 was released today with the bug fix. Would love to know if it works!

Original comment by tal.liron on 12 May 2011 at 5:01

GoogleCodeExporter commented 9 years ago
It works perfectly!  

Initial hit takes a while, (as expected) but then all is sweet.

I was looking at the code you fixed, it looks like (clearly) that the static 
directory must be swept to get the most recent time of appropriate css or js 
files and that date is compared to the all.min file date.  If necessary, 
regenerate the min file.  

Which makes logical sense.  But I have some questions about what happens on 
each request.  

I see the Expires header is future dated 10 mins therefore for a given client, 
the all.min comes from local cache during that time.  But what about another 
client ?  Does that request cause the code to sweep the directory ?  Or is the 
all.min file cached somewhere server side in prudence and the 10 minutes is a 
reflection of the last time the all.min was created ?  Finally regarding the 10 
minute duration - where is that configured ? 

At any rate, I am going to try really writing some code using Prudence and see 
where I end up.  

For the record, I deal with global applications for a fortune 50 company and 
far too often a caching strategy is an afterthought instead of a design 
consideration.  Static resources which cause 304s to be returned impact scaling 
when the server is across the globe and there are 10 network hops just to 
respond to an "If-modified" with 304 "No the file has not changed"..  

A framework that enables caching in memory on server, with being able to set 
Expires or MaxAge is all good.  Not to mention that the all.min functionality 
minimizes chatter and additionally obfuscates the js too.  Security never likes 
to see comments in script or have it be readable.

I think this framework is awesome and I very much appreciate the effort being 
put in.

Original comment by tgloc...@gmail.com on 12 May 2011 at 3:01

GoogleCodeExporter commented 9 years ago
Your story is very familiar: for a very long time, people have treated HTTP as 
ugly baggage. Ugly or not, it's the infrastructure, and there's a LOT to gain 
from working with it. That's why Fielding's dissertation (introducing REST) was 
so important.

1) all.js.min's source files are checked only ONCE per JVM per the cache 
duration given. It doesn't matter how many requests come in -- only when the 
alloted time is passed will it check the files again. You'll see in 
Stickstick's settings.js that it is set to be the application-wide 
minimumTimeBetweenValidityChecks, which is 5 seconds by default.

2) The .min file is generated on disk and then served just like any static web 
server. The Prudence Manual has a pretty thorough section on how to configure 
it, including how to utilize client-side caching so you can avoid HTTP entirely 
(even conditional HTTP), helping exactly the situation you mention. Of course, 
it's probably best to rely on a world-wide CDN for static resources, but that's 
not an option for every app.

The 10 minutes expires is curious, and I will look into it. It might be a 
Restlet default. But, anyway, let's consider this bug closed. :) Thanks again 
for your help!

Original comment by tal.liron on 13 May 2011 at 3:09