apache / incubator-pagespeed-mod

Apache module for rewriting web pages to reduce latency and bandwidth.
http://modpagespeed.com
Apache License 2.0

Apache stuck indefinitely waiting for PSOL #1048

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 9 years ago
I just installed mod_pagespeed on CentOS 7 and got a flood of httpd error-log 
entries within one minute. An example line:
<code>
[Tue Feb 10 11:05:14.311755 2015] [pagespeed:warn] [pid 21132:tid 139634310850304] [mod_pagespeed 1.9.32.3-4448 @21132] Waiting for completion of URL http://exampledomain.com/example-slug/ for 45.001 sec
</code>

ALL requests produced this warning, including image requests.

My server hardware specs:
* Intel(R) Xeon(R) CPU E3-1246 v3 @ 3.50GHz, 8 cores
* 32 GB DDR3 RAM
* 2 x 2 TB SATA 6 Gb/s 7200 rpm HDD (Software-RAID 1) Class Enterprise

Software specs:
Operating system: CentOS Linux 7.0.1406
Kernel: Linux 3.10.0-123.20.1.el7.x86_64 on x86_64

Server Version: Apache/2.4.6 (CentOS) OpenSSL/1.0.1e-fips mod_fcgid/2.3.9 
PHP/5.6.5 mod_perl/2.0.9dev Perl/v5.16.3
Server MPM: event

What version of the product are you using (please check X-Mod-Pagespeed
header)?
mod-pagespeed-stable-1.9.32.3-4448.x86_64

URL of broken page:
I removed the module after 1 minute of terror. If a Google developer wants more 
information about the server, mail me an IP address and I will give it 
permission to view the mod_info output.

Original issue reported on code.google.com by unsalkor...@gmail.com on 10 Feb 2015 at 9:35

GoogleCodeExporter commented 9 years ago
I have the same problem with the same version of pagespeed.

I'm on Debian Wheezy.

Original comment by hug...@betabrand.com on 19 Feb 2015 at 6:40

GoogleCodeExporter commented 9 years ago
This looks like a bug to me in:

void ApacheFetch::Wait() {
  MessageHandler* handler = server_context_->message_handler();
  Timer* timer = server_context_->timer();
  int64 start_ms = timer->NowMs();
  {
    ScopedMutex lock(mutex_.get());
    while (!done_) {
      condvar_->TimedWait(blocking_fetch_timeout_ms_);
      if (!done_) {
        int64 elapsed_ms = timer->NowMs() - start_ms;
        handler->Message(
            kWarning, "Waiting for completion of URL %s for %g sec",
            mapped_url_.c_str(), elapsed_ms / 1000.0);
      }
    }
  }
}

I think what we're supposed to be doing is blocking for a maximum of 
blocking_fetch_timeout_ms_, but instead we block until Done() is called and 
just log an error every blocking_fetch_timeout_ms_.
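
For illustration only, a wait that gives up after a fixed deadline (rather than looping and logging as the code above does) might look like the sketch below. It uses the C++ standard library instead of PageSpeed's own mutex and condvar classes; `FetchWaiter`, `Done()` and `WaitWithDeadline()` are hypothetical names, not mod_pagespeed APIs.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for ApacheFetch; all names here are illustrative.
class FetchWaiter {
 public:
  // Marks the fetch as finished and wakes up the waiting thread.
  void Done() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    condvar_.notify_one();
  }

  // Waits at most `timeout` in total. Returns true if Done() was called in
  // time, false if the deadline passed first.
  bool WaitWithDeadline(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    return condvar_.wait_for(lock, timeout, [this] { return done_; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable condvar_;
  bool done_ = false;
};
```

A caller would treat a `false` return as "the fetch is stuck" and fail the request instead of holding the worker thread. Whether a hard deadline is actually desirable in this code path is a separate design question.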

Original comment by jefftk@google.com on 19 Feb 2015 at 8:36

GoogleCodeExporter commented 9 years ago
In the meantime, could you try turning off IPRO?

    ModPagespeedInPlaceResourceOptimization off

Original comment by jefftk@google.com on 19 Feb 2015 at 8:42

GoogleCodeExporter commented 9 years ago
I'll try that and report back.

Thanks.

Original comment by hug...@betabrand.com on 19 Feb 2015 at 8:43

GoogleCodeExporter commented 9 years ago
Turns out the logging behavior is intentional:

  // Blocks indefinitely waiting for the proxy fetch to complete.
  // Every 'blocking_fetch_timeout_ms', log a message so that if
  // we get stuck there's noise in the logs, but we don't expect this
  // to happen because underlying fetch/cache timeouts should fire.
  //
  // Note that enforcing a timeout in this function makes debugging
  // difficult.
  void Wait();

Still not clear why the fetch/cache timeouts aren't firing.

Original comment by jefftk@google.com on 19 Feb 2015 at 8:44

GoogleCodeExporter commented 9 years ago
Have you set any of the pagespeed timeouts to strange values, like -1?  Are 
there any other ways your PageSpeed configuration or server setup is unusual?  
Or is this pretty much a stock install?

Original comment by jefftk@google.com on 19 Feb 2015 at 8:45

GoogleCodeExporter commented 9 years ago
The only place referring to timeouts in my config are:

ModPagespeedMemcachedTimeoutUs           50000
ModPagespeedRewriteDeadlinePerFlushMs   100

## Not a timeout, but maybe relevant
ModPagespeedImageMaxRewritesAtOnce      0

I also don't run into this problem for files that are declared with 
ModPagespeedLoadFromFileMatch

Original comment by hug...@betabrand.com on 19 Feb 2015 at 8:49

GoogleCodeExporter commented 9 years ago
Reporting back: disabling IPRO did not fix the issue.

Original comment by hug...@betabrand.com on 19 Feb 2015 at 10:57

GoogleCodeExporter commented 9 years ago
I also noticed that:

1- contrary to what I said earlier, the bug also occurs for resources declared 
with ModPagespeedLoadFromFileMatch

2- It seems to happen with .html files (which looks odd to me, but since I 
don't know the internals of MPS, it may make sense to you)

[Thu Feb 19 14:57:18 2015] [warn] [mod_pagespeed 1.9.32.3-4448 @29090] Waiting for completion of URL http://www.betabrand.com/mens-gray-chambray-drawstring-pants.html for 6730.16 sec
[Thu Feb 19 14:57:18 2015] [warn] [mod_pagespeed 1.9.32.3-4448 @29090] Waiting for completion of URL http://www.betabrand.com/womens/more/navy-blue-cornucopia-water-resistant-daypack.html for 6730.14 sec
[Thu Feb 19 14:57:18 2015] [warn] [mod_pagespeed 1.9.32.3-4448 @29090] Waiting for completion of URL http://www.betabrand.com/mens-removable-hood-gekko-vest.html for 6640.23 sec

Original comment by hug...@betabrand.com on 19 Feb 2015 at 11:01

GoogleCodeExporter commented 9 years ago
Hi there,

I'd like to mention that this bug is fairly important as it is triggered all 
the time and its effect is apache2 hogging the CPU. The only fix is killing the 
apache2 process.

Also, is it possible that using ModPagespeedBlockingRewriteKey could be a cause 
of this bug?

I'm setting the X-PSA-Blocking-Rewrite header on all my requests.

Original comment by hug...@betabrand.com on 20 Feb 2015 at 8:31

GoogleCodeExporter commented 9 years ago
I have the same problem with the same version of pagespeed.

I'm on CENTOS 6.6 x86_64 standard, Apache 2.4.12.

[Sat Feb 28 15:21:10.939879 2015] [pagespeed:warn] [pid 49502:tid 140315103012608] [mod_pagespeed 1.9.32.3-4448 @49502] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2015/01/Mozart-La-Para-Ft-Anthony-Santos-Pa-Gozar-Remix.jpg for 10 sec
[Sat Feb 28 15:21:11.526066 2015] [pagespeed:warn] [pid 49687:tid 140315144972032] [mod_pagespeed 1.9.32.3-4448 @49687] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2014/04/Paramba-ps-4.jpg for 40.002 sec
[Sat Feb 28 15:21:15.940041 2015] [pagespeed:warn] [pid 49502:tid 140315103012608] [mod_pagespeed 1.9.32.3-4448 @49502] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2015/01/Mozart-La-Para-Ft-Anthony-Santos-Pa-Gozar-Remix.jpg for 15.001 sec
[Sat Feb 28 15:21:16.526242 2015] [pagespeed:warn] [pid 49687:tid 140315144972032] [mod_pagespeed 1.9.32.3-4448 @49687] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2014/04/Paramba-ps-4.jpg for 45.002 sec
[Sat Feb 28 15:21:21.526388 2015] [pagespeed:warn] [pid 49687:tid 140315144972032] [mod_pagespeed 1.9.32.3-4448 @49687] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2014/04/Paramba-ps-4.jpg for 50.002 sec
[Sat Feb 28 15:21:21.744521 2015] [pagespeed:warn] [pid 49819:tid 140315228890880] [mod_pagespeed 1.9.32.3-4448 @49819] Waiting for completion of URL http://www.corourbano.com/w/wp-content/uploads/2013/09/680px_dedd825780164515bdb1b16841c61be9.png for 5 sec
[Sat Feb 28 15:21:26.526512 2015] [pagespeed:warn] [pid 49687:tid 140315144972032] [mod_pagespeed 1.9.32.3-4448 @49687] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2014/04/Paramba-ps-4.jpg for 55.002 sec
[Sat Feb 28 15:21:31.526660 2015] [pagespeed:warn] [pid 49687:tid 140315144972032] [mod_pagespeed 1.9.32.3-4448 @49687] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2014/04/Paramba-ps-4.jpg for 60.002 sec
[Sat Feb 28 15:21:36.526792 2015] [pagespeed:warn] [pid 49687:tid 140315144972032] [mod_pagespeed 1.9.32.3-4448 @49687] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2014/04/Paramba-ps-4.jpg for 65.002 sec
[Sat Feb 28 15:21:40.631322 2015] [core:error] [pid 49895:tid 140315355662080] [client 198.7.62.69:57760] Script timed out before returning headers: wp-cron.php
[Sat Feb 28 15:22:26.233396 2015] [pagespeed:warn] [pid 49592:tid 140315228890880] [mod_pagespeed 1.9.32.3-4448 @49592] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2014/11/j-balvin-latin-grammy-slide-newsjpg.jpg for 5 sec
[Sat Feb 28 15:22:44.002184 2015] [pagespeed:warn] [pid 49819:tid 140315176441600] [mod_pagespeed 1.9.32.3-4448 @49819] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2014/07/tu-quiere-chapiame-chimbala-ft-la-material.jpg for 5 sec
[Sat Feb 28 15:22:49.002314 2015] [pagespeed:warn] [pid 49819:tid 140315176441600] [mod_pagespeed 1.9.32.3-4448 @49819] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2014/07/tu-quiere-chapiame-chimbala-ft-la-material.jpg for 10 sec
[Sat Feb 28 15:22:54.002439 2015] [pagespeed:warn] [pid 49819:tid 140315176441600] [mod_pagespeed 1.9.32.3-4448 @49819] Waiting for completion of URL http://corourbano.com/w/wp-content/uploads/2014/07/tu-quiere-chapiame-chimbala-ft-la-material.jpg for 15 sec
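
When a log like this gets long, it can help to reduce it to the longest reported wait per URL. The sketch below is one hedged way to do that. It assumes each log entry sits on a single line, as in the real error log, and that the message text is exactly "Waiting for completion of URL ... for N sec" as shown above. `LongestWaitPerUrl` is a hypothetical helper, not part of mod_pagespeed.

```cpp
#include <algorithm>
#include <istream>
#include <map>
#include <regex>
#include <string>

// Scans an error log and returns, for every URL that appears in a
// "Waiting for completion of URL" warning, the largest wait in seconds.
// Lines that do not match (for example core:error entries) are skipped.
std::map<std::string, double> LongestWaitPerUrl(std::istream& log) {
  static const std::regex kPattern(
      R"(Waiting for completion of URL (\S+) for ([0-9]+(?:\.[0-9]+)?) sec)");
  std::map<std::string, double> longest;
  std::string line;
  std::smatch match;
  while (std::getline(log, line)) {
    if (!std::regex_search(line, match, kPattern)) continue;
    const std::string url = match[1].str();
    const double seconds = std::stod(match[2].str());
    double& entry = longest[url];  // Value-initialized to 0.0 on first use.
    entry = std::max(entry, seconds);
  }
  return longest;
}
```

Opening the error log with `std::ifstream` and passing it to this function gives one entry per stuck URL, which makes it easier to see whether the waits concentrate on a few resources.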

Any suggestions on this?

Original comment by CoroUrba...@gmail.com on 28 Feb 2015 at 7:34

GoogleCodeExporter commented 9 years ago
I also have this problem. Can anyone solve it?

Original comment by huynhvut...@gmail.com on 7 Mar 2015 at 4:51

GoogleCodeExporter commented 9 years ago
I suspect (but don't know for sure) that use of ModPagespeedBlockingRewriteKey 
might not interact well with in-place fetches.  It would be better to set a 
longer rewrite deadline.

Nevertheless I think this is a real bug and we should figure out who should 
tackle it.

Original comment by jmara...@google.com on 8 Mar 2015 at 5:14

GoogleCodeExporter commented 9 years ago
Is everybody on this thread with the bug using ModPagespeedBlockingRewriteKey 
though?

Original comment by hug...@betabrand.com on 8 Mar 2015 at 8:36

GoogleCodeExporter commented 9 years ago
Looking at the code, the class in question (ApacheFetch) is used in three 
different scenarios:

1. admin-page handling (which I think is not at issue here based on the log 
messages)
2. ModPagespeedInPlaceResourceOptimization (on by default starting in 1.9)
3. ModPagespeedMapProxyDomain

It would be helpful if each person experiencing this bug let us know if they 
are using any of the above features, specifying with or without a 
BlockingRewriteKey.

Original comment by jmara...@google.com on 9 Mar 2015 at 1:37

GoogleCodeExporter commented 9 years ago
Has anyone found the problem? I'm hitting this too and trying to track it down.

Original comment by lambert....@gmail.com on 11 Mar 2015 at 11:34

GoogleCodeExporter commented 9 years ago
To help find the source of this bug, please post the information requested in 
comment #15: 
https://code.google.com/p/modpagespeed/issues/detail?id=1048#c15

Original comment by hug...@betabrand.com on 11 Mar 2015 at 11:37

GoogleCodeExporter commented 9 years ago
I have the same problem with an Ubuntu-server 13.10

[Tue Mar 31 14:41:20.129694 2015] [pagespeed:warn] [pid 1166] [mod_pagespeed 1.9.32.3-4448 @1166] Waiting for completion of URL http://cometeelmundo.net/sites/default/files/styles/originalwatermark/public/field/image/Interior%20catedral%20de%20San%20Basilio.jpg?itok=TKiOM8ek for 345.023 sec
[Tue Mar 31 14:41:25.129988 2015] [pagespeed:warn] [pid 1166] [mod_pagespeed 1.9.32.3-4448 @1166] Waiting for completion of URL http://cometeelmundo.net/sites/default/files/styles/originalwatermark/public/field/image/Interior%20catedral%20de%20San%20Basilio.jpg?itok=TKiOM8ek for 350.023 sec
[Tue Mar 31 14:41:30.130250 2015] [pagespeed:warn] [pid 1166] [mod_pagespeed 1.9.32.3-4448 @1166] Waiting for completion of URL http://cometeelmundo.net/sites/default/files/styles/originalwatermark/public/field/image/Interior%20catedral%20de%20San%20Basilio.jpg?itok=TKiOM8ek for 355.024 sec
[Tue Mar 31 14:41:35.130507 2015] [pagespeed:warn] [pid 1166] [mod_pagespeed 1.9.32.3-4448 @1166] Waiting for completion of URL http://cometeelmundo.net/sites/default/files/styles/originalwatermark/public/field/image/Interior%20catedral%20de%20San%20Basilio.jpg?itok=TKiOM8ek for 360.024 sec
[Tue Mar 31 14:41:40.130767 2015] [pagespeed:warn] [pid 1166] [mod_pagespeed 1.9.32.3-4448 @1166] Waiting for completion of URL http://cometeelmundo.net/sites/default/files/styles/originalwatermark/public/field/image/Interior%20catedral%20de%20San%20Basilio.jpg?itok=TKiOM8ek for 365.024 sec
[Tue Mar 31 14:41:45.131062 2015] [pagespeed:warn] [pid 1166] [mod_pagespeed 1.9.32.3-4448 @1166] Waiting for completion of URL http://cometeelmundo.net/sites/default/files/styles/originalwatermark/public/field/image/Interior%20catedral%20de%20San%20Basilio.jpg?itok=TKiOM8ek for 370.025 sec
[Tue Mar 31 14:41:50.131376 2015] [pagespeed:warn] [pid 1166] [mod_pagespeed 1.9.32.3-4448 @1166] Waiting for completion of URL http://cometeelmundo.net/sites/default/files/styles/originalwatermark/public/field/image/Interior%20catedral%20de%20San%20Basilio.jpg?itok=TKiOM8ek for 375.025 sec
[Tue Mar 31 14:41:55.131638 2015] [pagespeed:warn] [pid 1166] [mod_pagespeed 1.9.32.3-4448 @1166] Waiting for completion of URL http://cometeelmundo.net/sites/default/files/styles/originalwatermark/public/field/image/Interior%20catedral%20de%20San%20Basilio.jpg?itok=TKiOM8ek for 380.025 sec
[Tue Mar 31 14:42:00.131922 2015] [pagespeed:warn] [pid 1166] [mod_pagespeed 1.9.32.3-4448 @1166] Waiting for completion of URL http://cometeelmundo.net/sites/default/files/styles/originalwatermark/public/field/image/Interior%20catedral%20de%20San%20Basilio.jpg?itok=TKiOM8ek for 385.025 sec
[Tue Mar 31 14:42:05.132191 2015] [pagespeed:warn] [pid 1166] [mod_pagespeed 1.9.32.3-4448 @1166] Waiting for completion of URL http://cometeelmundo.net/sites/default/files/styles/originalwatermark/public/field/image/Interior%20catedral%20de%20San%20Basilio.jpg?itok=TKiOM8ek for 390.026 sec

Original comment by foreve...@gmail.com on 31 Mar 2015 at 1:22

GoogleCodeExporter commented 9 years ago
It would be helpful if each person experiencing this bug let us know if they 
are using either of these features:

 -  ModPagespeedInPlaceResourceOptimization (on by default starting in 1.9)
 -  ModPagespeedMapProxyDomain

Please also let us know if you also use a BlockingRewriteKey.

Original comment by jmara...@google.com on 31 Mar 2015 at 1:30

GoogleCodeExporter commented 9 years ago
If I understand correctly ModPagespeedInPlaceResourceOptimization is enabled by 
default on 1.9 without specifying it on the config file, so in this case I have 
it enabled. 

I'm using neither ModPagespeedMapProxyDomain nor a BlockingRewriteKey.

If you need more information feel free to ask for it.

Original comment by foreve...@gmail.com on 31 Mar 2015 at 1:34

GoogleCodeExporter commented 9 years ago
I have this problem as well. I regularly need to restart the Apache2 service 
because it uses 100% CPU. It happens almost twice a day. It's a problem because 
the rest of the time the benefits of mod_pagespeed are real and great.

Original comment by vedio...@gmail.com on 31 Mar 2015 at 7:20

GoogleCodeExporter commented 9 years ago
We're still trying to figure out what's causing this, but a workaround for now 
is to disable in-place resource optimization:

    ModPagespeedInPlaceResourceOptimization off

Original comment by jefftk@google.com on 31 Mar 2015 at 7:49

GoogleCodeExporter commented 9 years ago
Setting ModPagespeedInPlaceResourceOptimization off didn't stop my CPU usage from spiking today. I had to restart the service again.

A few examples of what I have in my apache2 error log (no idea whether they 
are related): 
=> tail -F /var/log/apache2/error.log 

[Wed Apr 01 13:04:40.539741 2015] [pagespeed:error] [pid 6440] [mod_pagespeed 1.9.32.3-4448 @6440] http://www.xxxxxx.ch/image/ico_delete_small.gif (connecting to:%22):0: Error status=670002 (Name or service not known) serf_connection_create2
[Wed Apr 01 13:04:40.539799 2015] [pagespeed:warn] [pid 6440] [mod_pagespeed 1.9.32.3-4448 @6440] Fetch failed to start: http://www.xxxxxx.ch/image/ico_delete_small.gif (connecting to:%22)
[Wed Apr 01 13:27:42.192734 2015] [pagespeed:warn] [pid 8212] [mod_pagespeed  @8212] [0401/132742:WARNING:queued_worker_pool.cc(432)] Canceling 68 functions on sequence Shutdown

Original comment by vedio...@gmail.com on 1 Apr 2015 at 4:36

GoogleCodeExporter commented 9 years ago
vediovis: do you have MapProxyDomain in your config?  Or BlockingRewriteKey?  
Did you flush your cache after turning off InPlaceResourceOptimization?

foreveryo: did you try flushing your cache and turning off 
InPlaceResourceOptimization?

Original comment by jmara...@google.com on 1 Apr 2015 at 5:21

GoogleCodeExporter commented 9 years ago
jmara: I do not use MapProxyDomain or BlockingRewriteKey
How can I flush the cache? 

CPU usage is high all the time now. It seems that the longer the module has 
been installed, the more frequent the high CPU usage becomes.

Original comment by vedio...@gmail.com on 2 Apr 2015 at 7:45

GoogleCodeExporter commented 9 years ago
I have set ModPagespeedInPlaceResourceOptimization off and cleared the cache (I 
am using memcached). I will come back in a few hours to report how it is 
working.

Original comment by foreve...@gmail.com on 2 Apr 2015 at 8:20

GoogleCodeExporter commented 9 years ago
To clear cache, by default you do:
  sudo touch $FILE_CACHE_PATH/cache.flush

Details are here: 
https://developers.google.com/speed/pagespeed/module/system#flush_cache

Note that it is not sufficient to restart memcached, because there is also an 
in-memory L1 cache that will not be flushed when you do that.
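
To illustrate why restarting only the backing store is not enough, here is a toy two-level cache. It is a conceptual sketch, not PageSpeed's cache implementation: `TwoLevelCache`, `RestartL2()` and `FlushAll()` are hypothetical names, and the way cache.flush actually invalidates entries is not modeled.

```cpp
#include <string>
#include <unordered_map>

// Toy model: a small in-process L1 in front of an external L2
// (think memcached). Lookups try L1 first, then L2.
class TwoLevelCache {
 public:
  void Put(const std::string& key, const std::string& value) {
    l1_[key] = value;
    l2_[key] = value;
  }

  // Returns true and fills *value on a hit in either level.
  bool Get(const std::string& key, std::string* value) {
    auto hit = l1_.find(key);
    if (hit != l1_.end()) {
      *value = hit->second;
      return true;
    }
    hit = l2_.find(key);
    if (hit != l2_.end()) {
      l1_[key] = hit->second;  // Promote to L1 for later lookups.
      *value = hit->second;
      return true;
    }
    return false;
  }

  // Models restarting the external store: only L2 loses its contents.
  void RestartL2() { l2_.clear(); }

  // Models a full flush that reaches both levels.
  void FlushAll() {
    l1_.clear();
    l2_.clear();
  }

 private:
  std::unordered_map<std::string, std::string> l1_;
  std::unordered_map<std::string, std::string> l2_;
};
```

After `RestartL2()`, a key that was already in L1 is still served from memory, so stale entries survive until something clears both levels.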

Original comment by jmara...@google.com on 2 Apr 2015 at 12:38

GoogleCodeExporter commented 9 years ago
jmara : I use OPcache from Zend as well. Do you need my configuration file ?

Original comment by vedio...@gmail.com on 2 Apr 2015 at 2:00

GoogleCodeExporter commented 9 years ago
I was mainly concerned about the mod_pagespeed cache.  Other caches upstream or 
downstream of mod_pagespeed should not matter.

Original comment by jmara...@google.com on 2 Apr 2015 at 2:02

GoogleCodeExporter commented 9 years ago
Any updates on this issue? We are seeing these warnings in the logs with a 
configuration that differs only slightly from the default.

Those are the changes we have made (comparing the original and our files):

    ModPagespeedEnableFilters collapse_whitespace,elide_attributes,trim_urls
    ModPagespeedEnableFilters remove_quotes,remove_comments

    ModPagespeedDomain **********9.cloudfront.net
    ModPagespeedDomain **********h.cloudfront.net
    ModPagespeedDomain **********k.cloudfront.net

The errors we are getting:
[Thu Apr 02 08:04:51 2015] [warn] [mod_pagespeed 1.9.32.3-4448 @13247] Waiting for completion of URL http://www.mydomain.com/mydir/images/068.JPG for 75.002 sec
[Thu Apr 02 08:04:51 2015] [warn] [mod_pagespeed 1.9.32.3-4448 @6814] Waiting for completion of URL http://www.mydomain.com/mydir/images/a1.jpg for 75.002 sec
[Thu Apr 02 08:04:51 2015] [warn] [mod_pagespeed 1.9.32.3-4448 @4789] Waiting for completion of URL http://www.mydomain.com/mydir/images/062.JPG for 75.002 sec

If you need anything else, please let me know.

Original comment by m.moko...@gmail.com on 2 Apr 2015 at 3:52

GoogleCodeExporter commented 9 years ago
BTW, forgot to mention that if you request the files (images in our case) 
directly through a browser (chrome) they are downloaded in less than a second. 
Not sure why it takes page speed so long to fetch them.

Original comment by m.moko...@gmail.com on 2 Apr 2015 at 3:55

Cyriltra commented 9 years ago

Hi, same issue with Debian 7.8 64-bit / Apache 2.2.22. I had to disable the module to prevent crashes (I got two in less than a month).

crowell commented 9 years ago

@Cyriltra did you have IPRO enabled? it is enabled by default on 1.9.32.x

Cyriltra commented 9 years ago

@crowell yes, I left it enabled (the default). I didn't try disabling it because another user reported earlier that it did not fix the issue for him, and I was in a rush to fix the problem, so I didn't dare take that risk. I'll give it a try this weekend.

jeffkaufman commented 9 years ago

This looks very suspicious:

void ApacheFetch::Wait() {
  ScopedMutex lock(mutex_.get());
  while (!done_) { ... }
}
...
void ApacheFetch::HandleDone(bool success) {
  ScopedMutex lock(mutex_.get());
  done_ = true;
}

oschaaf commented 9 years ago

@jeffkaufman fwiw, looks like something that could very well explain the behaviour described here to me indeed.

jeffkaufman commented 9 years ago

After looking further and talking to @jmarantz: that looks suspicious, but things are actually working properly. The code I snipped contains a condvar:

mutex_(server_context->thread_system()->NewMutex()),
condvar_(mutex_->NewCondvar()),
...
void ApacheFetch::Wait() {
  ScopedMutex lock(mutex_.get());
  while (!done_) {
    condvar_->TimedWait(blocking_fetch_timeout_ms_);
  }
}
...
void ApacheFetch::HandleDone(bool success) {
  ScopedMutex lock(mutex_.get());
  done_ = true;
  condvar_->Signal();
}

I believe, but I haven't verified, that the lock is released while in a timed wait on the condvar, and then when you signal the condvar it wakes up and grabs the lock again.

jeffkaufman commented 9 years ago

Verified about the condvar. TimedWait() delegates to pthread_cond_timedwait.
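
The same property can be checked in isolation with the C++ standard library, whose `std::condition_variable::wait_for` has the same release-and-reacquire contract as `pthread_cond_timedwait`. The sketch below is a standalone demonstration, not PageSpeed code, and `SignalReachesWaiter` is a hypothetical name.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns true if a second thread managed to lock the mutex and set the flag
// while the first thread was blocked in a timed wait that was entered with
// the mutex held.
bool SignalReachesWaiter() {
  std::mutex mutex;
  std::condition_variable condvar;
  bool done = false;

  // Take the lock before the signaller exists, so the signaller can only
  // acquire it once wait_for has released it.
  std::unique_lock<std::mutex> lock(mutex);

  std::thread signaller([&] {
    std::lock_guard<std::mutex> inner(mutex);
    done = true;
    condvar.notify_one();
  });

  // wait_for atomically releases the lock while blocked and reacquires it
  // before returning; the predicate guards against spurious wakeups.
  const bool observed =
      condvar.wait_for(lock, std::chrono::seconds(5), [&] { return done; });

  lock.unlock();
  signaller.join();
  return observed;
}
```

If the wait kept the mutex locked, the signaller could never set `done` and the function would return false after the 5 second timeout.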

oschaaf commented 9 years ago

InstawebHandler::~InstawebHandler() {
  WaitForFetch();
}

I might have an unnatural focus on shutting down :-) But what happens when Apache recycles worker processes (assuming it does that)? Will ApacheFetch::Done be called somehow? If not, these wait calls might never stop waiting.

jeffkaufman commented 9 years ago

@oschaaf All our uses of ApacheFetch should look like:

  1. MakeFetch()
  2. SomeFetcher->Fetch(fetch)
  3. WaitForFetch()

Which I think means WaitForFetch() in the destructor should never be called.

morlovich commented 9 years ago

It should also quick-exit because done is already true.
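
A minimal sketch of that pattern, assuming a handler that owns a single blocking fetch: the destructor waits as a safety net, and the wait returns at once when no fetch was started or the fetch is already done. The `Handler` class and its bookkeeping are hypothetical and only mirror the names used in this thread; the real InstawebHandler may differ.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical handler that owns one blocking fetch; not PageSpeed code.
class Handler {
 public:
  // Safety net: wait for any outstanding fetch before the object goes away.
  // If Done() is never delivered for a started fetch, this blocks forever.
  ~Handler() { WaitForFetch(); }

  // Step 1: record that a fetch is now outstanding.
  void MakeFetch() {
    std::lock_guard<std::mutex> lock(mutex_);
    started_ = true;
    done_ = false;
  }

  // Called by the fetcher when the fetch completes.
  void Done() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    condvar_.notify_all();
  }

  // Step 3: returns true if the call had to block, false if it could exit
  // immediately (no fetch started, or the fetch already finished).
  bool WaitForFetch() {
    std::unique_lock<std::mutex> lock(mutex_);
    if (!started_ || done_) return false;
    condvar_.wait(lock, [this] { return done_; });
    return true;
  }

 private:
  std::mutex mutex_;
  std::condition_variable condvar_;
  bool started_ = false;
  bool done_ = false;
};
```

With the usual sequence (make the fetch, run it, wait for it), the destructor's wait is a no-op. The only way it can hang is a started fetch whose `Done()` never arrives.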

oschaaf commented 9 years ago

@morlovich What would happen when a BoundedWait() call for a driver times out during ServerContext::ShutDown()?

jeffkaufman commented 9 years ago

For debugging this, it would be useful to look at the pagespeed configuration files for the different sites that are having this problem. If this problem has happened to you, could you send your server configuration file to jefftk@google.com?

eldk commented 9 years ago

Hello,

I have the same problem ("Waiting for completion of URL") on some pictures.

It seems to be triggered by direct access to an image with no HTML page or script around it (a .jpg shown directly in the browser), or by requests coming from Google Images, Bing, and so on:

http://domain.tld/picture.jpg

I tried to reproduce it myself, but I can't.

I have disabled image rewriting in mod_pagespeed.

I have sent logs and the server configuration file to jeffkaufman.

Greetings, Eric

Version of mod_pagespeed: 1.9.32.3-4448

One example.

Apache access log:
host-85-27-98-185.dynamic.voo.be - - [29/Apr/2015:15:49:29 +0200] "GET /IMG/lluenta_deltebre_mars_2010.jpg HTTP/1.1" 200 1194044 "https://www.google.be/" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_2 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko)Version/7.0 Mobile/11D257 Safari/9537.53"

mod_pagespeed message:
[Wed, 29 Apr 2015 13:49:34 GMT] [Warning] [21354] Waiting for completion of URL http://www.opalesurfcasting.net/IMG/lluenta_deltebre_mars_2010.jpg for 5 sec

Another one.

Apache access log:
vau75-5-82-227-220-228.fbx.proxad.net - - [29/Apr/2015:15:52:30 +0200] "GET /la_faune_aquatique/le_saumon_de_latlantique-_salmo_salar_article1193.html HTTP/1.1" 200 15148 "http://www.google.fr/imgres?imgurl=http%3A%2F%2Fwww.opalesurfcasting.net%2FIMG%2Ftete_saumon_atlantique_salmo_salar.jpg&imgrefurl=http%3A%2F%2Fwww.opalesurfcasting.net%2Fla_faune_aquatique%2Fle_saumon_de_l_atlantique_-_salmo_salar_article1193.html&h=1122&w=1691&tbnid=qsVjealN7I-NfM%3A&zoom=1&docid=kfqxuIku1-J_jM&ei=EuJAVebXJsPWapDFgbgN&tbm=isch&iact=rc&uact=3&dur=241&page=1&start=0&ndsp=23&ved=0CDwQrQMwAg" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36"

mod_pagespeed message:
[Wed, 29 Apr 2015 13:52:40 GMT] [Warning] [21566] Waiting for completion of URL http://www.opalesurfcasting.net/IMG/tete_saumon_atlantique_salmo_salar.jpg for 5 sec

A last one, with Bing Images.

Apache access log:
static-89-94-185-100.axione.abo.bbox.fr - - [29/Apr/2015:16:02:38 +0200] "GET /IMG/anguille_europeenne.jpg HTTP/1.1" 200 147568 "http://www.bing.com/images/search?q=anguille+jaune&view=detailv2&&&id=07BF14DD3BA5856A2498C64D26DAA1018D7AC082&selectedIndex=13&ccid=4ycpEiHq&simid=608006162357030372&thid=JN.JSa96n2qA4sSceWy9Zf2tQ&mode=overlay" "Mozilla/5.0 (Windows NT 6.3; Trident/7.0; Touch; rv:11.0) like Gecko"

mod_pagespeed message:
[Wed, 29 Apr 2015 14:02:43 GMT] [Warning] [22078] Waiting for completion of URL http://www.opalesurfcasting.net/IMG/anguille_europeenne.jpg for 5.031 sec

Timestamps in the pagespeed messages are in GMT, two hours behind the local time (+0200) used in the access log.

jmarantz commented 9 years ago

Follow-up from a recent message on the same symptom to mod-pagespeed-discuss revealed that this can be repro'd without memcached. In this particular case the file-cache was on a path that looked suspiciously like it might be a network file system. Slow response from any file-system could cause this message to come out, but I'd be particularly suspicious of a network file system.

morlovich commented 9 years ago

Also, re: memcached --- the report that had logs with broken pipe errors with AprMemCached, those seem to be associated with use of haproxy with memcached, but also seem to be a false lead --- we appear to recover from them, at least per my loadtests.

eldk commented 9 years ago

Hello,

Issue https://github.com/pagespeed/mod_pagespeed/issues/1072 says that disabling IPRO should be used as a workaround. Given what the In-Place Resource Optimization section of https://developers.google.com/speed/pagespeed/module/system describes, which URL will then be used, for example in Google Images: the original one? I'm new to mod_pagespeed and don't want to lose ranking in Google Images by changing the URLs used for direct access to images. What will happen to those URLs if IPRO is not used? Thanks, Eric

jmarantz commented 9 years ago

To clarify, in-place resource optimization allows images to be modified, but does not prevent them from being renamed, which is what mod_pagespeed normally does. This was originally created to address the situation where images were loaded from JavaScript, and thus PageSpeed had no opportunity to rename them.

To prevent images from being renamed you need to use ImagePreserveUrls https://developers.google.com/speed/pagespeed/module/config_filters#preserveurls. And if you do that with IPRO off then PageSpeed will not be able to optimize your images.

RE image-search rank: what makes you think you are going to lose image rank if their URLs are changed?

-Josh

eldk commented 9 years ago

Hello,

I have now had IPRO disabled for 24 hours with "ModPagespeedInPlaceResourceOptimization off" in pagespeed.conf. All the "Waiting for completion of URL" messages have disappeared.

Judging by the change in bandwidth usage between IPRO off and IPRO on, I think the error happened only for some resources.

One of the URLs seen most often in the mod_pagespeed log when "Waiting for completion of URL" happened was a direct access (without HTML, Content-Type: image/jpeg) to http://www.opalesurfcasting.net/IMG/truite_et_saumon_atlantique.jpg .

Checking the cache for this image in "/pagespeed_admin/cache#show_metadata" gives:

Metadata cache key:rname/aj_pBqmmsYhv0uO-3TvxlU9/http://www.opalesurfcasting.net/IMG/truite_et_saumon_atlantique.jpg@@_ cache_ok:false can_revalidate:false partitions:

When IPRO is on, refreshing the page in Firefox with the browser cache disabled rewrites the image, and "/pagespeed_admin/cache#show_metadata" then gives cache_ok:true, but "Waiting for completion of URL" still occurs.

Sometimes "Waiting for completion of URL" also happened with a JavaScript (SIG layer, OpenLayers/Geoportail API) that loads a lot of images from a remote host (not added as a domain, i.e. not in ModPagespeedDomain).

I will try with this JavaScript excluded from mod_pagespeed.

Image-search rank: I'm not sure. Maybe duplicate content for images, or ...?

Greetings,

Eric

pagespeed_admin/config:

Version: 13: on

Filters
cw  Collapse Whitespace
gp  Convert Gif to Png
jp  Convert Jpeg to Progressive
jw  Convert Jpeg To Webp
pj  Convert Png to Jpeg
hw  Flushes html
ci  Inline Css
io  In-place optimize for browser
js  Jpeg Subsampling
rj  Recompress Jpeg
rp  Recompress Png
rw  Recompress Webp
rc  Remove Comments
cf  Rewrite Css
jm  Rewrite External Javascript
jj  Rewrite Inline Javascript
cp  Strip Image Color Profiles
md  Strip Image Meta Data

Options
EnableCachePurge (euci)             True
EnableRewriting (e)                 1
FileCacheCleanIntervalMs (afcci)    3600000
FileCacheInodeLimit (afcl)          500000
FileCachePath (afcp)                /var/cache/mod_pagespeed/
FileCacheSizeKb (afc)               2048000
InPlaceResourceOptimization (ipro)  False
LogDir (ald)                        /var/log/pagespeed
RewriteLevel (l)                    Optimize For Bandwidth
SslCertDirectory (assld)            /etc/ssl/certs
StatisticsLogging (asle)            True

Invalidation Timestamp: Tue, 05 May 2015 00:58:30 GMT (1430787510638)