Closed GoogleCodeExporter closed 9 years ago
Original comment by charliej...@gmail.com
on 4 Dec 2007 at 5:47
To clarify, the Service enters the STOP_PENDING state but never leaves it.
Something
must be going wrong in the shutdown logic in startflax.py because flaxservice.py
hasn't changed in a long time.
Original comment by charliej...@gmail.com
on 4 Dec 2007 at 5:48
Checking SVN, revision #830 works, revision #832 doesn't, which should help to
identify what goes wrong.
Original comment by charliej...@gmail.com
on 4 Dec 2007 at 6:12
well - 831 and 832 just added missing imports to startflax.py, prior to that
exceptions were being raised in the shutdown. The strange thing is that I
certainly
tested this at the time of making the change for 832 and it worked then. But
now I'm
getting the behaviour as you describe, so I'm a little confused. Anyway - I'll
take a
look and see what I can figure out.
Original comment by paul.x.r...@googlemail.com
on 4 Dec 2007 at 7:13
I've made a change to logging, and added more debugging log message. Now things
seem
to work OK for me, but there is something strange going on... I got the bug a
couple
of times, then it went away with exactly the same code.
There does seem to be a problem in that the uninstaller doesn't actually remove
all
the files, yet doesn't warn that this is so. I wonder whether this is related,
but I
can't get a consistent repetition at the moment. I'll make a new issue for this.
I'm not sure what to do about this. With the current code the problem seems to
have
gone away for me - but I don't quite see why.
Original comment by paul.x.r...@googlemail.com
on 4 Dec 2007 at 9:09
I'm wondering if there's a timing issue here, which might explain why adding
debug
helps. I'll take a look and see if I can spot it again with the debug and tie
down
where it's getting stuck.
Original comment by charliej...@gmail.com
on 4 Dec 2007 at 10:38
Certainly there could be some kind of timing issue, but it's apparently not a
simple
one - I've tried adding sleeps at different places but still can't reliably get
it to
happen. On at least one occasion the problem seemed to that we were apparently
hanging somewhere in IndexServer.join(). I did wonder a bit about the call to
syncman.shutdown and the finally block inside IndexProcess.run, containing the
call
to processing._exit_func. Maybe there's some kind of synchronisation issue
there.
Original comment by paul.x.r...@googlemail.com
on 5 Dec 2007 at 8:25
Err.. that last comment is probably a little confusing, since it refers to
somthing I
was experimenting with and isn't checked in - please diregard the stuff after
"I did
wonder...".
Original comment by paul.x.r...@googlemail.com
on 5 Dec 2007 at 9:01
OK, here's the debug from my machine once it has hung during shutdown:
2007-12-05 09:53:21,325: root: 3564: WARNING: There was a problem reading
C:\Program
Files\Flax Basic\var\flax.flx, reverting to default settings:
2007-12-05 09:54:40,529: webserver.access: 3564: INFO: 127.0.0.1 - admin
[05/Dec/2007:09:54:40] "POST /admin/options HTTP/1.1" 303 116 "" "":
2007-12-05 09:54:40,545: webserver.access: 3564: INFO: 127.0.0.1 - admin
[05/Dec/2007:09:54:40] "GET /admin/options HTTP/1.1" 200 4958 "" "":
2007-12-05 09:54:48,170: root: 3564: DEBUG: Creating stopping thread:
2007-12-05 09:54:48,170: root: 3564: DEBUG: _do_stop: Telling index_server to
stop:
2007-12-05 09:54:48,170: root: 3564: DEBUG: Stopping thread started:
2007-12-05 09:54:48,170: indexing: 3564: DEBUG: Stopping indexing_process:
2007-12-05 09:54:48,170: root: 3564: DEBUG: _do_stop: storing persistent data:
2007-12-05 09:54:48,217: root: 3564: DEBUG: _do_stop: stopping web server:
2007-12-05 09:54:51,513: webserver.errors: 3564: DEBUG: [05/Dec/2007:09:54:51]
HTTP
HTTP Server shut down:
2007-12-05 09:54:51,513: root: 3564: DEBUG: _do_stop: joining index_server:
2007-12-05 09:54:51,513: indexing: 3564: DEBUG: waiting to join
indexing_process:
2007-12-05 09:54:51,513: root: 3564: DEBUG: FlaxMain._do_stop: joined
index_server:
Original comment by charliej...@gmail.com
on 5 Dec 2007 at 9:55
Interestingly a few seconds later we get:
2007-12-05 09:55:21,342: scheduling: 3564: INFO: Checking collections for
(re-)indexing:
which shows something is still alive!
Original comment by charliej...@gmail.com
on 5 Dec 2007 at 9:56
FlaxMain.join() was returning the wrong value, this is now fixed; also we now
use
sys.exit_func() rather than processing._exit_func(), which should ensure the
various
atexit() handlers are called.
Original comment by charliej...@gmail.com
on 5 Dec 2007 at 10:37
Further problem if an indexing run has been carried out, it still doesn't
uninstall
(the indexing run may have finished). Possibly due to the web page updates? Log
follows:
2007-12-05 11:06:59,482: root: 3208: DEBUG: Creating stopping thread:
2007-12-05 11:06:59,482: root: 3208: DEBUG: Stopping thread started:
2007-12-05 11:06:59,482: root: 3208: DEBUG: _do_stop: Telling index_server to
stop:
2007-12-05 11:06:59,482: indexing: 3208: DEBUG: Stopping indexing_process:
2007-12-05 11:06:59,482: root: 3208: DEBUG: _do_stop: storing persistent data:
2007-12-05 11:06:59,513: root: 3208: DEBUG: _do_stop: stopping web server:
2007-12-05 11:06:59,732: webserver.access: 3208: INFO: 127.0.0.1 - admin
[05/Dec/2007:11:06:59] "GET /admin/collections/json HTTP/1.1" 200 119 "" "":
2007-12-05 11:07:01,763: webserver.access: 3208: INFO: 127.0.0.1 - admin
[05/Dec/2007:11:07:01] "GET /admin/collections/json HTTP/1.1" 200 119 "" "":
2007-12-05 11:07:03,545: scheduling: 3208: INFO: Checking collections for
(re-)indexing:
2007-12-05 11:07:03,809: webserver.access: 3208: INFO: 127.0.0.1 - admin
[05/Dec/2007:11:07:03] "GET /admin/collections/json HTTP/1.1" 200 119 "" "":
2007-12-05 11:07:05,842: webserver.access: 3208: INFO: 127.0.0.1 - admin
[05/Dec/2007:11:07:05] "GET /admin/collections/json HTTP/1.1" 200 119 "" "":
2007-12-05 11:07:07,872: webserver.access: 3208: INFO: 127.0.0.1 - admin
[05/Dec/2007:11:07:07] "GET /admin/collections/json HTTP/1.1" 200 119 "" "":
2007-12-05 11:07:09,920: webserver.access: 3208: INFO: 127.0.0.1 - admin
[05/Dec/2007:11:07:09] "GET /admin/collections/json HTTP/1.1" 200 119 "" "":
2007-12-05 11:07:11,950: webserver.access: 3208: INFO: 127.0.0.1 - admin
[05/Dec/2007:11:07:11] "GET /admin/collections/json HTTP/1.1" 200 119 "" "":
2007-12-05 11:07:13,997: webserver.access: 3208: INFO: 127.0.0.1 - admin
[05/Dec/2007:11:07:13] "GET /admin/collections/json HTTP/1.1" 200 119 "" "":
2007-12-05 11:07:16,029: webserver.access: 3208: INFO: 127.0.0.1 - admin
[05/Dec/2007:11:07:16] "GET /admin/collections/json HTTP/1.1" 200 119 "" "":
2007-12-05 11:07:18,059: webserver.access: 3208: INFO: 127.0.0.1 - admin
[05/Dec/2007:11:07:18] "GET /admin/collections/json HTTP/1.1" 200 119 "" "":
2007-12-05 11:07:20,092: webserver.access: 3208: INFO: 127.0.0.1 - admin
[05/Dec/2007:11:07:20] "GET /admin/collections/json HTTP/1.1" 200 119 "" "":
Original comment by charliej...@gmail.com
on 5 Dec 2007 at 11:09
Closing the browser that shows the Collection Status page causes the uninstall
to
unlock and continue successfully.
Original comment by charliej...@gmail.com
on 5 Dec 2007 at 11:11
Original comment by paul.x.r...@googlemail.com
on 6 Dec 2007 at 10:25
Fixed by skipping calling the cherrypy.server.stop: calling this blocks
sometimes,
whereas simply terminating the process without calling it seems to work fine,
so we
don't call it. This feels unclean though, and could have side effects we
haven't yet
noticed, so I'll open a separate issue to look into this at a later date.
Original comment by boulton.rj@gmail.com
on 6 Dec 2007 at 11:57
Original issue reported on code.google.com by
charliej...@gmail.com
on 4 Dec 2007 at 5:40