Closed brittainhard closed 8 years ago
This is an independent issue. The visualizations all use the same routing key / queue for messages and the same Bokeh server document for storage. The status updates are handled between Celery and Nutch, and should be resilient under multiple crawls. I'll see if I can reproduce.
Okay, I'm going to use a different Bokeh server document for each visualization. This won't fix the fact that crawl viz. messages will go haywire if you run two simultaneous crawls, but it should make subsequent crawls that run serially behave more correctly, and we need this anyway.
I had initially coupled the Bokeh document name to the queue name when it looked like we were configuring one queue/exchange per crawl. Since we're using a routing key approach we need:
It's possible that we'll have to restrict this further, depending on which characters are acceptable in routing keys and document ids, but I'll try this approach for now.
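As a rough sketch of the per-visualization naming idea (not the actual project code): the helper below builds one document name per crawl/visualization pair and replaces any character outside a conservative set. The function name `document_name`, the `crawl_id` and `viz_name` parameters, and the allowed character set are all assumptions for illustration; the real restrictions on routing keys and Bokeh document ids may differ.

```python
import re

# Hypothetical helper: one document name per (crawl, visualization) pair.
# The allowed character set is an assumption, not a documented restriction.
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def document_name(crawl_id, viz_name):
    """Return a distinct, conservatively sanitized name for one visualization."""
    raw = f"{crawl_id}.{viz_name}"
    # Replace anything outside the assumed-safe set with an underscore.
    return _UNSAFE.sub("_", raw)


if __name__ == "__main__":
    # Two visualizations of the same crawl get different documents.
    print(document_name("crawl 1", "domain/stats"))
    print(document_name("crawl 1", "pages"))
```

With names like these, two visualizations no longer share one document, though simultaneous crawls still need the routing fixed before their messages stay separate.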
Okay, I've fixed this as well as I can for now. In order to properly display any visualization on the second crawl we really need to get the routing working.
Newly created crawls show the same graph as a crawl that ran previously. This may be related to #736.