PILLUTLAAVINASH / google-enterprise-connector-manager

Automatically exported from code.google.com/p/google-enterprise-connector-manager

Two threads use the same instance of QueryTraversalManager at the same time #10

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?

See below. 

What is the expected output? What do you see instead?

See https://www.vizdom.com/google/crawllog-20061229.zip

Two threads use the same instance of QueryTraversalManager at
the same time. Each thread calls resumeTraversal on the same
checkpoints. Sometimes one of the threads is interrupted between
the call to resumeTraversal and the processing of the ResultSet.
Sometimes, as in crawllog-20061229.log, both threads read through
the result sets and essentially index the content twice. This
only happens with resumeTraversal. The processing of
startTraversal and the ResultSet it returns is always
single-threaded, so far as I've seen. It appears that checkpoint
is also called from a single thread, but there are a few
exceptions. It's possible that the exceptions are cases where
another worker thread was used after a pause waiting for new content.

If we look at the output of something like

       grep -B 1 "CHECKPOINT\|RESUME\|google:docid"

we see the calls to checkpoint, resumeTraversal and the docid of
the nodes as retrieved from the PropertyMap, along with the
threads for all of these calls. Here's a snippet around the first
calls to resumeTraversal, where you can see that both threads
call resumeTraversal and start processing the results:

  <thread>13</thread>
  <message>COLUMN: google:docid = 1438644</message>
--
  <thread>13</thread>
  <message>CHECKPOINT: 2006-12-01 10:55:53,1438644 @24124380</message>
--
  <thread>13</thread>
  <message>RESUME: 2006-12-01 10:55:53,1438644 @24124380</message>
--
  <thread>13</thread>
  <message>COLUMN: google:docid = 1438745</message>
--
  <thread>14</thread>
  <message>RESUME: 2006-12-01 10:55:53,1438644 @24124380</message>
--
  <thread>13</thread>
  <message>COLUMN: google:docid = 1438543</message>
--
  <thread>14</thread>
  <message>COLUMN: google:docid = 1438745</message>
--
  <thread>14</thread>
  <message>COLUMN: google:docid = 1438543</message>

The @24124380 in these messages is just the identity hash code of
the QTM instance that the calls were made on. There is only one
instance.
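
For anyone who wants to tag their own log lines the same way, here is a minimal sketch of how such a suffix could be produced. The class and method names are hypothetical and not part of the connector manager, and rendering the hash as hex is an assumption on my part.

```java
/** Hypothetical logging helper; not part of the connector manager. */
final class InstanceTaggedLog {
    private InstanceTaggedLog() {}

    /** Returns the identity hash code of the target as a hex string. */
    static String instanceTag(Object target) {
        return Integer.toHexString(System.identityHashCode(target));
    }

    /** Formats a line of the form "EVENT: detail @tag". */
    static String format(String event, String detail, Object target) {
        return event + ": " + detail + " @" + instanceTag(target);
    }

    public static void main(String[] args) {
        // Stand-in for the single QueryTraversalManager instance (assumption).
        Object traversalManager = new Object();
        System.out.println(format("CHECKPOINT", "2006-12-01 10:55:53,1438644", traversalManager));
        System.out.println(format("RESUME", "2006-12-01 10:55:53,1438644", traversalManager));
    }
}
```

Calls made on the same object always carry the same suffix, which is what makes it possible to tell from the log that only one instance is involved.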

Please use labels and text to provide additional information.

See https://www.vizdom.com/google/crawllog-20061229.zip

Original issue reported on code.google.com by donald.z...@gmail.com on 24 Jan 2007 at 12:04

GoogleCodeExporter commented 8 years ago
Brian, please add to backlog.

Original comment by donald.z...@gmail.com on 24 Jan 2007 at 12:05

GoogleCodeExporter commented 8 years ago
Google Bug #243982

Original comment by vjo...@gmail.com on 9 Feb 2007 at 4:16

GoogleCodeExporter commented 8 years ago
Fixed in r283. The solution is to synchronize the QueryTraverser.runBatch() call so
that only one thread at a time can enter this method and hence touch
queryTraversalManager.
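
A rough sketch of the idea, not the actual r283 change: the class below is a hypothetical stand-in for QueryTraverser, and CountingManager is an invented substitute for the shared traversal manager that only records how many callers overlap.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for QueryTraverser; illustrates a synchronized runBatch(). */
final class SynchronizedRunBatchSketch {

    /** Invented substitute for the shared manager; tracks overlapping callers. */
    static final class CountingManager {
        private final AtomicInteger active = new AtomicInteger();
        private final AtomicInteger maxActive = new AtomicInteger();

        void resumeTraversal(String checkpoint) {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(5); // simulate the time spent running the query
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                active.decrementAndGet();
            }
        }

        int maxConcurrentCallers() {
            return maxActive.get();
        }
    }

    private final CountingManager manager;

    SynchronizedRunBatchSketch(CountingManager manager) {
        this.manager = manager;
    }

    /** The monitor on this traverser admits one thread at a time. */
    synchronized void runBatch(String checkpoint) {
        manager.resumeTraversal(checkpoint);
    }

    /** Runs threadCount workers against one traverser and reports the peak overlap. */
    static int maxOverlap(int threadCount) {
        CountingManager manager = new CountingManager();
        SynchronizedRunBatchSketch traverser = new SynchronizedRunBatchSketch(manager);
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> traverser.runBatch("2006-12-01 10:55:53,1438644"));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return manager.maxConcurrentCallers();
    }
}
```

Mutual exclusion of this kind only serializes the callers. On its own it does not stop several threads from calling resumeTraversal with the same checkpoint one after another.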

Original comment by donald.z...@gmail.com on 28 Apr 2007 at 12:48

GoogleCodeExporter commented 8 years ago
I'm running r292, and I'm still seeing multiple calls to resumeTraversal with
the same checkpoint from multiple threads.

  <thread>14</thread>
  <message>START @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>14</thread>
  <message>RESUME: 2005-05-24 09:51:36,2760 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>15</thread>
  <message>RESUME: 2007-02-03 11:53:44,30474 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>18</thread>
  <message>RESUME: 2007-02-03 11:53:44,30474 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>19</thread>
  <message>RESUME: 2007-02-03 11:53:44,30474 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>14</thread>
  <message>RESUME: 2007-02-03 11:53:44,30474 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>15</thread>
  <message>RESUME: 2007-02-03 11:57:09,30931 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>18</thread>
  <message>RESUME: 2007-02-03 11:57:09,30931 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>19</thread>
  <message>RESUME: 2007-02-03 11:57:09,30931 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>20</thread>
  <message>RESUME: 2007-02-03 11:57:09,30931 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>21</thread>
  <message>RESUME: 2007-02-03 11:57:09,30931 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>22</thread>
  <message>RESUME: 2007-02-03 11:57:09,30931 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>23</thread>
  <message>RESUME: 2007-02-03 11:57:09,30931 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>24</thread>
  <message>RESUME: 2007-02-03 11:57:09,30931 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>25</thread>
  <message>RESUME: 2007-02-03 11:57:09,30931 @8f9a32</message>
  <message>RESULTSET: 100 rows. @8f9a32</message>
--
  <thread>14</thread>
  <message>RESUME: 2007-02-03 12:01:05,30195 @8f9a32</message>
  <message>RESULTSET: 29 rows. @8f9a32</message>

Only one thread processes the result sets (although the processing thread
varies over time). All of the calls to checkpoint are made by the single thread
that is processing the result set. I don't know what is happening to the other
threads between calling resumeTraversal and processing the result set.
Presumably they are interrupted, but nothing is being logged to show that. I
have attached a Zip file containing the excerpt above and the full log.
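
To check a longer log for the same pattern, something like the following could group RESUME checkpoints by the threads that issued them. It is only a sketch: it assumes the grep-style excerpt layout shown above, with a thread line preceding each message line, and the class name is made up.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical log helper: maps each RESUME checkpoint to the threads that used it. */
final class DuplicateResumeFinder {
    private static final Pattern THREAD = Pattern.compile("<thread>(\\d+)</thread>");
    private static final Pattern RESUME =
            Pattern.compile("<message>RESUME: (.*?) @\\w+</message>");

    private DuplicateResumeFinder() {}

    static Map<String, Set<String>> threadsByCheckpoint(List<String> lines) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        String currentThread = null;
        for (String line : lines) {
            Matcher thread = THREAD.matcher(line);
            if (thread.find()) {
                currentThread = thread.group(1); // remember who logged the next message
                continue;
            }
            Matcher resume = RESUME.matcher(line);
            if (resume.find() && currentThread != null) {
                result.computeIfAbsent(resume.group(1), k -> new LinkedHashSet<>())
                      .add(currentThread);
            }
        }
        return result;
    }
}
```

Any checkpoint whose set contains more than one thread was resumed by multiple threads.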

I included the cycling through empty result sets partly just because so many
threads were involved, and it happens so quickly that it seems a little unreasonable.

Original comment by jl1615@gmail.com on 3 May 2007 at 12:00

GoogleCodeExporter commented 8 years ago

Original comment by mgron...@gmail.com on 3 Oct 2007 at 10:49