What steps will reproduce the problem?
1. Create a connector instance and let it index everything.
2. Delete the connector instance.
3. Recreate the connector instance.
What is the expected output? What do you see instead?
Step 2 should delete the connector state from the Connector Manager, so step 3
should call startTraversal and restart the indexing process from the beginning.
Instead, it resumes from where the original connector instance left off.
What version of the product are you using? On what operating system?
SharePoint connector 1.1.2
Please provide any additional information below.
There are two Connector Manager bugs that are affecting this behavior. CM issue
87 leaves the state file behind when the connector is deleted. CM issue 94
causes the Connector Manager to always call startTraversal and never call
resumeTraversal, because the SharePoint connector returns null from checkpoint.
Nevertheless, this is still a SharePoint bug, because the contract for
startTraversal is not obeyed. That is, the SharePoint implementation of
startTraversal is a workaround for CM issue 94, but a better workaround is
available. If checkpoint were to return the empty string instead of null, then
the Connector Manager would correctly call resumeTraversal for subsequent
batches. Then startTraversal could be changed to flush the state and start the
traversal over again.
Note: I am not completely sure that the empty string will work, either, but I
suspect it will. If it does not, any non-empty string could be used instead,
since the SharePoint connector doesn't care what the value is.
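A minimal sketch of the proposed workaround follows. It is not the actual
SharePoint connector or Connector Manager code: the class TraversalSketch, its
fields, and the fetch helper are hypothetical, and fetch only models the
behavior described above (a null checkpoint leads to startTraversal, anything
else to resumeTraversal).

```java
import java.util.ArrayList;
import java.util.List;

public class TraversalSketch {
    // Hypothetical stand-in for the connector's saved crawl position.
    private int position = 0;
    private final List<String> docs;
    private final int batchSize;

    public TraversalSketch(List<String> docs, int batchSize) {
        this.docs = docs;
        this.batchSize = batchSize;
    }

    // Obeys the contract: flush the saved state and start over.
    public List<String> startTraversal() {
        position = 0;
        return nextBatch();
    }

    // Continues from the saved state; the checkpoint value itself is ignored.
    public List<String> resumeTraversal(String checkpoint) {
        return nextBatch();
    }

    // Returns the empty string instead of null, so the caller can tell
    // "resume" apart from "start".
    public String checkpoint() {
        return "";
    }

    private List<String> nextBatch() {
        int end = Math.min(position + batchSize, docs.size());
        List<String> batch = new ArrayList<>(docs.subList(position, end));
        position = end;
        return batch;
    }

    // Simplified model of the caller (an assumption, not the real manager):
    // a null checkpoint means start, anything else means resume.
    public static List<String> fetch(TraversalSketch t, String savedCheckpoint) {
        return savedCheckpoint == null
                ? t.startTraversal()
                : t.resumeTraversal(savedCheckpoint);
    }
}
```

With this shape, a recreated connector instance has no saved checkpoint, so the
caller goes through startTraversal and the crawl restarts from the beginning,
even if a stale state file was left behind.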
Fixing this issue would allow us to drop the priority of CM issue 87 (deleting
a connector may leave behind files), and would avoid blocking the
implementation of CM issue 25 (can't recrawl an existing connector).
Original issue reported on code.google.com by jl1615@gmail.com on 29 May 2008 at 12:00