When multiple folders are renamed/restored and every folder contains at least
batch-hint documents, the connector makes multiple batch traversals to re-crawl
the documents under these folders. During this activity, the connector keeps
track of the folder path where the last batch traversal stopped. This folder
path lies under one of the folders that have been renamed/restored. The
connector, however, does not keep track of the latter, i.e. which folder rename
it was processing when the batch traversal stopped. Instead, it mistakenly
assumes that the last visited folder path always belongs to the current folder
(starting from the first). If the path is not found there, it does a full crawl
of that folder, assuming it is a new folder item that has been renamed/restored.
Hence, if the last visited folder path is under FolderN, the connector re-crawls
the documents under Folder1 to FolderN and never proceeds to FolderN+1, because
batch-hint is reached by that time.
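
The behaviour described above can be illustrated with a small simulation. This is only a sketch and not the connector's actual code: the names BATCH_HINT, FOLDERS, buggy_batch and fixed_batch are hypothetical, and the folder contents are made up so that each renamed folder holds exactly batch-hint documents. The "fixed" variant shows one possible remedy (remembering which renamed folder was being processed), not necessarily the fix that should be applied.

```python
from typing import List, Optional, Tuple

BATCH_HINT = 3  # hypothetical, deliberately small batch-hint

# Hypothetical renamed folders, each holding exactly BATCH_HINT documents.
FOLDERS = {
    name: [f"{name}/doc{i}" for i in range(1, BATCH_HINT + 1)]
    for name in ("Folder1", "Folder2", "Folder3")
}


def buggy_batch(last_path: Optional[str]) -> Tuple[List[str], Optional[str]]:
    """One batch that remembers only the last visited path."""
    batch: List[str] = []
    for docs in FOLDERS.values():
        if last_path in docs:
            pending = docs[docs.index(last_path) + 1:]
        else:
            # Path not found: the folder is treated as newly renamed
            # and crawled in full, even if it was already processed.
            pending = docs
        for doc in pending:
            batch.append(doc)
            if len(batch) >= BATCH_HINT:
                return batch, doc
    return batch, last_path


def fixed_batch(folder_idx: int, last_path: Optional[str]
                ) -> Tuple[List[str], int, Optional[str]]:
    """One batch that also remembers which renamed folder was in progress."""
    batch: List[str] = []
    names = list(FOLDERS)
    while folder_idx < len(names):
        docs = FOLDERS[names[folder_idx]]
        start = docs.index(last_path) + 1 if last_path in docs else 0
        for doc in docs[start:]:
            batch.append(doc)
            if len(batch) >= BATCH_HINT:
                return batch, folder_idx, doc
        folder_idx += 1   # this folder is done; never revisit it
        last_path = None
    return batch, folder_idx, None


def run_buggy(max_batches: int = 10) -> List[str]:
    crawled: List[str] = []
    last: Optional[str] = None
    for _ in range(max_batches):
        batch, last = buggy_batch(last)
        crawled.extend(batch)
    return crawled


def run_fixed(max_batches: int = 10) -> List[str]:
    crawled: List[str] = []
    idx, last = 0, None
    for _ in range(max_batches):
        batch, idx, last = fixed_batch(idx, last)
        if not batch:
            break
        crawled.extend(batch)
    return crawled
```

In this sketch, run_buggy() keeps alternating between Folder1 and Folder2 and never returns a document from Folder3, while run_fixed() visits every document once and then stops.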
What steps will reproduce the problem?
1. Let the connector complete one full traversal and enter change
detection mode.
2. Rename more than one folder under any given library.
3. Make sure that a) each renamed folder has sub-folders underneath and b) the
total number of documents under each renamed folder is a little less than
batch-hint.
What is the expected output? What do you see instead?
The connector should re-crawl all the documents under the renamed folders and
move on. Instead, the connector keeps re-crawling the documents under the
renamed folders and never proceeds.
Original issue reported on code.google.com by th.nitendra on 19 Oct 2010 at 8:41