googlegsa / sharepoint.v3

Google Search Appliance Connector for SharePoint

Handle folder rename/restore cases more judiciously #174

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
When multiple folders are renamed or restored and each folder contains at least
batch-hint documents, the connector needs several batch traversals to re-crawl
the documents under these folders. During this activity, the connector keeps
track of the folder path where the last batch traversal stopped. This path lies
under one of the renamed/restored folders. However, the connector does not
record which renamed/restored folder it was processing when the batch traversal
stopped. Instead, it mistakenly assumes that the last visited folder path
always belongs to the current folder (starting from the first). If the path is
not found there, it does a full crawl of that folder, assuming it is a new
folder item that has been renamed/restored. Hence, if the last visited folder
path is under FolderN, the connector re-crawls the documents under Folder1
through FolderN and never proceeds to FolderN+1, because batch-hint is reached
by that time.
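
The loop described above can be reproduced with a small toy model. This is only
a sketch in Python, not the connector's actual (Java) code: the names
buggy_batch, fixed_batch, run and FOLDERS are hypothetical, and the "fix" shown
(storing the folder together with the last visited path) is one possible
approach assumed here for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical data: renamed folder -> document paths, in crawl order.
FOLDERS: Dict[str, List[str]] = {
    "F1": ["F1/a", "F1/b"],
    "F2": ["F2/a", "F2/b"],
    "F3": ["F3/a", "F3/b"],
}


def buggy_batch(folders, last_visited: Optional[str], batch_hint: int):
    """One batch traversal that remembers only the last visited path."""
    batch: List[str] = []
    for docs in folders.values():
        # Resume inside this folder only if the checkpoint is found here;
        # otherwise assume a newly renamed folder and crawl it fully.
        start = docs.index(last_visited) + 1 if last_visited in docs else 0
        for doc in docs[start:]:
            batch.append(doc)
            if len(batch) >= batch_hint:
                return batch, doc
    return batch, None


def fixed_batch(folders, checkpoint: Optional[Tuple[str, str]], batch_hint: int):
    """One batch traversal whose checkpoint is (folder, last visited path)."""
    names = list(folders)
    folder_idx, start = 0, 0
    if checkpoint is not None:
        folder, last = checkpoint
        folder_idx = names.index(folder)          # skip folders already done
        start = folders[folder].index(last) + 1   # resume after the last path
    batch: List[str] = []
    for name in names[folder_idx:]:
        for doc in folders[name][start:]:
            batch.append(doc)
            if len(batch) >= batch_hint:
                return batch, (name, doc)
        start = 0  # later folders are crawled from their beginning
    return batch, None


def run(step: Callable, folders, batch_hint: int, max_batches: int = 10) -> List[str]:
    """Repeat batch traversals until done or max_batches is reached."""
    state, seen = None, []
    for _ in range(max_batches):
        batch, state = step(folders, state, batch_hint)
        seen.extend(batch)
        if state is None:
            break
    return seen
```

With batch_hint set to 2, run(buggy_batch, FOLDERS, 2) alternates between F1
and F2 until max_batches is exhausted and never returns a document from F3,
whereas run(fixed_batch, FOLDERS, 2) visits every document once and stops.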

What steps will reproduce the problem?
1. Let the connector complete one full traversal and enter change detection 
mode.
2. Rename more than one folder under any given library.
3. Make sure that a) each renamed folder has sub-folders underneath and b) the 
total number of documents under each renamed folder is a little less than 
batch-hint.

What is the expected output? What do you see instead?
The connector should re-crawl all the documents under the renamed folders and 
move on. Instead, it keeps re-crawling the documents under the renamed folders 
and never proceeds.


Original issue reported on code.google.com by th.nitendra on 19 Oct 2010 at 8:41

GoogleCodeExporter commented 9 years ago

Original comment by shashank...@gmail.com on 18 Mar 2011 at 12:05

GoogleCodeExporter commented 9 years ago
This issue is filed as Google issue #6513448

Original comment by tdnguyen@google.com on 18 May 2012 at 12:29