rajgithub123 / google-enterprise-connector-sharepoint

Automatically exported from code.google.com/p/google-enterprise-connector-sharepoint

Connector should allow users to customize the rate at which discovery of new site collections is initiated #89

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create several site collections in SharePoint, each holding a large amount 
of data (sub-sites, lists, docs, folders)
2. Deploy Google Services on SharePoint
3. Create a connector instance
4. Let the traversal run; it should span several hours or even days
5. Create a new site collection midway, while the connector is still 
crawling the existing site collections

What is the expected output? 
The new site collection should be discovered ASAP.

What do you see instead?
If the crawl cycle takes several days to complete, discovery of the new site 
collection also takes several days, and the user has no visibility into or 
control over when the connector will discover and crawl it.

In such cases there should be a provision to trigger the discovery of newly 
added site collections.

Original issue reported on code.google.com by rakeshs101981@gmail.com on 11 Aug 2009 at 2:12

GoogleCodeExporter commented 9 years ago
If discovery is supported as a scheduled run, say once a day, this could lead 
to a situation where new site collections keep being added and the connector's 
first crawl cycle never completes. The connector would then take longer to 
detect changes in any of the already crawled site collections.

These use cases need to be considered while designing a solution.

Can the discovery of site collections be a background thread running 
continuously at a specified rate?
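
To make the question concrete, here is a rough sketch of what such a background discovery thread could look like, using the standard `ScheduledExecutorService`. The class `PeriodicDiscovery`, the `siteLister` supplier (a stand-in for whatever SharePoint call lists site collections) and the `onNewSite` callback are all hypothetical names for illustration and are not taken from the connector's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names come from the connector. */
class PeriodicDiscovery {
    // Assumed stand-in for the SharePoint call that lists site collection URLs.
    private final Supplier<List<String>> siteLister;
    // Assumed callback, e.g. hand the new URL over to the crawler.
    private final Consumer<String> onNewSite;
    private final Set<String> known = ConcurrentHashMap.newKeySet();
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "site-discovery");
            t.setDaemon(true); // do not keep the JVM alive just for discovery
            return t;
        });

    PeriodicDiscovery(Supplier<List<String>> siteLister, Consumer<String> onNewSite) {
        this.siteLister = siteLister;
        this.onNewSite = onNewSite;
    }

    /** One discovery pass: reports every site collection not seen before. */
    List<String> discoverOnce() {
        List<String> fresh = new ArrayList<>();
        for (String url : siteLister.get()) {
            if (known.add(url)) { // add() returns true only for unseen URLs
                fresh.add(url);
                onNewSite.accept(url);
            }
        }
        return fresh;
    }

    /** Repeats discoverOnce(), waiting `delay` between the end of one pass and the next. */
    void start(long delay, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                discoverOnce();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so swallow
                // it here (a real implementation would log it).
            }
        }, 0, delay, unit);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

With this shape, `start(15, TimeUnit.MINUTES)` would make the discovery rate a user setting that is independent of how long a full crawl cycle takes.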

Original comment by rakeshs101981@gmail.com on 11 Aug 2009 at 2:16

GoogleCodeExporter commented 9 years ago
Ideally discovery should be handled by a different thread.

Internally the connector should maintain a queue of all the documents to be fed 
to the GSA. Multiple threads may add entries to this queue. It would also help 
to have multiple threads for Discovery and Crawl (Change Detection). The number 
of threads for each of these, their priorities, and their schedules should be 
configurable.

This will give the user significant control over how quickly changes are fed.
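
A minimal sketch of the shared-queue idea, assuming plain `java.util.concurrent` building blocks: two thread pools with configurable sizes both add document IDs to one `BlockingQueue`. `FeedQueueSketch` and its methods are hypothetical names, documents are represented as plain strings, and thread priorities and schedules are left out to keep it short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; not the connector's actual design. */
class FeedQueueSketch {
    // Single queue of documents waiting to be fed; safe for many producers.
    private final BlockingQueue<String> feedQueue = new LinkedBlockingQueue<>();
    private final ExecutorService discoveryPool;
    private final ExecutorService crawlPool;

    // Thread counts are passed in so they can come from user configuration.
    FeedQueueSketch(int discoveryThreads, int crawlThreads) {
        this.discoveryPool = Executors.newFixedThreadPool(discoveryThreads);
        this.crawlPool = Executors.newFixedThreadPool(crawlThreads);
    }

    /** A discovery task found documents in a newly discovered site. */
    void submitDiscovery(List<String> docs) {
        discoveryPool.execute(() -> { feedQueue.addAll(docs); });
    }

    /** A crawl (change detection) task found changed documents. */
    void submitCrawl(List<String> docs) {
        crawlPool.execute(() -> { feedQueue.addAll(docs); });
    }

    /** Stops accepting tasks, waits for running ones, returns what was queued. */
    List<String> shutdownAndDrain() {
        discoveryPool.shutdown();
        crawlPool.shutdown();
        try {
            discoveryPool.awaitTermination(10, TimeUnit.SECONDS);
            crawlPool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        List<String> batch = new ArrayList<>();
        feedQueue.drainTo(batch);
        return batch;
    }
}
```

A long-running feeder would more likely loop on `feedQueue.take()` instead of draining once at shutdown; the one-shot drain here only keeps the example small.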

Also, an "index now" type of feature would be beneficial when the user wants to 
force-feed a newly added site.
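
One possible way to get "index now" behaviour, sketched under the assumption that sites waiting to be crawled sit in a priority queue: urgent entries jump ahead of regular ones, and entries of the same kind keep their arrival order. `CrawlQueueWithIndexNow` and its methods are hypothetical names, not existing connector API.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of an "index now" crawl queue. */
class CrawlQueueWithIndexNow {
    private static final class Entry implements Comparable<Entry> {
        final String siteUrl;
        final boolean urgent;
        final long seq; // arrival order, used as a tie-breaker

        Entry(String siteUrl, boolean urgent, long seq) {
            this.siteUrl = siteUrl;
            this.urgent = urgent;
            this.seq = seq;
        }

        @Override
        public int compareTo(Entry other) {
            if (urgent != other.urgent) {
                return urgent ? -1 : 1; // urgent entries sort first
            }
            return Long.compare(seq, other.seq); // otherwise first in, first out
        }
    }

    private final PriorityBlockingQueue<Entry> queue = new PriorityBlockingQueue<>();
    private final AtomicLong counter = new AtomicLong();

    /** Regular scheduling: the site waits its turn. */
    void enqueue(String siteUrl) {
        queue.add(new Entry(siteUrl, false, counter.getAndIncrement()));
    }

    /** "Index now": the site is crawled before all regular entries. */
    void indexNow(String siteUrl) {
        queue.add(new Entry(siteUrl, true, counter.getAndIncrement()));
    }

    /** Returns the next site to crawl, or null if nothing is queued. */
    String next() {
        Entry e = queue.poll();
        return e == null ? null : e.siteUrl;
    }
}
```

Because regular entries are only delayed, never dropped, heavy use of `indexNow` slows down the normal crawl order but does not lose any site.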

Original comment by j.dars...@gmail.com on 15 Sep 2009 at 8:22

GoogleCodeExporter commented 9 years ago

Original comment by rakeshs101981@gmail.com on 25 Sep 2009 at 2:10

GoogleCodeExporter commented 9 years ago
This issue is filed as Google issue #6513772

Original comment by tdnguyen@google.com on 18 May 2012 at 12:36