googlegsa / sharepoint.v3

Google Search Appliance Connector for SharePoint

Connector stops responding while saving and loading state file. #132

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Authentication requests to the connector time out, which causes the GSA to
throw 500 errors. This is observed to happen while the connector is saving
the state file to disk.

Steps to reproduce:
* Use the connector to crawl a large repository (2 million+ documents across
10,000+ sites).
* Ensure the connector state file has grown large enough (50 MB+).
* The connector saves the state file whenever checkpoint is called. It
spends 30+ minutes on this operation. See the log snippet below.
* During this period, processor utilization increases significantly and
connector-manager stops responding to authentication requests and Admin
Console requests. (Note that the GSA should not send the authentication request
to the connector. A separate bug has been filed against the GSA to address that.)

Expected behavior:
- Ideally, Tomcat should be able to handle multiple simultaneous requests.
- Connector state-file persistence speed needs significant improvement.
- Processor utilization during state-file construction and persistence
should be minimal and should not block other activities.
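
One way to keep persistence from blocking other activities is to hand a snapshot of the state to a low-priority background thread and swap the file into place once it is fully written. The sketch below is only an illustration of that idea, not the connector's actual fix (see Issue 108 for that). AsyncStateSaver and its methods are hypothetical names, and it assumes the state has already been serialized to an XML string.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper (not part of the connector): saves state off the calling thread. */
public final class AsyncStateSaver implements AutoCloseable {
    // A single low-priority daemon thread serializes all writes to the state file.
    private final ExecutorService writer = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "state-writer");
        t.setDaemon(true);
        t.setPriority(Thread.MIN_PRIORITY);
        return t;
    });
    private final Path target;

    public AsyncStateSaver(Path target) {
        this.target = target;
    }

    /** Queues the snapshot for writing and returns immediately. */
    public Future<?> save(String snapshotXml) {
        return writer.submit(() -> writeAtomically(snapshotXml));
    }

    private void writeAtomically(String xml) {
        // Write to a sibling temp file first so readers never see a half-written state file.
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try {
            Files.write(tmp, xml.getBytes(StandardCharsets.UTF_8));
            // With ATOMIC_MOVE, replacing an existing target is implementation specific;
            // on typical POSIX file systems this behaves like rename(2).
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        writer.shutdown(); // already queued writes still complete
    }
}
```

The caller only pays for submitting the task; waiting on the returned Future is optional. This does not make the serialization itself cheaper, so it would complement, not replace, a faster state-file format.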

Log Snippet:
Dec 17, 2009 7:25:16 AM [Traverse production-sp-con-instance-name]
com.google.enterprise.connector.sharepoint.spiimpl.SPDocumentList nextDocument
INFO: Sending DocID [ 123 ], docURL [
http://abc.com:80/sites/XYZ/Lists/Test List/DispForm.aspx?ID=123 ] to CM
for ADD.
Dec 17, 2009 7:57:22 AM [Traverse production-sp-con-instance-name]
com.google.enterprise.connector.sharepoint.state.GlobalState saveState
INFO: saving state to
/apps/GoogleConnectors/production/Tomcat/webapps/connector-manager/WEB-INF/connectors/sharepoint-connector/production-sp-con-instance-name/Sharepoint_state.xml
Dec 17, 2009 7:57:25 AM [Traverse production-sp-con-instance-name]
com.google.enterprise.connector.sharepoint.spiimpl.SharepointTraversalManager
setBatchHint
INFO: BatchHint Set to [ 500 ]
Dec 17, 2009 7:57:25 AM [Traverse production-sp-con-instance-name]
com.google.enterprise.connector.sharepoint.spiimpl.SharepointTraversalManager
resumeTraversal
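
For reference, the gap between the last "Sending DocID" entry and the saveState entry in the snippet above can be computed from the two timestamps. This is a minimal sketch, assuming the log timestamps follow the pattern "MMM d, yyyy h:mm:ss a" in a US locale; LogGap and secondsBetween are hypothetical names used only for illustration.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public final class LogGap {
    // Assumed timestamp layout, matching entries such as "Dec 17, 2009 7:25:16 AM".
    private static final DateTimeFormatter LOG_TIME =
        DateTimeFormatter.ofPattern("MMM d, yyyy h:mm:ss a", Locale.US);

    /** Returns the number of seconds elapsed between two log timestamps. */
    public static long secondsBetween(String earlier, String later) {
        LocalDateTime start = LocalDateTime.parse(earlier, LOG_TIME);
        LocalDateTime end = LocalDateTime.parse(later, LOG_TIME);
        return Duration.between(start, end).getSeconds();
    }

    public static void main(String[] args) {
        long s = secondsBetween("Dec 17, 2009 7:25:16 AM", "Dec 17, 2009 7:57:22 AM");
        System.out.println(s / 60 + " min " + s % 60 + " s"); // prints "32 min 6 s"
    }
}
```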

Original issue reported on code.google.com by j.dars...@gmail.com on 18 Dec 2009 at 9:46

GoogleCodeExporter commented 9 years ago

Original comment by j.dars...@gmail.com on 18 Dec 2009 at 9:51

GoogleCodeExporter commented 9 years ago
Check the following thread for details:

http://code.google.com/p/google-enterprise-connector-sharepoint/issues/

Original comment by rakeshs101981@gmail.com on 21 Dec 2009 at 6:24

GoogleCodeExporter commented 9 years ago
Connector authorization should not be affected by any state-file
saving/loading issue. For the state-file issue, refer to Issue 108 at
http://code.google.com/p/google-enterprise-connector-sharepoint/issues/detail?can=1&q=108&colspec=ID%20Type%20Status%20Priority%20CustomersAffected%20Milestone%20Owner%20Summary&id=108

Original comment by th.nitendra on 21 Jan 2010 at 1:30

GoogleCodeExporter commented 9 years ago
Verified the fix on Google SharePoint connector 2.4.4.

Original comment by vishwas....@gmail.com on 12 Feb 2010 at 2:10