PILLUTLAAVINASH / google-enterprise-connector-manager

Automatically exported from code.google.com/p/google-enterprise-connector-manager

HTTP 400 error thrown for correct feeds #75

Closed by GoogleCodeExporter 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Write a connector that crawls content very quickly.
2. Configure two connector instances with a high traversal rate (I chose 1000).
3. Look at the log: from time to time there is an IOException where the GSA
returns an HTTP 400 error:

09.04.2008 14:20:48 com.google.enterprise.connector.pusher.DocPusher take
WARNUNG: Rethrowing IOException as PushException
java.io.IOException: Server returned HTTP response code: 400 for URL: http://192.168.200.11:19900/xmlfeed
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1170)
    at com.google.enterprise.connector.pusher.GsaFeedConnection.sendData(GsaFeedConnection.java:128)
    at com.google.enterprise.connector.pusher.DocPusher.take(DocPusher.java:645)
    at com.google.enterprise.connector.traversal.QueryTraverser.runBatch(QueryTraverser.java:120)
    at com.google.enterprise.connector.scheduler.TraversalScheduler$TraversalWorkQueueItem.doWork(TraversalScheduler.java:359)
    at com.google.enterprise.connector.common.WorkQueueThread.run(WorkQueueThread.java:83)

What is the expected output? What do you see instead?

There should not be any error. I tested the rejected feeds on my own and
they work fine. (I captured them by adding output directly in the
DocPusher.)

What version of the product are you using? On what operating system?

GSA version 5.0.0
Connector Manager 1.0.3 revision 763

Please provide any additional information below.

I found a solution: modifying the DocPusher in the following way fixes
the problem.
 Make the feed connection static and synchronize access to it:

...
private static FeedConnection feedConnection;
...
synchronized (feedConnection) {
   gsaResponse = feedConnection.sendData(dataSource, feedType, is);
}
...

Is this an acceptable way to solve the problem? If there is a solution that
does not require modifying the connector manager, please post it!
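
For illustration only, here is a minimal sketch of the workaround described above, written against hypothetical stand-ins rather than the real connector manager classes. FeedSender, SerializedPusher, and SerializedPusherDemo are assumed names, and the sketch uses modern Java syntax. It locks on a dedicated static final object instead of on the feedConnection field itself, on the assumption that a field which may be null or reassigned is a fragile monitor. The demo counts how many sends overlap across several threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the connector manager's feed connection. */
interface FeedSender {
  String sendData(String dataSource, String feedType, String xml);
}

/** Hypothetical pusher that serializes every send through one shared lock. */
class SerializedPusher {
  // Dedicated lock shared by all instances; it is never null and never reassigned.
  private static final Object SEND_LOCK = new Object();

  private final FeedSender sender;

  SerializedPusher(FeedSender sender) {
    this.sender = sender;
  }

  String push(String dataSource, String feedType, String xml) {
    synchronized (SEND_LOCK) {
      return sender.sendData(dataSource, feedType, xml);
    }
  }
}

class SerializedPusherDemo {
  /** Runs several pushers concurrently and returns the largest number of overlapping sends seen. */
  static int maxConcurrentSends(int threads, int sendsPerThread) {
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger maxSeen = new AtomicInteger();
    // Fake sender that records how many calls are inside sendData at the same time.
    FeedSender sender = (dataSource, feedType, xml) -> {
      int now = inFlight.incrementAndGet();
      maxSeen.accumulateAndGet(now, Math::max);
      Thread.yield();
      inFlight.decrementAndGet();
      return "Success";
    };
    List<Thread> workers = new ArrayList<>();
    for (int t = 0; t < threads; t++) {
      SerializedPusher pusher = new SerializedPusher(sender);
      Thread worker = new Thread(() -> {
        for (int i = 0; i < sendsPerThread; i++) {
          pusher.push("ds", "incremental", "<gsafeed/>");
        }
      });
      workers.add(worker);
      worker.start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return maxSeen.get();
  }

  public static void main(String[] args) {
    System.out.println("max overlapping sends: " + maxConcurrentSends(4, 1000));
  }
}
```

Because every send goes through the same lock, the demo never observes more than one send in flight. The trade-off is that all connector instances then feed strictly one at a time, which may or may not be acceptable at high traversal rates.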

Original issue reported on code.google.com by andree.j...@googlemail.com on 15 Apr 2008 at 11:10

GoogleCodeExporter commented 8 years ago

Original comment by mobe...@gmail.com on 18 Apr 2008 at 10:40

GoogleCodeExporter commented 8 years ago
Thank you for reporting this issue.  I finally have a test case that can
reproduce this and I'm working on a solution.  As your proposed solution
would indicate, this is a threading issue.  It turns out the DocPusher.take()
method is not thread-safe in a couple of ways which need to be addressed.

Original comment by mgron...@gmail.com on 6 May 2008 at 5:01

GoogleCodeExporter commented 8 years ago
r787 | mgronber | 2008-05-07 10:32:20 -0700 (Wed, 07 May 2008) | 15 lines

Fix Issue 75 - Http 400 Error Thrown For Correct Feeds (when using multiple
connectors)

This was related to the DocPusher.take() method not being thread-safe.  I have
removed the use of several instance fields (xmlData, dataSource, and feedType)
from the method and moved the new feedLogRecord into a ThreadLocal.

I have also removed the lazy evaluation and potential DCL problem from the
GsaFeedConnection.

It should be noted that the retrieval of the GsaResponse instance field is not
thread-safe, however, it is only used from tests.

Tested with 2 connectors each feeding 600 documents and no problems.
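
The general shape of the changes described in this commit message can be sketched as follows. This is not the r787 source: FeedTarget, SketchPusher, and SketchPusherDemo are hypothetical names, the record format is invented, and the sketch uses modern Java syntax. It assumes three things taken from the message above: per-call values (xmlData, dataSource, feedType) become locals and parameters instead of instance fields, the per-call log record lives in a ThreadLocal, and the connection is assigned eagerly to a final field so no lazy initialization or double-checked locking (DCL) is needed.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the feed connection. */
interface FeedTarget {
  String send(String dataSource, String feedType, String xmlData);
}

/** Hypothetical, simplified pusher whose take() keeps no per-call state in instance fields. */
class SketchPusher {
  // Assigned once in the constructor: no lazy initialization, so no double-checked locking.
  private final FeedTarget connection;

  // One log buffer per thread instead of one shared instance field.
  private static final ThreadLocal<StringBuilder> FEED_LOG =
      ThreadLocal.withInitial(StringBuilder::new);

  SketchPusher(FeedTarget connection) {
    this.connection = connection;
  }

  String take(String docId, String dataSource, String feedType) {
    // Per-call state lives in locals, so concurrent calls cannot overwrite each other.
    String xmlData = "<record url=\"" + docId + "\"/>";
    StringBuilder log = FEED_LOG.get();
    log.setLength(0);
    log.append(dataSource).append(':').append(feedType).append(':').append(xmlData);
    connection.send(dataSource, feedType, xmlData);
    return log.toString();
  }
}

class SketchPusherDemo {
  /** Returns how many calls produced a log record that did not match the caller's own arguments. */
  static int countMismatches(int threads, int callsPerThread) {
    // One pusher deliberately shared by all threads to exercise take() concurrently.
    SketchPusher pusher = new SketchPusher((dataSource, feedType, xmlData) -> "Success");
    AtomicInteger mismatches = new AtomicInteger();
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      String dataSource = "connector" + t;
      workers[t] = new Thread(() -> {
        for (int i = 0; i < callsPerThread; i++) {
          String docId = dataSource + "/doc" + i;
          String expected = dataSource + ":incremental:<record url=\"" + docId + "\"/>";
          if (!expected.equals(pusher.take(docId, dataSource, "incremental"))) {
            mismatches.incrementAndGet();
          }
        }
      });
      workers[t].start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return mismatches.get();
  }

  public static void main(String[] args) {
    System.out.println("mismatched log records: " + countMismatches(2, 600));
  }
}
```

Unlike a global lock around the send, this approach lets several connectors feed at the same time, since no mutable state is shared between calls.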

Original comment by mgron...@gmail.com on 7 May 2008 at 5:35

GoogleCodeExporter commented 8 years ago

Original comment by mgron...@gmail.com on 19 Jun 2008 at 6:35