PILLUTLAAVINASH / google-enterprise-connector-manager

Automatically exported from code.google.com/p/google-enterprise-connector-manager

Feed file grows without limit #86

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Configure the teedFeedFile in applicationContext.properties.
2. Index a modest-sized repository.

What is the expected output? What do you see instead?

There is no limit on the size of the feed file, so it will grow until a file
system limit is reached.
Several customers have reported files over 1.5 GB. It can be difficult to open
this file for diagnostics (especially on Windows where no built-in tools are
available).

Please use labels and text to provide additional information.

A possible workaround for this issue will be available in the next release of
Connector Manager.
A fix for bug 1067565 supports rotating logs of the URL and metadata without
the content.

Original issue reported on code.google.com by jl1615@gmail.com on 6 May 2008 at 12:53

GoogleCodeExporter commented 8 years ago
Workaround submitted.  I'll let you update the status.

------------------------------------------------------------------------
r783 | mgronber | 2008-05-06 10:51:38 -0700 (Tue, 06 May 2008) | 63 lines

Feature Request 1067565 - Connector Manager: teedFeedFile should not log data
(content) that is being sent to the GSA, only log url + metadata. 
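
As a rough illustration of "only log url + metadata", here is a minimal Java
sketch that drops the content element from a feed record before it is logged.
It is not the DocPusher code from r783, which is not shown in this thread. The
class name, the method name, and the sample record are hypothetical, and the
sketch assumes the content is carried in a <content>...</content> element.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Connector Manager.
class FeedRecordLogSketch {
    // Matches a whole <content ...>...</content> element. DOTALL lets the
    // body span several lines, as base64 payloads often do.
    private static final Pattern CONTENT_ELEMENT =
        Pattern.compile("<content\\b[^>]*>.*?</content>", Pattern.DOTALL);

    /** Returns the feed record XML with every content element removed. */
    static String stripContent(String feedRecordXml) {
        return CONTENT_ELEMENT.matcher(feedRecordXml).replaceAll("");
    }

    public static void main(String[] args) {
        // Made-up sample record for illustration only.
        String record =
            "<record url=\"http://example.com/doc1\" mimetype=\"text/plain\">"
            + "<metadata><meta name=\"author\" content=\"jl\"/></metadata>"
            + "<content encoding=\"base64binary\">aGVsbG8=</content>"
            + "</record>";
        System.out.println(stripContent(record));
    }
}
```

The pattern requires a leading "<", so the content="..." attribute on meta
elements is left alone. A self-closing <content/> element is not matched by
this sketch; it carries no payload, so it would simply pass through.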

To aid in investigating connector feed issues this new feature on the DocPusher
does two things:

1. If the teedFeedFile property is specified in the
   applicationContext.properties file, the associated file will now be created
   automatically (if possible).  Previous behavior only allowed the teed stream
   to be appended to an existing file.

2. In addition to the teedFeedFile, the DocPusher now supports a separate Feed
   Log File.  Similar to the teedFeedFile, customers and developers can use this
   functionality to observe the feed record and metadata information the
   connector manager sends to the GSA.  The main difference is it is logged
   using a rolling FileHandler so the log file size can be controlled, and the
   log record will contain the feed record WITHOUT the content data.
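
For readers unfamiliar with a rolling FileHandler, the following is a minimal
sketch using java.util.logging.FileHandler with a size limit and a file count.
It only illustrates the mechanism described in item 2. The class name, logger
name, file name pattern, and limits are assumptions chosen for the example,
not the values Connector Manager uses.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical example class, not part of Connector Manager.
class RollingFeedLogSketch {
    /**
     * Creates a logger backed by a rotating set of files. The "%g" in the
     * pattern is replaced by the generation number (0 is the newest file).
     */
    static Logger createRollingLogger(String name, String pattern,
                                      int limitBytes, int fileCount)
            throws IOException {
        FileHandler handler = new FileHandler(pattern, limitBytes, fileCount, true);
        handler.setFormatter(new SimpleFormatter());
        handler.setLevel(Level.ALL);

        Logger logger = Logger.getLogger(name);
        logger.setUseParentHandlers(false); // keep records out of the console
        logger.setLevel(Level.ALL);
        logger.addHandler(handler);
        return logger;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("feedlog").toFile();
        String pattern = new File(dir, "feed%g.log").getPath();

        // Assumed limits: about 1 KB per file, at most 3 files kept.
        Logger logger = createRollingLogger("sketch.feed", pattern, 1024, 3);
        for (int i = 0; i < 200; i++) {
            logger.finer("record url=http://example.com/doc" + i);
        }
        for (String fileName : dir.list()) {
            System.out.println(fileName);
        }
    }
}
```

With a limit and a count set, the oldest generation is discarded on rotation,
so total disk use stays bounded at roughly limit times count. A .lck lock file
may also appear next to the logs while the handler is open.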

There are a couple of ways to enable the Feed Log File.

1. Using the 'feedLoggingLevel' property.

   a. Edit the applicationContext.properties file deployed in the
      webapps/connector-manager/WEB-INF/ directory and set the
      'feedLoggingLevel' property to ALL: 

# This property controls the logging of the feed record to a log file.  The log
# record will contain the feed XML without the content data.  Set this property
# to ALL to enable feed logging, OFF to disable.  Customers and developers can
# use this functionality to observe the feed record and metadata information
# the connector manager sends to the GSA.
feedLoggingLevel=ALL

   b. Restart Tomcat.

   Feed Log records should start appearing in the
   $CATALINA_BASE/logs/google-connectors.feed%g.log file as they are sent to the
   GSA. 

2. Using a logging.properties configuration file.

   a. Edit the logging.properties file currently being used by the Connector
      Manager and add the following: 

       com.google.enterprise.connector.pusher.DocPusher.FEED_WRAPPER.FEED.level=FINER

   b. Restart Tomcat.
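
Both methods above come down to setting a level on the feed logger. The sketch
below shows how a level value could gate feed records with java.util.logging.
It is an assumption-laden illustration, not the Connector Manager wiring: the
class and method names are hypothetical, and it assumes feed records are
logged at FINER, which is inferred from the logging.properties line above.

```java
import java.util.Properties;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example class, not part of Connector Manager.
class FeedLoggingLevelSketch {
    // Logger name taken from the logging.properties line in this comment.
    static final String FEED_LOGGER_NAME =
        "com.google.enterprise.connector.pusher.DocPusher.FEED_WRAPPER.FEED";

    /** Applies the feedLoggingLevel property (default OFF) to the feed logger. */
    static Logger configure(Properties props) {
        String value = props.getProperty("feedLoggingLevel", "OFF");
        Logger logger = Logger.getLogger(FEED_LOGGER_NAME);
        // Level.parse accepts standard names such as ALL, FINER, INFO, OFF.
        logger.setLevel(Level.parse(value.trim()));
        return logger;
    }

    /** True if a record logged at FINER (assumed feed level) would pass. */
    static boolean feedLoggingEnabled(Logger logger) {
        return logger.isLoggable(Level.FINER);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("feedLoggingLevel", "ALL");
        Logger logger = configure(props);
        System.out.println("feed logging enabled: " + feedLoggingEnabled(logger));
    }
}
```

Under that assumption, ALL and FINER both let feed records through, while OFF
or a coarser level such as INFO suppresses them.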

Original comment by mgron...@gmail.com on 6 May 2008 at 6:03

GoogleCodeExporter commented 8 years ago
The teedFeedFile should accurately represent what has been fed to the GSA, and
therefore this file should probably not be altered.  If users just want to log
some information related to feeds and are concerned about disk space, they
should use the new Feed Logger.

Original comment by mgron...@gmail.com on 18 Jul 2008 at 9:31