I am trying to understand how this will actually get used,
in order to better design an implementation.
Specifically, I am trying to understand how this fits
into a model that includes:
- continuous traversal (as we have now)
- paused traversal (as I just implemented)
- retraverse from scratch (the poorly-named restartConnectorTraversal)
I want to avoid a scenario where, once "traverse once"
is done, it is difficult to "catch up" the traversal at a
later date (without retraversing from scratch).
I am currently leaning toward the following design:
If I get a schedule that has a retryDelay of -1
(but is not disabled), I will traverse from the current
checkpoint until no new content is found, then
automatically set the schedule to disabled.
Re-enabling the schedule at a future time (but leaving
the retryDelay at -1) will allow the traversal to
"catch up" with content added/deleted/modified since
the last traversal. Again, once it reaches the end,
it would automatically reset to disabled.
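A minimal sketch of the behavior described above, not the actual Connector Manager code. SketchSchedule and SketchTraverser are hypothetical stand-ins, and the repository is modeled as a queue of batch sizes purely for illustration:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical, simplified stand-in for a connector schedule. */
class SketchSchedule {
    /** Assumed sentinel: a retryDelay of -1 means "do not poll continuously". */
    static final int POLLING_DISABLED = -1;

    private final int retryDelayMillis;
    private boolean disabled;

    SketchSchedule(int retryDelayMillis, boolean disabled) {
        this.retryDelayMillis = retryDelayMillis;
        this.disabled = disabled;
    }

    int getRetryDelayMillis() { return retryDelayMillis; }
    boolean isDisabled() { return disabled; }
    void setDisabled(boolean disabled) { this.disabled = disabled; }
}

/** Hypothetical traverser; pending content is modeled as a queue of batch sizes. */
class SketchTraverser {
    private final Deque<Integer> pendingBatches = new ArrayDeque<Integer>();
    private int checkpoint;  // number of documents traversed so far

    SketchTraverser(int... batchSizes) {
        for (int size : batchSizes) {
            pendingBatches.add(size);
        }
    }

    /** Simulates content being added to the repository after a traversal. */
    void addContent(int documents) { pendingBatches.add(documents); }

    int getCheckpoint() { return checkpoint; }

    /**
     * Traverses from the current checkpoint until no new content is found.
     * Returns the number of documents traversed in this run.
     */
    int run(SketchSchedule schedule) {
        if (schedule.isDisabled()) {
            return 0;  // a disabled schedule does no work
        }
        int traversed = 0;
        while (!pendingBatches.isEmpty()) {
            int size = pendingBatches.poll();
            checkpoint += size;  // the checkpoint survives across runs
            traversed += size;
        }
        // End of repository reached. With retryDelay == -1 the schedule disables
        // itself; a continuous schedule would wait retryDelay and poll again
        // (the waiting is omitted from this sketch).
        if (schedule.getRetryDelayMillis() == SketchSchedule.POLLING_DISABLED) {
            schedule.setDisabled(true);
        }
        return traversed;
    }
}
```

Because the checkpoint is kept when the schedule disables itself, re-enabling it later picks up only the content added since the last run.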
I can imagine a UI control for the Schedule that is
a combo-box with options like this:
Traverse: "Continuous"
"Once"
"Catch Up"
"Paused"
"Continuous" would specify a retryDelay >= 0.
"Once" might call restartConnectorTraversal to
force a traversal to start at the beginning, then
set the schedule with retryDelay == -1.
"Catch Up" would re-enable the schedule, also with
retryDelay == -1.
"Paused" would disable the schedule.
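The mapping from the four combo-box options to schedule settings could be sketched roughly as follows. TraversalMode, ScheduleSettings, and TraversalModeMapper are hypothetical names for illustration only; the restartFromBeginning flag stands in for a call to restartConnectorTraversal:

```java
/** Hypothetical enum mirroring the four combo-box options. */
enum TraversalMode { CONTINUOUS, ONCE, CATCH_UP, PAUSED }

/** Hypothetical value object describing what the UI would ask for. */
final class ScheduleSettings {
    final boolean restartFromBeginning;  // stands in for restartConnectorTraversal
    final boolean disabled;
    final int retryDelayMillis;

    ScheduleSettings(boolean restartFromBeginning, boolean disabled, int retryDelayMillis) {
        this.restartFromBeginning = restartFromBeginning;
        this.disabled = disabled;
        this.retryDelayMillis = retryDelayMillis;
    }
}

final class TraversalModeMapper {
    /** Assumed sentinel for "traverse to the end, then stop". */
    static final int POLLING_DISABLED = -1;

    private TraversalModeMapper() {}

    /**
     * @param retryDelayMillis the delay to use for CONTINUOUS; it is carried
     *        through unchanged for PAUSED and ignored for ONCE and CATCH_UP
     */
    static ScheduleSettings settingsFor(TraversalMode mode, int retryDelayMillis) {
        switch (mode) {
            case CONTINUOUS:
                if (retryDelayMillis < 0) {
                    throw new IllegalArgumentException("Continuous needs retryDelay >= 0");
                }
                return new ScheduleSettings(false, false, retryDelayMillis);
            case ONCE:
                // Start over at the beginning, then stop at the end.
                return new ScheduleSettings(true, false, POLLING_DISABLED);
            case CATCH_UP:
                // Resume from the current checkpoint, then stop at the end.
                return new ScheduleSettings(false, false, POLLING_DISABLED);
            case PAUSED:
                return new ScheduleSettings(false, true, retryDelayMillis);
            default:
                throw new AssertionError("Unknown mode: " + mode);
        }
    }
}
```

In this sketch, "Once" and "Catch Up" differ only in whether the traversal is first reset to the beginning.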
Specifying re-traversal from the beginning of a repository
seems a bit buried here, so I would like to think up a
better way to present it.
Original comment by Brett.Mi...@gmail.com
on 17 Mar 2009 at 5:17
Fixed 24 March 2009 in Connector Manager revision r1614
Change Log:
----------
M projects/connector-manager/source/java/com/google/enterprise/connector/traversal/QueryTraverser.java
- differentiate between waiting at end of traversal vs waiting after transient errors.
M projects/connector-manager/source/java/com/google/enterprise/connector/traversal/Traverser.java
- differentiate between waiting at end of traversal vs waiting after transient errors.
- define a retry interval for transient error waits.
M projects/connector-manager/source/java/com/google/enterprise/connector/servlet/SetSchedule.java
- allow negative retryDelayInterval specification.
M projects/connector-manager/source/java/com/google/enterprise/connector/scheduler/TraversalScheduler.java
- differentiate between waiting at end of traversal vs waiting after transient errors.
- if traversal reached end of repository and retryDelay is -1, pause the schedule.
- support new HostLoadManager.connectorFinishedTraversal() interface.
M projects/connector-manager/source/java/com/google/enterprise/connector/scheduler/HostLoadManager.java
- connectorFinishedTraversal() now takes number of milliseconds to wait.
- connectorNameToFinishTime map now stores the finish waiting time rather than
the finish traversal time.
M projects/connector-manager/source/java/com/google/enterprise/connector/scheduler/Schedule.java
- define POLLING_DISABLED constant for retryDelayMillis that indicates that
continuous polling is not desired.
- fix Schedule string parsing to recognize negative retryDelayMillis values.
- add setters for most members.
M projects/connector-manager/source/javatests/com/google/enterprise/connector/scheduler/HostLoadManagerTest.java
- support new HostLoadManager.connectorFinishedTraversal() interface.
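The two kinds of wait named in the change log, and a finish-waiting-time map keyed by connector name, could look roughly like the sketch below. This is not the code from r1614: SketchWaitPolicy, SketchHostLoadManager, the Outcome enum, the shouldDelay method, the explicit nowMillis parameter, and the 15-minute error retry interval are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical policy choosing how long to wait after a traversal run ends. */
final class SketchWaitPolicy {
    /** Assumed sentinel for "continuous polling is not desired". */
    static final int POLLING_DISABLED = -1;
    /** Returned when the schedule should be paused rather than waited on. */
    static final long PAUSE_SCHEDULE = -1L;
    /** Assumed retry interval for transient errors; the real value is not given. */
    static final long ERROR_RETRY_MILLIS = 15 * 60 * 1000L;

    enum Outcome { END_OF_REPOSITORY, TRANSIENT_ERROR }

    private SketchWaitPolicy() {}

    static long waitMillis(Outcome outcome, int retryDelayMillis) {
        if (outcome == Outcome.TRANSIENT_ERROR) {
            // Errors use their own interval, independent of the schedule's retryDelay.
            return ERROR_RETRY_MILLIS;
        }
        if (retryDelayMillis == POLLING_DISABLED) {
            return PAUSE_SCHEDULE;  // end of repository with retryDelay == -1
        }
        return retryDelayMillis;
    }
}

/** Hypothetical load manager that records when each connector's wait ends. */
final class SketchHostLoadManager {
    private final Map<String, Long> finishWaitTimeByConnector = new HashMap<String, Long>();

    /** Stores the time the wait finishes, not the time the traversal finished. */
    void connectorFinishedTraversal(String connectorName, long waitMillis, long nowMillis) {
        finishWaitTimeByConnector.put(connectorName, nowMillis + waitMillis);
    }

    /** True while the connector is still inside its wait period. */
    boolean shouldDelay(String connectorName, long nowMillis) {
        Long finishWaitTime = finishWaitTimeByConnector.get(connectorName);
        return finishWaitTime != null && nowMillis < finishWaitTime;
    }
}
```

Passing the current time in explicitly keeps the sketch deterministic; a real implementation would more likely read the system clock.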
Original comment by Brett.Mi...@gmail.com
on 31 Mar 2009 at 9:23
Original comment by Brett.Mi...@gmail.com
on 16 May 2009 at 9:07
Original issue reported on code.google.com by
jl1615@gmail.com
on 6 Mar 2009 at 9:11