Closed GoogleCodeExporter closed 9 years ago
This sounds a lot like the difference between the following in Spring's JDBC
result set handling, which covers most of the possibilities. Worth seeing that
for prior art of line-by-line processing?
* Guava Byte/LineProcessor - boolean processLine(String), T getResult() - you
get a callback for each row/segment of data and are expected to have a returned
accumulator
* RowCallbackHandler - void processRow(ResultSet) - you get a callback for each
row and do whatever you want with accumulating, you implement getResult() if
you want it. Someone else controls looping. [1]
* RowMapper - T mapRow(ResultSet, int rowNum) - you get a callback for each row
and return an object for that row. Effectively getResult() returns a List<T> of
the mapped objects. [2]
* ResultSetExtractor - T extractData(ResultSet) - generalised version of the
others. Your implementation does everything: controls looping, gathers and
returns results. [3]
The other very general option is for the row callback to be provided with a
clientObject/accumulator, which maybe was created in the processor as well.
(This is very foldLeft.) You start needing a few generic types for the type of
the accumulator, maybe something else for the ultimately returned type (think
ImmutableList.Builder accumulator -> ImmutableList, for example) and a few
methods like createAccumulator, postProcess(Accumulator) (to build the
ImmutableList, for example). This provides a lot of flexibility but is likely
much more of a change than is warranted.
Are you talking about just removing getResult()? For my part I'd be happy to
take the change, but in such a small interface it seems not likely to be that
big a deal to fix.
[1] -
http://static.springsource.org/spring/docs/3.0.x/javadoc-api/org/springframework
/jdbc/core/RowCallbackHandler.html
[2] -
http://static.springsource.org/spring/docs/3.0.x/javadoc-api/org/springframework
/jdbc/core/RowMapper.html
[3] -
http://static.springsource.org/spring/docs/3.0.x/javadoc-api/org/springframework
/jdbc/core/ResultSetExtractor.html
Original comment by joe.j.kearney
on 2 Nov 2012 at 7:02
> Are you talking about just removing getResult()? For my part I'd be happy to
take the change, but in such a small interface it seems not likely to be that
big a deal to fix.
Yep, basically just changing the interfaces to:
interface ByteProcessor {
boolean processBytes(byte[] buf, int off, int len) throws IOException;
}
interface LineProcessor {
boolean processLine(String line) throws IOException;
}
Anything else can be built on top of those pretty easily, so I don't think I'd
want to go down the road of providing narrower functionality like RowMapper
unless it was clear that mapping individual lines to objects was the most
common use case.
Original comment by cgdecker@google.com
on 2 Nov 2012 at 7:13
Proposed change is quite clean when accumulator T is of type Iterable<E> but we
use LineProcessor with Files.readLines(File file, Charset charset,
LineProcessor<T> callback) method which will be harder to implement for you (if
possible) and for use if you change the interface.
I mean, in our case, getResult method builds some more complex object based on
few private instance variables some of which change state and, since they're
implementation detail of concrete LineProcessor subclass, they should not leak
outside as suggested in this issue. This code:
CacheBuilder.newBuilder().build(new CacheLoader<String, AnnouncementsData>() {
@Override
public AnnouncementsData load(final String indexVersion) throws Exception {
return Files.readLines(new File(getAnnouncementsPath(indexVersion)),
Charsets.UTF_8, new AnnouncementsLineProcessor());
}
plus AnnouncementsLineProcessor class definition with
ImmutableList.Builder<String>, String and SomeObj private variables (plus
boolean flag which is used for some optimization and is enteirly an
implementation detail) would turn into this:
CacheBuilder.newBuilder().build(new CacheLoader<String, AnnouncementsData>() {
@Override
public AnnouncementsData load(final String indexVersion) throws Exception {
// these were AnnouncementsLineProcessor instance variables
final ImmutableList.Builder<String> a = ImmutableList.builder();
final StringBuilder b = new StringBuilder();
// assignment won't work - c is final, so I have to use some wrapper here
final SomeObjWrapper c;
// this flag leaks out of LineProcessor impl and can't be simple boolean since is final
final AtomicBoolean checked = AtomicBoolean(false);
Files.readLines(new File(getAnnouncementsPath(indexVersion)),
Charsets.UTF_8, new LineProcessor() {
@Override
public boolean processLine(final String input) throws IOException {
/* must be inline */
} });
return new AnnouncementsData(a.build(), b.toString(), c.get());
}
I am not saying it's not possible for me to change my LineProcessor class, but
having getResult method has pros - like hiding implementation concrete
LineProcessor details and Filles.readLines method - in exchange for one more
method in API. Correct me if I'm wrong but I guess it's a discussion of
advantages and disadvantages of few different line-by-line processing
philosophies just as joe.j.kearney wrote.
Original comment by xaerx...@gmail.com
on 12 Nov 2012 at 12:51
@xaerxess: Keep in mind that you could keep your AnnouncementsLineProcessor
pretty much as it is now, just with getResult() not being part of the
interface. You'd just need to declare it as a variable before calling
readLines, then call getResult() on it yourself afterwards.
Original comment by cgdec...@gmail.com
on 12 Nov 2012 at 1:57
Yeah, I guess I've overlooked this solution... Then it would be cleaner API but
slightly more dirty usage:
CacheBuilder.newBuilder().build(new CacheLoader<String, AnnouncementsData>() {
@Override
public AnnouncementsData load(final String indexVersion) throws Exception {
final AnnouncementsLineProcessor processor = new AnnouncementsLineProcessor(); // cannot use LineProcessor here
Files.readLines(new File(getAnnouncementsPath(indexVersion)), Charsets.UTF_8, processor);
return processor.getResult();
}
but seems reasonable for me now.
Probably it's not the best place to ask but would it be a good idea (just for
my code) to create marker interface extending LineProcessor with getResult
method (or also extending Supplier<T> maybe?) if you changed the API and I
wanted to program to an interface instead of class?
Original comment by xaerx...@gmail.com
on 12 Nov 2012 at 7:18
This issue has been migrated to GitHub.
It can be found at https://github.com/google/guava/issues/<id>
Original comment by cgdecker@google.com
on 1 Nov 2014 at 4:13
Original comment by cgdecker@google.com
on 1 Nov 2014 at 4:18
Original comment by cgdecker@google.com
on 3 Nov 2014 at 9:08
Original issue reported on code.google.com by
cgdecker@google.com
on 2 Nov 2012 at 4:43