antlr / stringtemplate4

StringTemplate 4
http://www.stringtemplate.org
Other
955 stars 231 forks source link

Working with Iterator<T> as parameter shows unexpected results. #255

Closed abego closed 4 years ago

abego commented 4 years ago

I stumbled across some code in Interpreter#length

else if ( v instanceof Iterable || v instanceof Iterator ) {
    Iterator<?> it = v instanceof Iterable ? ((Iterable<?>)v).iterator() : (Iterator<?>)v;
    i = 0;
    while ( it.hasNext() ) {
        it.next();
        i++;
    }
}

When v is an Iterator (not an Iterable) v is assigned to it and it is directly used to calculate the length, using it.next(). However that consumes the Iterator. I.e. the length method will only return the correct result with the first call, after that it will return 0 for the same Iterator. Not something I would have expected.

This made me thinking how Iterators behave in "normal" template applications.

As it turns out, the "consuming" aspect of Iterator#next results in strange results here too.

E.g. assume this case

List<String> names = new ArrayList<String>();
names.add("Ter");
names.add("Tom");
names.add("Sriram");

ST e = new ST(
    "[<length(names)> items] <names>"
);
// NOTE: don't pass in the List, but an Iterator on the list
e.add("names", names.iterator());

If you now render the ST e I would have expected

[3 items] TerTomSriram

However the current code return [3 items].

If we change the template and make the length call follow the template apply, i.e.

ST e = new ST(
    "<names> [<length(names)> items]"
);

I would have expected

TerTomSriram [3 items]

however I get TerTomSriram [0 items]

I was just wondering if you are aware of this behaviour.

I think it is possible to "cure" this and make the behaviour more "expected". However it would involve some caching, proxying etc. that may result in performance hits (and incompatibility with old outputs).

If you are interested in fixing this issue I would be glad to help.

BTW: you will find three new test cases covering this issue in the "iterator" branch of my StringTemplate4 fork (https://github.com/abego/stringtemplate4/tree/iterator)

parrt commented 4 years ago

Hi! Yeah,Not the best behavior but consumption is a side effect of iteration. Is there a way to go through an iterator without consuming / moving the cursor?

abego commented 4 years ago

Is there a way to go through an iterator without consuming / moving the cursor?

No, there is not way.

If we want to fix this issue I thought of introducing a "proxy" for the Iterator we get from outside. Here the rough idea:

The proxy implements the "Iterator" interface and gets the original Iterator X as input. The first time the proxy's next is called it will create a List and fill the list with the items of X, thus consuming it. The real iteration then uses a "new" Iterator of the List.

The first time we encounter an Iterator we create it's proxy and use that proxy instead of the Iterator. There is also a "cache", a (weak) map with the original Iterator as key and the proxy as value so we know if there is already a proxy for a specific Iterator. If we need to work on an Iterator that has already a proxy we ask the proxy for a "new Iterator" and use that instead of the original Iterator. (As explained, the "new" Iterator we get from the proxy is an Iterator on the proxy's internal List.)

Now we need to check for places that may work on "outside" Iterators, so we can apply the proxy approach appropriately. The first one I know of is the length method in the Interpreter. Another one is probably convertAnythingIteratableToIterator. That are all places that come into my mind, but you know the code better, I guess :-)

There are some additional details one needs to cover in the implementation, e.g. the case that the same Iterator is active in more than two places at the same time. The simplest example that comes into my mind is a template like <iterator:{x|<length(iterator)> }: the "outer" iterator renders the anonymous template for each of it's items, and the inner template uses length on the same iterator. The output should be n times "n ", where n is the number of items in the Iterator.

Does that all make sense?

BTW: have you ever heard complains regarding this issue? It might be this is not really an issue because nobody is "so stupid" to pass in a Iterator as parameter. Maybe the easiest fix is just to document the issue and ask the users not to pass in Iterators.

(Just to clarify: we only need the proxy for the Iterators we get from outside. The ones we create ourself from Iterables, e.g. in convertAnythingIteratableToIterator, can be used directly, don't need a proxy).

parrt commented 4 years ago

Well,It sounds like a lot of work to fix a problem that's not really a problem. I think the documentation just needs to make it more clear that iterators are consumed when they are passed in. This is the first complaint I've heard of it, although it did bite me once ;)

abego commented 4 years ago

I think the documentation just needs to make it more clear that iterators are consumed when they are passed in.

That's fine with me.

Shall I leave this issue open, for you as a reminder? Or just close it?

parrt commented 4 years ago

Well,I guess I'm unlikely to fix it or bother thinking about a fix so let's close it. Thanks for pointing this out though. Could be useful to others to know this.