eh3rrera / ocpj8-book

Study guide for the Oracle Certified Professional, Java SE 8 Programmer Exam (1Z0-809)

chapter 17: map returns stream, and does NOT see List<T> but <T> #53

Closed txpokey closed 6 years ago

txpokey commented 6 years ago

On pp. 408-409, the following code:

    Stream.of('a', 'b', 'c', 'd', 'e')
            .map(c -> (int)c)
            .forEach(i -> System.out.format("%d ", i));

The output:

    97 98 99 100 101

is discussed in two contexts, first with a single stream and then with a merged stream. In the latter use case, the claim is that flatMap() is required because map() sees List<T> and not <T>.

The problem is that when map() is applied in the first case, it has a functional descriptor of Character -> Integer, which is very different from when it's applied to the merged streams. But if you re-read the discussion, it's just vague enough to lead one to think that map() is seeing a Stream of lists in both contexts. Regardless, the passage misspeaks wherever it claims that map() is NOT returning a Stream<T>.

The net result undermines the use case for flatMap().

Here are direct quotes that are at issue:

From its signature (and their primitive versions' signature) we can see that in contrast to map() (that returns a single value), flatMap() must return a Stream.

And the book continues, incorrectly claiming:

And we want to convert all the characters to their int representation, we can't use map() anymore: stream .map(c -> (int)c) Because (as each element of the stream is passed to map ) c represents an object of type List<Character> , not Character .

I've attached the code I put together to explore this issue.

I refactored that code using JetBrains Idea 2018.1.1, openjdk version "1.8.0_152", on Linux Mint 18 Linux 4.4.0-21-generic #37-Ubuntu SMP. The code clearly showed the map in the 1st context is operating within a Stream<Character> and supplying a Stream<Integer>.

I confirmed that I am using the 2018-1-10 version of the book on Leanpub.

HTH

CollectionsExplorationWithStreamsPeekAndMap.zip

txpokey commented 6 years ago

Also, I should add that the following sentence in that same section needs correcting, as it also claims map() returns a single element and not a stream, when in fact map() does return a Stream<T>:

This way, with you can convert a Stream<List<Object>> to Stream<Object> . However, the important concept is that this method returns a stream and not a single element (as map() does).

And as with my entire commentary here, I think the more accurate way to parameterize these types is by using <T> and not <Object>. Technically, <Object> is not really type safe, whereas the <T> notation and syntax is specifically designed for type-safe parameterized operations.
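
To illustrate the <T> versus <Object> point, here is a minimal sketch of a generic flattening helper. The class name FlattenSketch and the method flatten are hypothetical and do not come from the book; the only library call relied on is Stream.flatMap().

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlattenSketch {
    // Hypothetical helper: T is inferred at each call site, so the element
    // type of the inner lists carries through to the flattened stream.
    static <T> Stream<T> flatten(Stream<List<T>> nested) {
        return nested.flatMap(List::stream);
    }

    public static void main(String[] args) {
        List<Character> aToD = Arrays.asList('a', 'b', 'c', 'd');
        List<Character> eToG = Arrays.asList('e', 'f', 'g');
        // No cast is needed: the result is a List<Character>, not a List<Object>.
        List<Character> flat = flatten(Stream.of(aToD, eToG))
                .collect(Collectors.toList());
        System.out.println(flat); // prints [a, b, c, d, e, f, g]
    }
}
```

Had the signature been written with Object, the caller would have to cast each element back to Character, which is the loss of type safety described above.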

eh3rrera commented 6 years ago

Hi Duncan,

Thanks for taking the time to comment about this.

I agree, those paragraphs need a rewrite, and about your second comment, you're right, it should be T and not Object. However, I'm not sure I'm following you on the rest of the points you made.

I think I didn't express myself correctly, which is the cause of the confusion.

The sample code used with map() isn't exactly the same as the code in the flatMap() example, so I don't think the same code is discussed in the two contexts. What I wanted to show is that the map() method cannot be used in the second example.

From its signature (and their primitive versions' signature) we can see that in contrast to map() (that returns a single value), flatMap() must return a Stream.

Actually, I was referring to the instance of Function that map() and flatMap() take as a parameter:

map(Function<? super T,? extends R> mapper)
flatMap(Function<? super T,
   ? extends Stream<? extends R>> mapper)

So I can change that paragraph to something like:

From the type of the parameter flatMap() takes (and their primitive versions) we can see that in contrast to map() (the Function it takes returns a single value), flatMap's Function must return a Stream.
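
As a rough sketch of that difference, the example below passes one and the same Function to both methods. The class, constant, and method names are made up for illustration; only Stream.map(), Stream.flatMap(), and Stream.count() are real API calls.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Stream;

class MapperTypesSketch {
    // A Function whose result happens to be a Stream.
    static final Function<List<Character>, Stream<Character>> TO_STREAM = List::stream;

    // map() wraps whatever the Function returns: one nested stream per list.
    static long countWithMap(List<List<Character>> lists) {
        Stream<Stream<Character>> nested = lists.stream().map(TO_STREAM);
        return nested.count();
    }

    // flatMap() requires the Function to return a Stream and merges the results.
    static long countWithFlatMap(List<List<Character>> lists) {
        Stream<Character> flat = lists.stream().flatMap(TO_STREAM);
        return flat.count();
    }

    public static void main(String[] args) {
        List<List<Character>> lists = Arrays.asList(
                Arrays.asList('a', 'b', 'c', 'd'),
                Arrays.asList('e', 'f', 'g'));
        System.out.println(countWithMap(lists));     // prints 2
        System.out.println(countWithFlatMap(lists)); // prints 7
    }
}
```

Both calls return a Stream; what changes is the element type, Stream<Character> for map() and Character for flatMap().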

Next, about:

In the later usecase, the claim is that a flatmap() is required because the map sees List<T> and not <T>.

Well, yes. In the second example, the stream reference is a Stream<List<Character>>, so the following code is invalid because we cannot cast List<Character> to int:

stream.map(c -> (int)c)

So I don't understand why you say the following is incorrect:

And we want to convert all the characters to their int representation, we can't use map() anymore: stream .map(c -> (int)c) Because (as each element of the stream is passed to map ) c represents an object of type List<Character> , not Character .

In addition to changing the paragraph From the type of the parameter... (well, if you think that's ok), what other changes do you suggest to make the text clearer and unambiguous?

Thanks!

txpokey commented 6 years ago

I'm really glad to hear from you and I hope I can help clarify this a bit further, now that I saw your instant reply.

As to the quote:

And we want to convert all the characters to their int representation, we can't use map() anymore: stream .map(c -> (int)c)

it's not accurate in context, as map() is again used -- for exactly the same purpose -- once flatMap() has done its job with the new input data scenario.
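
A minimal sketch of that sequence, assuming the two-list input from the book's example. The class name FlatMapThenMapSketch and the method toCharCodes are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapThenMapSketch {
    static List<Integer> toCharCodes(List<List<Character>> lists) {
        return lists.stream()
                // .map(c -> (int) c)  // would not compile here: c is a List<Character>
                .flatMap(List::stream) // Stream<List<Character>> -> Stream<Character>
                .map(c -> (int) c)     // Stream<Character> -> Stream<Integer>
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Character>> lists = Arrays.asList(
                Arrays.asList('a', 'b', 'c', 'd'),
                Arrays.asList('e', 'f', 'g'));
        System.out.println(toCharCodes(lists)); // prints [97, 98, 99, 100, 101, 102, 103]
    }
}
```

So map() is not ruled out in the second scenario; it just cannot be the first operation applied to a stream of lists.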

So the main part of the lingering problem is, I believe, that you need to abandon the wording that claims "one object" is being returned by (e.g.) map(). In reality, a single object is being returned in either case, but with very different type signatures.

(BTW: this "one object" claim also appears in the "Key Points" section at chapter's end. I just noticed that.)

Here's the deal: it's all about identifying and tracking changes in type signatures as execution flows through that code.

  1. In BOTH cases, map() or flatMap() returns some kind of Stream<T>, but actually Stream<Integer> and Stream<Character>, respectively.
  2. In BOTH cases, map() or flatMap() takes some kind of Function as its argument: but each Function has different parameters. Very very different. Function<Character, Integer> vs. Function<List<Character>, Stream<? extends Character>>, respectively.
  3. So your original point is getting lost: i.e., in either case, there are very different class-types in each usecase's Stream immediately AFTER the stream is created. And that difference is why flatMap() is needed.

Let's take the two use case scenarios, in an apples-to-apples comparison, by asking IntelliJ to use its rewrite rules to spell out the type signatures explicitly.

1st the original map() usecase code:

    public void mapUseCase() {
        Stream.of('a', 'b', 'c', 'd', 'e')
                .map(c -> (int)c)
                .forEach(i -> System.out.format("%d ", i));
    }

now I have that same code rewritten to spell out all the type signatures explicitly:

    public void mapUseCaseRefactoredForFunctionalSignatureClarity() {
        Stream<Character> characterStream = Stream.of('a', 'b', 'c', 'd', 'e');
        Function<Character, Integer> characterIntegerFunction = c -> (int) c;
        Stream<Integer> integerStream = characterStream
                .map(characterIntegerFunction);
        integerStream
                .forEach(i -> System.out.format("%d ", i));
    }

okay, now let's repeat that exact same approach for the flatMap() usecase.

    public void flatmapUseCase() {
        List<Character> aToD = Arrays.asList('a', 'b', 'c', 'd');
        List<Character> eToG = Arrays.asList('e', 'f', 'g');
        Stream.of(aToD, eToG)
                .flatMap(l -> l.stream())
                .map(c -> (int) c)
                .forEach(i -> System.out.format("%d ", i));
    }

and now that same code rewritten:

    public void flatmapUseCaseRefactoredForFunctionalSignatureClarity() {
        List<Character> aToD = Arrays.asList('a', 'b', 'c', 'd');
        List<Character> eToG = Arrays.asList('e', 'f', 'g');
        Stream<List<Character>> stream = Stream.of(aToD, eToG);
        Function<List<Character>, Stream<? extends Character>> listStreamFunction = l -> l.stream();
        Stream<Character> characterStream = stream
                .flatMap(listStreamFunction);
        Function<Character, Integer> characterIntegerFunction = c -> (int) c;
        Stream<Integer> integerStream = characterStream
                .map(characterIntegerFunction);
        integerStream
                .forEach(i -> System.out.format("%d ", i));
    }
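
The primitive versions mentioned in the book's quote follow the same pattern. As a sketch only (the class and method names are hypothetical), flatMapToInt() takes a Function that must return an IntStream, and itself returns an IntStream:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

class PrimitiveFlatMapSketch {
    static int[] toCharCodes(List<List<Character>> lists) {
        IntStream codes = lists.stream()
                // The mapper returns an IntStream for each inner list.
                .flatMapToInt(l -> l.stream().mapToInt(c -> (int) c));
        return codes.toArray();
    }

    public static void main(String[] args) {
        List<List<Character>> lists = Arrays.asList(
                Arrays.asList('a', 'b', 'c', 'd'),
                Arrays.asList('e', 'f', 'g'));
        System.out.println(Arrays.toString(toCharCodes(lists)));
        // prints [97, 98, 99, 100, 101, 102, 103]
    }
}
```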

So if this helps, please feel free to use any or all of this stuff. And please do not hesitate to ask for further clarification or discussion. I don't have an agenda, except learning this stuff.

p.s. awesome book.

eh3rrera commented 6 years ago

Thanks a lot for taking the time to explain this in a detailed way.

I see your point, thanks.

Unfortunately, I'm constrained by space (mainly due to the print version), and I cannot use all of your examples.

So how about this:

  1. I delete the following parts (in the key points section too):

    From its signature (and their primitive versions' signature) we can see that in contrast to map() (that returns a single value), flatMap() must return a Stream. ... we can't use map() anymore: stream.map(c -> (int)c) Because (as each element of the stream is passed to map) c represents an object of type List<Character>, not Character. ... However, the important concept is that the function used by this method returns a stream and not a single element (as map() does).

  2. After the flatmap example, I use an adapted version of your great summary:

    In BOTH cases, map() or flatMap() returns some kind of Stream<T>, but actually Stream<Integer> and Stream<Character>, respectively. In BOTH cases, map() or flatMap() takes some kind of Function as its argument: but each Function has different parameters. Very very different. Function<Character, Integer> vs. Function<List<Character>, Stream<? extends Character>>, respectively.

What do you think?

Thanks again!

txpokey commented 6 years ago

sounds great. (sorry I didn't get back sooner.... )

eh3rrera commented 6 years ago

No problem, I have changed the section, thanks a lot! 👍

txpokey commented 6 years ago

Thanks for working with me on these changes. I think it definitely is clearer now...

eh3rrera commented 6 years ago

Yes, it's clearer now. Thanks to you!