gbaswath / effective-java-programming

Effective Java - 3rd Edition Source Code & its Documentation with Test Cases
MIT License

Item 45: Use streams judiciously #4

Closed gbaswath closed 2 years ago

gbaswath commented 2 years ago

The Streams API was added in Java 8 to ease the task of performing bulk operations, sequentially or in parallel.

Highlights

gbaswath commented 2 years ago

Stream API

The Stream API has two key abstractions:

  1. Stream - a finite or infinite sequence of data elements.
  2. Stream pipeline - a multistage computation on those elements.

The Stream API is fluent: it is designed to allow all of the calls that comprise a pipeline to be chained into a single expression. In fact, multiple pipelines can be chained together into a single expression. By default, streams run sequentially, but a pipeline can be made to run in parallel by invoking the parallel method on any stream in it.
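As a rough illustration of switching a pipeline from sequential to parallel execution, here is a minimal sketch. The class and method names (ParallelSketch, sumSequential, sumParallel) are made up for this example and do not come from the repository.

```java
import java.util.stream.LongStream;

class ParallelSketch {
    // Sequential by default: sums 1..n on the calling thread.
    static long sumSequential(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    // Same pipeline, switched to parallel execution with parallel().
    static long sumParallel(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println(sumSequential(100));
        System.out.println(sumParallel(100));
    }
}
```

Both methods return the same value; only the execution strategy differs.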

Stream

A stream represents a finite or infinite sequence of data elements, which can be either object references or primitive values. The elements can come from anywhere: collections, arrays, files, number generators, regular expression pattern matchers, and other streams. The supported primitive types are int, long, and double.
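The sketch below shows a few of these sources side by side. It is only an illustration; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class StreamSources {
    // Stream of object references backed by a collection.
    static long countFromCollection(List<String> words) {
        return words.stream().count();
    }

    // Primitive IntStream backed by an array.
    static int sumFromArray(int[] values) {
        return Arrays.stream(values).sum();
    }

    // Primitive IntStream from a number generator: 1..n inclusive.
    static int sumOfRange(int n) {
        return IntStream.rangeClosed(1, n).sum();
    }

    // Infinite stream of object references, truncated with limit().
    static List<Integer> firstPowersOfTwo(int count) {
        return Stream.iterate(1, x -> x * 2)
                .limit(count)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(countFromCollection(List.of("petals", "staple")));
        System.out.println(sumFromArray(new int[] {1, 2, 3}));
        System.out.println(sumOfRange(4));
        System.out.println(firstPowersOfTwo(4));
    }
}
```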

Stream pipeline

A stream pipeline represents a multistage computation on these data elements. It consists of

  1. Source stream, followed by
  2. Zero or more intermediate operations and
  3. Finally one terminal operation.
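The three parts listed above can be sketched as follows. The helper longWordsUpperCased and its word list are assumptions made for illustration only.

```java
import java.util.List;
import java.util.stream.Collectors;

class PipelineSketch {
    // Hypothetical helper: upper-case the words longer than three characters.
    static List<String> longWordsUpperCased(List<String> words) {
        return words.stream()                    // source stream
                .filter(w -> w.length() > 3)     // intermediate operation
                .map(String::toUpperCase)        // intermediate operation
                .collect(Collectors.toList());   // terminal operation
    }

    public static void main(String[] args) {
        System.out.println(longWordsUpperCased(List.of("cat", "petals", "staple", "ox")));
    }
}
```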

Intermediate Operation

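Each intermediate operation transforms the stream in some way, such as mapping each element to a function of that element or filtering out elements that don't satisfy a condition.

All intermediate operations transform one stream into another, whose element type may be the same as that of the input stream or different from it.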

Terminal Operation

It performs a final computation on the stream resulting from the last intermediate operation, such as producing a result, printing the elements, or storing them in a collection.

Lazy Evaluation

Stream pipelines are evaluated lazily: evaluation doesn't start until the terminal operation is invoked, and data elements that aren't required to complete the terminal operation are never computed. This lazy evaluation is what makes it possible to work with infinite streams. A stream pipeline without a terminal operation is a silent no-op, so don't forget to include one.
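A small sketch of both points, assuming a hypothetical LazySketch class: an infinite source that is only evaluated as far as the terminal operation needs, and a pipeline with no terminal operation that does nothing at all. The seen list exists only to record which elements were actually pulled from the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazySketch {
    // Records the elements that the pipeline actually touched.
    static final List<Integer> seen = new ArrayList<>();

    // The source is infinite, yet the pipeline terminates because
    // limit(3) stops pulling elements once three results exist.
    static List<Integer> firstThreeEvenSquares() {
        seen.clear();
        return Stream.iterate(1, n -> n + 1)
                .peek(seen::add)
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .limit(3)
                .collect(Collectors.toList());
    }

    // No terminal operation: nothing is evaluated, so nothing is recorded.
    static int elementsSeenWithoutTerminal() {
        seen.clear();
        Stream.iterate(1, n -> n + 1)
                .peek(seen::add)
                .filter(n -> n % 2 == 0);
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(firstThreeEvenSquares());
        System.out.println("elements pulled from the source: " + seen.size());
        System.out.println("without terminal operation: " + elementsSeenWithoutTerminal());
    }
}
```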

gbaswath commented 2 years ago

Streams Usage

We need to write a function that groups anagrams: different words made up of the same characters in a different order, such as petals and staple.

Iterative way

We iterate over the words, sort each word's characters, and group the words by their sorted form, so that all anagrams end up in the same group.
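One way the iterative approach could look is sketched below. This is not the code from the referenced commit; the names IterativeAnagrams, alphabetize, and group are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class IterativeAnagrams {
    // Returns the characters of the word in sorted order, e.g. "staple" -> "aelpst".
    static String alphabetize(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Groups words under their alphabetized form; anagrams share the same key.
    static Map<String, Set<String>> group(List<String> words) {
        Map<String, Set<String>> groups = new HashMap<>();
        for (String word : words) {
            groups.computeIfAbsent(alphabetize(word), unused -> new TreeSet<>()).add(word);
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(group(List.of("petals", "staple", "plates", "java")));
    }
}
```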

Streams way

We obtain a stream from chars(), apply the intermediate operation sorted() to sort the characters, and finally use the terminal operation collect() to gather them back into a String.
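A minimal sketch of this approach follows. The method name sortUsingStream is the one mentioned later in this thread, but the body here is an assumption about what such a method might look like, not the code from the commit.

```java
class StreamSort {
    // Sorts the characters of a word with a stream and rebuilds a String.
    static String sortUsingStream(String word) {
        return word.chars()                                  // IntStream of char values
                .sorted()                                    // intermediate operation
                .collect(StringBuilder::new,                 // supplier
                         (sb, c) -> sb.append((char) c),     // accumulator (cast int back to char)
                         StringBuilder::append)              // combiner
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(sortUsingStream("staple"));
    }
}
```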

Streams.collect

The three parameters of the collect() function are:

  1. supplier: a function that creates a new mutable result container. For parallel execution, this function may be called multiple times, and it must return a fresh value each time.
  2. accumulator: a stateless function that folds an element into a result container.
  3. combiner: a stateless function that accepts two partial result containers and merges them; it must be compatible with the accumulator function.
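To make the three roles concrete, here is a sketch that collects a stream of words into an ArrayList using the three-argument collect(). The class CollectSketch and its parallel flag are hypothetical and exist only to show that the same three functions serve both execution modes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class CollectSketch {
    static List<String> toList(List<String> words, boolean parallel) {
        Stream<String> stream = parallel ? words.parallelStream() : words.stream();
        return stream.collect(
                () -> new ArrayList<String>(),         // supplier: fresh container per call
                (list, word) -> list.add(word),        // accumulator: fold one element in
                (left, right) -> left.addAll(right));  // combiner: merge partial containers
    }

    public static void main(String[] args) {
        System.out.println(toList(List.of("petals", "staple", "plates"), false));
        System.out.println(toList(List.of("petals", "staple", "plates"), true));
    }
}
```

In a sequential run the combiner is typically never invoked; in a parallel run it merges the partial lists built by different threads.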

Ref - f5df31b

Reference

  1. Stream - Collect
  2. Stream - Collectors

gbaswath commented 2 years ago

Overuse of Streams

To sort the characters of a string, we could write a helper method and invoke it from the stream pipeline instead of doing the sorting directly with streams. Overused streams make code hard to read and maintain.

General Rules

  1. Choose lambda parameter names carefully to increase readability -> Ex: use sb for a StringBuilder
  2. In the absence of explicit types, careful naming of lambda parameters improves readability -> Same as above
  3. Use helper methods to keep stream expressions succinct -> Ex: don't sort a String's characters with a stream; call a helper method from within the stream instead.
  4. Stream pipelines lack explicit type information and named temporary variables, so helper methods matter even more there than in iterative code -> Same as above.
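A sketch of rule 3: the character sorting lives in an ordinary helper method, and the stream only does the grouping. The names AnagramGroups, alphabetize, and group are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AnagramGroups {
    // Helper method: plain array sorting, no stream involved.
    static String alphabetize(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // The pipeline stays short because the helper hides the sorting detail.
    static Map<String, List<String>> group(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(word -> alphabetize(word)));
    }

    public static void main(String[] args) {
        System.out.println(group(List.of("petals", "staple", "java")));
    }
}
```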

When to choose Streams

Streams are a good fit for computations that need to:

  1. Uniformly transform sequence of elements -> Ex: Transform collections
  2. Filter sequence of elements -> Ex: Filter Collections
  3. Combine sequence of elements using single operation -> Ex: Collectors Functions
  4. Accumulate sequence of elements into collection and perhaps group them by some common attribute -> Same as above
  5. Search a sequence of elements for an element satisfying certain criteria -> Ex: Filter Collections
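The five cases above can each be sketched in a line or two. The class StreamFits and its methods are hypothetical illustrations, one method per case, in the same order as the list.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class StreamFits {
    // 1. Uniformly transform a sequence of elements.
    static List<Integer> lengths(List<String> words) {
        return words.stream().map(String::length).collect(Collectors.toList());
    }

    // 2. Filter a sequence of elements.
    static List<String> longerThan(List<String> words, int n) {
        return words.stream().filter(w -> w.length() > n).collect(Collectors.toList());
    }

    // 3. Combine a sequence of elements using a single operation (here: addition).
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    // 4. Accumulate into a collection, grouped by a common attribute.
    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    // 5. Search for an element satisfying some criterion.
    static Optional<String> firstStartingWith(List<String> words, char c) {
        return words.stream().filter(w -> !w.isEmpty() && w.charAt(0) == c).findFirst();
    }

    public static void main(String[] args) {
        List<String> words = List.of("petals", "java", "staple", "ox");
        System.out.println(lengths(words));
        System.out.println(longerThan(words, 3));
        System.out.println(totalLength(words));
        System.out.println(byLength(words));
        System.out.println(firstStartingWith(words, 's'));
    }
}
```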

Ref - f5df31b

gbaswath commented 2 years ago

Refrain from using chars in streams

The sortUsingStream() method sorts the characters of a word with the stream's sorted() method: word.chars().sorted() does produce the characters in sorted order, but as an IntStream, because chars() returns the characters as int values (Java has no stream of char). Each value must be cast back to char before use, which adds a conversion step and harms readability, so prefer not to process char values with streams.

"Hello World".chars().forEach(c -> System.out.println((char) c));
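The effect of the missing cast is easy to see in a small sketch. The class CharsSketch and its two methods are made up for this illustration; they append to a StringBuilder instead of printing so the difference can be compared directly.

```java
class CharsSketch {
    // Without a cast each element is an int, so its numeric code is appended.
    static String withoutCast(String s) {
        StringBuilder sb = new StringBuilder();
        s.chars().forEach(c -> sb.append(c));         // resolves to append(int)
        return sb.toString();
    }

    // With an explicit cast the original characters come back.
    static String withCast(String s) {
        StringBuilder sb = new StringBuilder();
        s.chars().forEach(c -> sb.append((char) c));  // resolves to append(char)
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withoutCast("Hi")); // prints 72105
        System.out.println(withCast("Hi"));    // prints Hi
    }
}
```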

When not to use Streams

If a computation requires any of the techniques below, streams are probably not the right choice.

gbaswath commented 2 years ago

Reuse stream elements across multiple stages of stream pipeline

It is hard to access an element from an earlier stage later in a multistage pipeline: once a value has been mapped to some other value, the original is lost, and each stage sees only the output of the previous stage. So if the third or a later operation needs the values from the first stage, that is not possible unless they are carried along through the intermediate operations.

Workaround

One workaround is to map each value to a pair object containing the original value and the new value, so that the original is not lost. This is not a satisfying solution, especially when the pair objects are needed across multiple stages of the pipeline: the resulting code is messy and verbose. When it is applicable, a better workaround is to invert the mapping when you need access to the earlier-stage value.
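A sketch of inverting the mapping, modeled on the well-known Mersenne prime example rather than on the referenced commit (whose contents are not shown here). Each prime exponent p is mapped to 2^p - 1, so p itself is gone by the last stage; it can be recovered because the bit length of 2^p - 1 is exactly p. The class and method names are assumptions.

```java
import java.math.BigInteger;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MersenneSketch {
    private static final BigInteger ONE = BigInteger.ONE;
    private static final BigInteger TWO = BigInteger.valueOf(2);

    // Infinite stream of (probable) primes: 2, 3, 5, 7, ...
    static Stream<BigInteger> primes() {
        return Stream.iterate(TWO, BigInteger::nextProbablePrime);
    }

    // Returns "exponent: value" for the first 'count' Mersenne primes.
    static List<String> firstMersennePrimes(int count) {
        return primes()
                .map(p -> TWO.pow(p.intValueExact()).subtract(ONE))  // p is lost here
                .filter(mersenne -> mersenne.isProbablePrime(50))
                .limit(count)
                // Invert the mapping: the bit length of 2^p - 1 is p.
                .map(mp -> mp.bitLength() + ": " + mp)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        firstMersennePrimes(4).forEach(System.out::println);
    }
}
```

No pair object is needed because the earlier value can be computed back from the later one; this only works when such an inverse exists.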

Ref - eed45ba

gbaswath commented 2 years ago

Summary

Some tasks are best accomplished with streams and others with iteration. Many tasks are best accomplished by combining the two approaches. There are no hard and fast rules for choosing which approach to use for which task, but the heuristics above can help you decide. If you are not sure, try both and see which works better.

Ref - 9753480