The current implementation of the groupBy operator in Project Reactor allows for a fixed prefetch size, which is not optimal for scenarios where different group keys have varying data volumes and characteristics. This can lead to inefficient resource management and suboptimal performance.
Desired solution
I propose extending the groupBy operator to accept a function that determines the prefetch size based on the group key. This function would take the group key as an input and return an integer value representing the prefetch size for that specific group.
Example API
<K> Flux<GroupedFlux<K, V>> groupBy(Function<? super T, ? extends K> keyMapper,
Function<? super K, Integer> prefetchMapper);
keySelector: Function to extract the group key from each element.
prefetchMapper: Function to determine the prefetch size based on the group key.
Example Usage
Flux<MyData> dataFlux = ...;
dataFlux.groupBy(
MyData::getKey,
key -> {
if ("highVolumeKey".equals(key)) {
return 256; // larger prefetch for high volume key
} else {
return 32; // default prefetch for other keys
}
})
.flatMap(groupedFlux -> groupedFlux.collectList())
.subscribe(result -> System.out.println("Grouped result: " + result));
Considered alternatives
An alternative approach is to manually implement the grouping logic with custom prefetch management outside the groupBy operator. However, this approach would be more complex and less efficient compared to having built-in support within the groupBy operator.
Additional context
This feature would be beneficial for applications processing streams with highly variable group sizes and characteristics. It would provide a more flexible and optimized way to handle such scenarios, leading to better performance and resource utilization.
Motivation
The current implementation of the
groupBy
operator in Project Reactor allows for a fixed prefetch size, which is not optimal for scenarios where different group keys have varying data volumes and characteristics. This can lead to inefficient resource management and suboptimal performance.Desired solution
I propose extending the
groupBy
operator to accept a function that determines the prefetch size based on the group key. This function would take the group key as an input and return an integer value representing the prefetch size for that specific group.Example API
keySelector
: Function to extract the group key from each element.prefetchMapper
: Function to determine the prefetch size based on the group key.Example Usage
Considered alternatives
An alternative approach is to manually implement the grouping logic with custom prefetch management outside the
groupBy
operator. However, this approach would be more complex and less efficient compared to having built-in support within thegroupBy
operator.Additional context
This feature would be beneficial for applications processing streams with highly variable group sizes and characteristics. It would provide a more flexible and optimized way to handle such scenarios, leading to better performance and resource utilization.