Open pPanda-beta opened 3 years ago
For costly I/O 'list' operations on some cloud storage, this sequential approach is becoming a bottleneck. The following lines of code discover partitions sequentially and recursively.
https://github.com/prestosql/presto/blob/88d5d90aa147e1c170eb3e3d0fa3ab74c5a59d67/presto-hive/src/main/java/io/prestosql/plugin/hive/procedure/SyncPartitionMetadataProcedure.java#L168-L171
This can be easily parallelized by either using
Suggestion: as a basic refactoring, we can start with parallelStream() instead of stream().
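To make the suggestion concrete, here is a minimal sketch of a recursive directory walk that can run with either stream() or parallelStream(). It is not the code from SyncPartitionMetadataProcedure: the class PartitionListingSketch, the methods listDirectories and listPartitions, and the in-memory map standing in for the file system are all hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not the actual Presto implementation.
class PartitionListingSketch {
    // Stand-in for a remote file system: maps a directory to its child directories.
    private final Map<String, List<String>> children;

    PartitionListingSketch(Map<String, List<String>> children) {
        this.children = children;
    }

    // Stand-in for the costly 'list' call against cloud storage.
    List<String> listDirectories(String path) {
        return children.getOrDefault(path, List.of());
    }

    // Returns every directory exactly `depth` levels below `path`.
    // With parallel == true, sibling directories are listed concurrently.
    List<String> listPartitions(String path, int depth, boolean parallel) {
        if (depth == 0) {
            return List.of(path);
        }
        List<String> dirs = listDirectories(path);
        Stream<String> stream = parallel ? dirs.parallelStream() : dirs.stream();
        return stream
                .flatMap(child -> listPartitions(child, depth - 1, parallel).stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        PartitionListingSketch fs = new PartitionListingSketch(Map.of(
                "/t", List.of("/t/a=1", "/t/a=2"),
                "/t/a=1", List.of("/t/a=1/b=x"),
                "/t/a=2", List.of("/t/a=2/b=y", "/t/a=2/b=z")));
        // Both variants collect the same partitions for a two-level layout.
        System.out.println(fs.listPartitions("/t", 2, false));
        System.out.println(fs.listPartitions("/t", 2, true));
    }
}
```

Because the source is an ordered list and the result is collected with Collectors.toList(), the parallel variant returns the partitions in the same order as the sequential one. One caveat worth weighing: parallelStream() runs on the common ForkJoinPool by default, so blocking I/O calls share a pool sized roughly to the number of CPU cores, which may cap the achievable speedup for slow 'list' operations.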
/keep_alive
/keep_fresh