prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

[Do Not Review] Set splitsize for hadoop InputFormat to Presto max_split_size #23635

Open agrawalreetika opened 2 months ago

agrawalreetika commented 2 months ago

Description

Set the split size for the Hadoop InputFormat to Presto's max_split_size. Details in https://github.com/prestodb/presto/issues/23608

Motivation and Context

Make the split size configurable where the Hadoop InputFormat library is used for split generation. Resolves https://github.com/prestodb/presto/issues/23608
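For reviewers less familiar with the Hadoop side, here is a rough, standalone sketch of why a maximum split size matters. It only reproduces the split-size formula used by Hadoop's new-API `FileInputFormat.computeSplitSize` (`max(minSize, min(maxSize, blockSize))`). The class name `SplitSizeSketch`, the helper `countSplits`, and the sample sizes are hypothetical, and how this PR actually passes max_split_size into the Hadoop configuration is not shown in this thread.

```java
// Illustrative only: models Hadoop's split-size formula without depending on Hadoop.
public class SplitSizeSketch {
    // Same arithmetic as Hadoop's new-API FileInputFormat.computeSplitSize.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Hypothetical helper: number of splits for one file by ceiling division.
    // Hadoop additionally lets the last split run slightly over; that is ignored here.
    static long countSplits(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 128 * mb; // assumed file system block size
        long fileSize = 1024 * mb; // assumed file size

        // Without a cap, the block size decides the split size.
        long uncapped = computeSplitSize(blockSize, 1L, Long.MAX_VALUE);
        // With a 64 MB cap (standing in for max_split_size), splits get smaller.
        long capped = computeSplitSize(blockSize, 1L, 64 * mb);

        System.out.println("uncapped splits: " + countSplits(fileSize, uncapped));
        System.out.println("capped splits: " + countSplits(fileSize, capped));
    }
}
```

With these assumed sizes, the uncapped case yields 8 splits and the capped case yields 16, which is the kind of difference a test for this change could check.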

Impact

Make the split size configurable where the Hadoop InputFormat library is used for split generation. Resolves https://github.com/prestodb/presto/issues/23608

Test Plan

Contributor checklist

Release Notes

== NO RELEASE NOTE ==

tdcmeehan commented 2 months ago

Any way to add a test case for this?

agrawalreetika commented 2 months ago

Sure, I will check on how we can add a test for this.

The tests added so far mainly check how split generation is affected when the Hadoop library is used directly.