Closed by paoliniluis 6 years ago
Yes, that is an artifact from early versions. 1GB files worked well originally for our particular data size and partitioning strategy. AWS does recommend 256MB files, but Spectrum will now maximize parallelism with data files of any size!
If you’d like to open a PR to move the default to 256MB, that would be much appreciated!
Done, check PR https://github.com/hellonarrativ/spectrify/pull/31
Thank you @paoliniluis !
Hey! Great tool. Why are you using 1GB files instead of smaller ones, like 100MB? Smaller files should make queries faster, since they allow more parallelism.
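As a rough illustration of the reasoning in the question, the sketch below counts how many files a table would be split into at each target file size mentioned in this thread. The 10GB table size and the `file_count` helper are hypothetical, chosen only for the arithmetic; this does not model how Spectrum actually schedules work, and per the reply above Spectrum may parallelize well regardless of file size.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def file_count(total_bytes: int, target_file_bytes: int) -> int:
    """Files needed if each file holds at most target_file_bytes (ceiling division)."""
    return -(-total_bytes // target_file_bytes)


# Hypothetical table size, used only to compare the target file sizes discussed here.
table_bytes = 10 * GB

for label, size in [("1GB", GB), ("256MB", 256 * MB), ("100MB", 100 * MB)]:
    print(f"{label} target -> {file_count(table_bytes, size)} files")
```

More files means more independent units that could, in principle, be scanned at the same time, which is the intuition behind preferring a smaller default.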