cfsimplicity / spreadsheet-cfml

Standalone library for working with spreadsheets and CSV in CFML
MIT License

add parallelization to queryToCSV #336

Closed (vitamindck closed 10 months ago)

vitamindck commented 10 months ago

Added threads to CSV generation. In my performance testing with 10 MB files it was 2-3x faster with just two threads.

Row order is not guaranteed! Speed is more important than order for my use case.
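To make the trade-off concrete, here is a minimal sketch in Python (not the CFML from this PR) of the general idea: split the rows into chunks, serialise each chunk on a worker thread, and append the results in whatever order they finish. The function names, the `threads` and `chunk_size` parameters, and the chunking strategy are all assumptions made for illustration. The sketch shows the ordering behaviour only and makes no claim about speed.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor, as_completed


def rows_to_csv_chunk(rows):
    """Serialise a list of rows to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def rows_to_csv_parallel(header, rows, threads=2, chunk_size=1000):
    """Build CSV text with a thread pool.

    Chunks are appended in completion order, so the order of data rows
    in the output is NOT guaranteed. The header is always written first.
    """
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    parts = [rows_to_csv_chunk([header])]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(rows_to_csv_chunk, chunk) for chunk in chunks]
        # as_completed yields futures as they finish, not as submitted
        for future in as_completed(futures):
            parts.append(future.result())
    return "".join(parts)
```

Every row still appears exactly once; only the relative position of the chunks can differ between runs.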

vitamindck commented 10 months ago

Tested on CF2021 HotFix 11.

cfsimplicity commented 10 months ago

Thanks for the PR. I considered making queryToCsv() run with parallel threads a while ago but was put off by the issue with ordering. But making it an option seems reasonable.

There are a few issues though:

  1. Parallel iteration is only supported in Lucee and ACF2021+. The library currently still supports ACF2016 and 2018, so I've had to refactor your code to fail gracefully on those older versions.
  2. For some reason I can't get a unit test of the parallel option to work with ACF2021/2023 when running the whole test suite. In isolation it's fine, but TestBox starts behaving weirdly when running that test as part of the whole suite. I've left the test in but disabled it.
  3. Although Lucee runs the tests fine, it can also behave oddly with parallel loops depending on how long each iteration takes. I will need to make it clear in the docs that the option should be used with care and at the developer's own risk!
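
For point 1, the thread doesn't say whether "fail gracefully" means throwing a clear error or quietly running sequentially. As one possible reading, here is a small Python sketch (not the library's CFML) of the fallback variant. The `engine_supports_parallel` flag and both function names are hypothetical, standing in for whatever engine/version check the library actually performs.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def to_csv_line(row):
    """Serialise one row to a CSV line (hypothetical helper)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(row)
    return buffer.getvalue()


def rows_to_csv(rows, threads=1, engine_supports_parallel=True):
    """Use threads only when requested AND supported; otherwise run sequentially.

    engine_supports_parallel is an assumed stand-in for an engine check
    such as "Lucee or ACF2021+".
    """
    use_parallel = threads > 1 and engine_supports_parallel
    if not use_parallel:
        # Fallback path: same output, single thread
        return "".join(to_csv_line(row) for row in rows)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Note: pool.map returns results in input order, unlike the
        # unordered parallel loop discussed in this thread
        return "".join(pool.map(to_csv_line, rows))
```

With this shape, a caller on an unsupported engine gets correct output without having to change their call, at the cost of not being told the threads setting was ignored.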

Please could you give the develop branch a try to make sure it's doing what you expect and report back?

Thanks.

vitamindck commented 10 months ago

I'll test on both tomorrow morning and let you know! Thanks!