nautechsystems / nautilus_trader

A high-performance algorithmic trading platform and event-driven backtester
https://nautilustrader.io
GNU Lesser General Public License v3.0

Order by ts_init to read row groups in order #1656

Closed twitu closed 1 month ago

twitu commented 1 month ago

Pull Request

Use ORDER BY ts_init in the default SQL query. Closes #1515

By default, DataFusion reads row groups out of order for large files. The test files used in CI are small and do not show this behavior, so it was missed. Per the discussion in https://github.com/apache/datafusion/issues/10572, there are two ways to fix this, and using an ORDER BY clause is the recommended one.
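As a rough illustration of the idea (not the code changed in this PR), the sketch below uses Python's built-in sqlite3 module as a stand-in for DataFusion. The helper `build_default_query`, the `quotes` table, and the sample values are all hypothetical. Rows are inserted in two out-of-order batches to mimic row groups arriving out of order, and the ORDER BY ts_init clause restores timestamp order regardless of how they were stored or read.

```python
import sqlite3


def build_default_query(table: str) -> str:
    # Hypothetical helper: the default query with an explicit ORDER BY,
    # so the result order does not depend on storage or read order.
    return f"SELECT * FROM {table} ORDER BY ts_init"


# sqlite3 is only a stand-in here; the PR itself targets DataFusion.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (ts_init INTEGER, price REAL)")

# Two "row groups" inserted out of timestamp order (illustrative data).
row_groups = [
    [(300, 1.3), (400, 1.4)],
    [(100, 1.1), (200, 1.2)],
]
for group in row_groups:
    conn.executemany("INSERT INTO quotes VALUES (?, ?)", group)

rows = conn.execute(build_default_query("quotes")).fetchall()
ts_values = [row[0] for row in rows]
print(ts_values)
```

Sorting in the query keeps the ordering guarantee in one place, at the cost of the engine having to sort (or merge already-sorted inputs) before returning rows.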

Type of change

How has this change been tested?

Tested with sample data to ensure ordering is maintained.