dwreeves closed 1 year ago
Avoiding loading the CSV as an agate table isn't currently possible afaict, but I added a config option called fast that seeds can use to avoid the INSERTs, and it should be backwards compatible with the current system. I'll leave it here for testing purposes for a bit and make it the default for version 1.6.
Very cool! Great job on this.
The issue is that the default seed materialization for dbt loads everything into memory in the Python runtime and then does row-by-row inserts. DuckDB's built-in CSV loading is significantly faster than this approach.
The appropriate solution here is to override the implementation of the seed materialization. The two tricky parts may be (1) avoiding not just the inserts but the Python-side CSV loading entirely, and (2) backwards compatibility.
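
To make the idea concrete, here is a rough sketch of the kind of statement an overridden seed materialization could hand to DuckDB so that DuckDB reads the file itself, with no agate table and no per-row INSERTs. This is only an illustration, not the adapter's actual implementation: the helper names (quote_ident, quote_literal, build_fast_seed_sql) and the table and path values are made up for the example. The only DuckDB features it relies on are CREATE OR REPLACE TABLE ... AS and the read_csv_auto table function.

```python
def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_fast_seed_sql(table: str, csv_path: str) -> str:
    """Build one statement that lets DuckDB load the CSV directly.

    Hypothetical helper for illustration only. The file is parsed by
    DuckDB's own CSV reader, so Python never holds the rows in memory
    and never issues row-by-row INSERTs.
    """
    return (
        f"CREATE OR REPLACE TABLE {quote_ident(table)} AS "
        f"SELECT * FROM read_csv_auto({quote_literal(csv_path)})"
    )


if __name__ == "__main__":
    # Example values are assumptions, not taken from any real project.
    print(build_fast_seed_sql("raw_customers", "seeds/raw_customers.csv"))
```

With the duckdb Python package installed, the resulting string could be run with something like duckdb.connect().execute(sql). Note that letting DuckDB infer column types may not match what the agate-based path produces, which is one reason backwards compatibility is the tricky part.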