scriban / scriban

A fast, powerful, safe and lightweight scripting language and engine for .NET
BSD 2-Clause "Simplified" License

Using Scriban with very large (millions of rows) data sets results in OutOfMemoryException #54

Closed: mapipolo closed this issue 6 years ago

mapipolo commented 6 years ago

I'm evaluating if Scriban will be suitable for a project I'm working on, and am having trouble with very large data sets: millions of rows, dozens of columns, largely formatted dates and decimal values. It feels like I must be doing something wrong, because the memory usage numbers aren't at all what I would expect from the benchmarks here in the project documentation. With a test case of 10M rows and 30 columns, I get an OutOfMemoryException about halfway in... some time after crossing the 4 GB mark (as reported by Visual Studio's built-in diagnostic tools).

So two questions: 1) Could I be using the API incorrectly, resulting in this high memory usage? 2) Does Scriban provide a way to render directly to file rather than storing the render result in memory?

xoofx commented 6 years ago

The default high-level API, Template.Render, doesn't provide a way to do this, but you can achieve it by using the underlying API (TemplateContext).

TemplateContext provides two methods, PushOutput and PopOutput. PushOutput takes an IScriptOutput. By default, TemplateContext always has a top-level output implemented by StringBuilderOutput, but you can push a different one, such as TextWriterOutput, which wraps whatever TextWriter you like, or even write your own IScriptOutput. The reason for this push/pop model is that intermediate results can be rendered to a different output (typically by include or the capture function) and re-injected later.

So simply push a new top-level output onto a TemplateContext and render the template to it with TemplateContext.Evaluate(template.Page), and you should be able to output at your own pace and to your own requirements.
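
For later readers, the approach described above could look roughly like the sketch below. It is not code from this thread: the template text, the `rows` variable, the `output.txt` path, and the class name are all made up for illustration, and the exact API surface may differ between Scriban versions.

```csharp
using System.IO;
using Scriban;
using Scriban.Runtime;

public static class StreamingRenderSketch
{
    public static void Main()
    {
        // Hypothetical template: writes one line per row.
        var template = Template.Parse("{{ for row in rows }}{{ row }}\n{{ end }}");

        // Hypothetical data; a real case would expose the large data set here.
        var globals = new ScriptObject();
        globals.Add("rows", new[] { "a", "b", "c" });

        var context = new TemplateContext();
        context.PushGlobal(globals);

        // Assumed output path. Any TextWriter should work here.
        using (var writer = new StreamWriter("output.txt"))
        {
            // Push a writer-backed output on top of the default StringBuilderOutput,
            // so rendered text goes to the file instead of accumulating in memory.
            context.PushOutput(new TextWriterOutput(writer));

            // Evaluate the parsed page against the context, as suggested above.
            context.Evaluate(template.Page);

            // Restore the previous output once rendering is done.
            context.PopOutput();
        }
    }
}
```

With very large loops, the loop limit on TemplateContext (the LoopLimit property, if your version has it) may also need to be raised or disabled. That is worth checking against the documentation for the version in use.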

mapipolo commented 6 years ago

This approach is working brilliantly: peak memory use while rendering those 30M values directly to a TextWriter is measurable in kilobytes. Thank you for the direction!