nushell / nushell.github.io

Nushell's main website, blog, book, and more
https://www.nushell.sh/book/
MIT License

Draft PR to demonstrate the performance of `polars` lazyframes on 20240516 #1404

Closed maxim-uvarov closed 1 month ago

maxim-uvarov commented 1 month ago

This PR is not intended to be merged.

fdncred commented 1 month ago

These are my timings from my Windows desktop, which is not super fast. I didn't do the Python timings.

❯ timeit {open Data7602DescendingYearOrder.csv}
5sec 244ms 662µs 400ns

❯ timeit {polars open Data7602DescendingYearOrder.csv | polars collect; null}
239ms 526µs 800ns

❯ timeit {
    open 'Data7602DescendingYearOrder.csv'
    | group-by year --to-table
    | update items {|i|
        $i.items.geo_count
        | math sum
    }
}
10sec 906ms 680µs 300ns

❯ cat load.nu
let df = polars open Data7602DescendingYearOrder.csv
let res = $df | polars group-by year | polars agg (polars col geo_count | polars sum)
$res | polars collect
❯ timeit {source load.nu}
155ms 678µs 500ns

fdncred commented 1 month ago

Just FYI - regular Nushell tables can emulate the polars output too.

nushell: [image]

polars plugin (had to add sort to get them in the same order): [image]
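For anyone who wants to try the plain-table side without the CSV, here is a minimal sketch. It reuses the `group-by year --to-table` pipeline from the timings above on a hypothetical three-row sample; the sample values are made up for illustration and are not taken from Data7602DescendingYearOrder.csv. The column names produced by `--to-table` may differ between Nushell versions, so treat this as an approximation of what was run, not the exact commands behind the screenshots.

```nu
# Hypothetical sample standing in for the real CSV (values are invented).
let sample = [[year geo_count]; [2021 5] [2020 1] [2020 2]]

# Group rows by year, then replace each group's rows with the sum of geo_count.
$sample
| group-by year --to-table
| update items {|i|
    $i.items.geo_count
    | math sum
}
```

A `sort-by` on the grouping column can be appended if the rows need to line up with the polars result, as mentioned above.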