Closed sirherrbatka closed 1 year ago
@sirherrbatka: This is on my TODO list, but I didn't get around to working out how the API should look like yet. We currently use data chunks for efficiency reasons instead of the row-based API functions, so you're basically loading 1024 (default DuckDB vector size) values from of one column at a time.
https://github.com/ak-coram/cl-duckdb/blob/c5b4a776bd4ae0f6f27132da3c0c2289db74ffc7/duckdb.lisp#L372
Basically the translate-result
, translate-chunk
and translate-vector
functions are responsible for all of this (see duckdb.lisp), so maybe they could be adapted to fit your use-case. I'm also open for suggestions on what higher-level API you'd like to see for this in cl-duckdb.
Of course SQL LIMIT
and OFFSET
can be used as well to load parts of the results, but I imagine this is not what you're looking for.
Nah, melding with query itself seems a bit.
@sirherrbatka: could you please describe exactly what you're trying to achieve? Hopefully we can work something out fairly quickly :)
Right now I want to copy query results into the vellum data frame, so I can get support for parquet loading and other goodies offered duckdb. The thing is, I want to avoid wasting my memory. I will figure out what else I can do after I learn more.
Maybe you could use the lower-level API functions to directly write these results into your data frame: I recommend checking out the functions I mentioned (translate-result
, etc.) for an example on how they work. If you need additional symbols exported from the library for this just let me know.
I will try, thanks!
Ok, it works. duckdb:handle probably should be exported from the package… How are NULLs marked in cl-duckdb?
NULLs are converted to nil
, but so are boolean false
values. Please see this section of the docs:
https://github.com/ak-coram/cl-duckdb#nil-as-boolean-false-vs-nil-as-null
The check for NULL values happens in duckdb::translate-vector
via duckdb-api:duckdb-validity-row-is-valid
:
https://duckdb.org/docs/api/c/data_chunk.html#duckdb_validity_row_is_valid
Ok! I just wanted to say that I am fond of the postmodern solution to this problem, namely: ALWAYS use symbol :NULL to indicate NULL value. That's a nice solution because it sidesteps "can't tell false from null" issue you also have stumbled upon.
Hello, I am working on vellum-duckdb right now. I am wondering if there is way to not load the whole output of the query at once. My initial, proof of concept copy-from looks like this:
https://github.com/sirherrbatka/vellum-duckdb/blob/main/src/code.lisp