jimmywarting opened this issue 2 years ago
It would technically also be possible to support some kind of write operations without loading everything into memory, if you did something like:
```js
const pre = this._file.slice(start1, end1)
const middle = new Blob([ data ])
const end = this._file.slice(start2, end2)
this._file = new Blob([ pre, middle, end ])
```
Creating new blobs like this with `.slice()` does not allocate any more memory; it only creates a new handle with an offset and a size property.
image taken from: https://docs.google.com/presentation/d/1MOm-8kacXAon1L2tF6VthesNjXgx0fp5AP17L7XDPSM/edit
Just because you concatenate two of the same blob chunks with `new Blob([a, a])` does not mean you allocate 2× as much memory; it only means the same chunk is read twice.
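A minimal, hypothetical sketch of that splice pattern in Node 18+ (which ships a global `Blob`), using a small string as a stand-in for `this._file`; the slices are just cheap views, and bytes are only read when the result is consumed:

```javascript
// Splice "XX" over bytes 4..5 of a 10-byte blob without copying
// the untouched regions up front.
const file = new Blob(["0123456789"]);        // stand-in for this._file
const pre = file.slice(0, 4);                 // handle over bytes 0..3
const middle = new Blob(["XX"]);              // the replacement bytes
const end = file.slice(6, 10);                // handle over bytes 6..9
const patched = new Blob([pre, middle, end]); // still only handles

console.log(patched.size);                    // 10
patched.text().then((t) => console.log(t));   // "0123XX6789"
```

The same pattern works with a real `File` in the browser, since `File` inherits `slice()` from `Blob`.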
Anything is possible, but I think you're looking for a file API rather than a SQLite API, such as https://github.com/mailru/FileAPI. Using an NDJSON or CSV format may also help.
Edit: not sure why this was thumbed down; maybe the use case needs clarifying? The architectural concern might be wasm: you copy data into the wasm filesystem, perform operations, and then copy the result back out. There are various projects that use IndexedDB, such as absurd-sql, which may offload the memory once loaded; wa-sqlite lets you write your own VFS, but still requires the data-and-wasm tango. If this is all server side, a WASI-based SQLite might be better, but your use case needs to be clearer and is likely beyond the scope of this project, especially since it's not async. As mentioned in other issues, until pthreads are supported, adopted, and used with wasm, async really isn't async.
I have loaded files of around 300 MB using sql.js over the Emscripten filesystem and then loaded them into a db in under 10 seconds, but that required multiple copies in memory, and the data was fairly straightforward.
I know you can create a sql instance from a `Uint8Array` and perform operations on that, but I wonder: is it possible to just load a simple `File` instead of a `Uint8Array`? I would like to load a read-only database and be able to do:
The idea would just be for sql.js to slice out the part of the file that is needed, without loading the whole thing into memory, by doing something like

```js
await file.slice(start, end).arrayBuffer()
```

operations instead.
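For illustration, a hedged sketch of what such a read path could look like. The `readPage` helper and the fixed `PAGE_SIZE` are assumptions for this sketch, not sql.js API (4096 happens to be SQLite's default page size); it runs in Node 18+ where `Blob` is global:

```javascript
const PAGE_SIZE = 4096; // assumption: the database's fixed page size

// Pull exactly one page out of a read-only database File/Blob.
// file.slice() is only a view; bytes are read when arrayBuffer() is awaited.
async function readPage(file, pageNumber) {
  const start = pageNumber * PAGE_SIZE;
  const part = file.slice(start, start + PAGE_SIZE);
  return new Uint8Array(await part.arrayBuffer());
}

// Usage with a dummy 3-page "database":
const db = new Blob([new Uint8Array(3 * PAGE_SIZE)]);
readPage(db, 1).then((page) => console.log(page.length)); // 4096
```

A VFS built on this idea would only ever hold the pages the query actually touches, instead of the whole database image.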