dexie / Dexie.js

A Minimalistic Wrapper for IndexedDB
https://dexie.org
Apache License 2.0

Saving Uint8Array seems to take way too much space? #665

Open · sid-kap opened this issue 6 years ago

sid-kap commented 6 years ago

I'm trying to use Dexie to store an object that contains 2-3 images, each 2-20 MB, in the format of Uint8Arrays. I expect that my object should be around 50-60MB total, but when I try to store it, I get the error:

Unhandled rejection: UnknownError: The serialized value is too large (size=268436288 bytes, max=267386880 bytes).

This suggests that the serialized object is about 268 MB.

I have a few theories for why this might be happening. Maybe one of these is the reason?

dfahlander commented 6 years ago

No, we're basically storing the object as is. There's no special handling of Blobs or Uint8Arrays. Some browsers may have bugs though. What browser are you using?

sid-kap commented 6 years ago

Firefox 57.0.3

sid-kap commented 6 years ago

So I think I'm getting closer to the issue. It looks like myArray.byteLength is 23265792 (about 23 MB), but myArray.buffer.byteLength is 268435456 (about 268 MB). The typed array itself is just a view onto a subset of the ArrayBuffer (via byteOffset and byteLength).

Could this be an IndexedDB problem, where it's trying to store the whole underlying ArrayBuffer rather than just the desired subset?
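
In other words, the situation seems to be roughly this (an illustrative snippet using the numbers above, not code from my app):

// Illustration: a ~23 MB Uint8Array view sitting at offset 0 of a ~268 MB ArrayBuffer.
const backing = new ArrayBuffer(268435456)
const myArray = new Uint8Array(backing, 0, 23265792)

console.log(myArray.byteLength)        // 23265792  -> the view, ~23 MB
console.log(myArray.buffer.byteLength) // 268435456 -> the backing buffer, ~268 MB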

sid-kap commented 6 years ago

Yep, looks like the big ArrayBuffer was the problem. Here's an example to reproduce:

import Dexie from "dexie"

class MyDatabase extends Dexie {
    // State objects keyed by an auto-incremented number.
    myTable!: Dexie.Table<State, number>

    constructor () {
        super("MyDatabase")
        this.version(1).stores({
            // Note: listing "array" here makes it an indexed property.
            myTable: '++id, array',
        })
    }
}

interface State {
    id?: number,
    array: Uint8Array,
}

const db = new MyDatabase()

main()

async function main() {
    console.log("in main")

    // A 3-byte array backed by a 3-byte buffer: stores fine.
    const smallArray = new Uint8Array([1,2,3])
    const result = await db.myTable.put({id: 1, array: smallArray})
    console.log("First insert was successful with result", result)

    // A 3-byte view backed by a 300 MB ArrayBuffer: fails with the "too large" error.
    const buffer = new ArrayBuffer(3e8)
    const bigArray = new Uint8Array(buffer, 0, 3)
    const result2 = await db.myTable.put({id: 2, array: bigArray})
    console.log("Second insert was successful with result", result2)
}

The output is:

First insert was successful with result 1
Unhandled rejection: UnknownError: The serialized value is too large (size=300000098 bytes, max=267386880 bytes).

So it succeeds on the first one, but fails with the "too large" error on the second one. Looks like this is a Firefox bug?

dfahlander commented 6 years ago

I would suppose so, unless the issue lies in the spec. It would be nice to check whether Chrome has the same behavior. If not, I would post it on Bugzilla. They're quite fast at fixing these kinds of bugs, and this one seems easily fixed.

Another thing: do you intentionally want to index the array property? That seems like a heavy index if it contains an image.

dfahlander commented 6 years ago

@sid-kap Could this be the same problem as in #667? Browsers (and the IDB specification) behave very differently with indexed properties than with non-indexed ones. Basically, indexes on binary keys will index the backing ArrayBuffer, so it may not be a bug. The issue may be that you are using a very large index, and if the property is an image, there's no reason to index it.
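
For example (a sketch based on the repro above, not code from this thread), the schema only needs to declare the properties that should be indexed; the image bytes are still stored as part of the object either way:

import Dexie from "dexie"

interface State {
    id?: number,
    array: Uint8Array,
}

// Only `id` is declared in the schema, so nothing indexes the Uint8Array.
// The `array` property is still persisted as part of the stored object.
class MyDatabase extends Dexie {
    myTable!: Dexie.Table<State, number>

    constructor () {
        super("MyDatabase")
        this.version(1).stores({
            myTable: '++id',
        })
    }
}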

sid-kap commented 6 years ago

You're right, I was indexing the typed array. When I get a chance I'll try it again without indexing that field and I expect that it should work.

I think the documentation of this could be improved. I don't remember reading that I should only pass the indexed fields into Version.stores.

sid-kap commented 6 years ago

Huh, removing the index doesn't seem to fix it. Do I need to "reset" my database somehow to get rid of the index?
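
(By "reset" I mean something roughly like the following — a sketch using plain Dexie calls rather than the class from the repro:)

import Dexie from "dexie"

// Sketch: drop the index on `array` by adding a new schema version without it.
// Store and database names match the repro above.
const db = new Dexie("MyDatabase")
db.version(1).stores({ myTable: '++id, array' })
db.version(2).stores({ myTable: '++id' })        // `array` no longer indexed

// Alternatively, delete the whole database and start over:
// await Dexie.delete("MyDatabase")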

sid-kap commented 6 years ago

Yeah, even incrementing the version and deleting the database didn't fix it. I submitted an issue to Firefox; hopefully they will get back soon. In the meantime, I am using ArrayBuffer.prototype.slice to copy my data into a smaller ArrayBuffer, which is working.
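
Concretely, the workaround looks roughly like this (a sketch reusing `db`, `myTable`, and `bigArray` from the repro above, inside an async function):

// Copy just the bytes the view covers into a fresh, right-sized ArrayBuffer
// before handing the object to Dexie.
const compactBuffer = bigArray.buffer.slice(
    bigArray.byteOffset,
    bigArray.byteOffset + bigArray.byteLength
)
const compactArray = new Uint8Array(compactBuffer)   // 3-byte backing buffer instead of 300 MB

// TypedArray.prototype.slice() does the same copy in one step:
// const compactArray = bigArray.slice()

const compactResult = await db.myTable.put({id: 2, array: compactArray})
console.log("Second insert was successful with result", compactResult)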

sid-kap commented 6 years ago

Also, it looks like it works in Chrome, although it is very slow (the first insert took less than a second, while the second took about 20 seconds). So there must be a bug there as well.

dfahlander commented 6 years ago

So, to conclude, there are two issues here:

  1. Accidentally indexing large binary data where indexing should not be done. I've improved the docs and the main sample on the landing page www.dexie.org to clarify that not all properties should be indexed.

  2. Even when not indexing, there seems to be a common issue among implementations (Chrome and Firefox at least), possibly rooted in the IDB spec, where the entire backing ArrayBuffer of an ArrayBufferView is stored. This can be worked around by slicing the backing ArrayBuffer to the desired size.

Keeping this issue open, since one question remains: should Dexie assist in fixing these issues somehow?
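
As a pure illustration (a rough sketch, not a proposed Dexie API): application code can already do something like this today with the creating hook, and Dexie could conceivably do the equivalent internally. The `db`, `myTable`, and `array` names are the ones from the repro above:

// Sketch: automatically compact Uint8Array views whose backing buffer is
// larger than the view itself, just before each insert.
db.myTable.hook('creating', (_primKey, obj) => {
    const view = obj.array
    if (view instanceof Uint8Array && view.byteLength < view.buffer.byteLength) {
        // Replace the view with a copy backed by a buffer of exactly the right size.
        obj.array = view.slice()
    }
})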

sid-kap commented 6 years ago

Thanks for updating the homepage. In the comment that says "Define a schema", can you instead change it to "Define the indexed columns in the schema" or something along those lines?

I think warning about storing large indexes would be a good idea. I don't think slicing large arrays is necessary though, since that should hopefully be fixed by the browsers at some point.