dyedgreen / deno-sqlite

Deno SQLite module
https://deno.land/x/sqlite
MIT License

Uncaught RuntimeError: memory access out of bounds #75

Closed · s-i-e-v-e closed this issue 4 years ago

s-i-e-v-e commented 4 years ago

I get this error under fairly repeatable circumstances. The stack traces and operating systems (Windows 10, WSL1, Ubuntu) differ, but the error remains the same.

error: Uncaught RuntimeError: memory access out of bounds
    at memset (:0:509349)
    at denoRead (:0:3853)
    at syncJournal (:0:33455)
    at pagerStress (:0:131724)
    at getPageNormal (:0:36575)
    at getAndInitPage (:0:209982)
    at balance (:0:201336)
    at sqlite3BtreeInsert (:0:176808)
    at sqlite3VdbeExec (:0:83386)
    at sqlite3_step (:0:50777)

error: Uncaught RuntimeError: memory access out of bounds
    at memset (:0:509349)
    at denoRead (:0:3853)
    at getPageNormal (:0:36960)
    at getAndInitPage (:0:209982)
    at balance (:0:201336)
    at sqlite3BtreeInsert (:0:176808)
    at sqlite3VdbeExec (:0:83386)
    at sqlite3_step (:0:50777)
    at step (:0:3270)
    at DB.query (db.ts:204:31)

error: Uncaught RuntimeError: memory access out of bounds
    at memset (:0:509349)
    at denoRead (:0:3853)
    at getPageNormal (:0:36960)
    at getAndInitPage (:0:209982)
    at moveToChild (:0:171132)
    at sqlite3BtreeMovetoUnpacked (:0:174179)
    at sqlite3VdbeExec (:0:66042)
    at sqlite3_step (:0:50777)
    at step (:0:3270)
    at DB.query (db.ts:204:31)

I am trying to dump a bunch of file-based tree structures into the database for analysis purposes. As there are a couple hundred thousand rows across multiple tables, I use transactions to speed up the process. This error pops up when the file size is 25+MB and the row count is over 120,000.

This problem does not occur with in-memory databases, only those backed by a file. A PRAGMA integrity_check will often report that the database is okay. In other cases, unique constraints are violated and there are index problems that a REINDEX will generally fix.
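For reference, the checks mentioned above can be run through this module roughly like this (a sketch; analysis.db is a placeholder for the affected file):

// Sketch: run the integrity checks described above with deno-sqlite.
import { DB } from "https://deno.land/x/sqlite/mod.ts";

const db = new DB("analysis.db"); // placeholder path

// PRAGMA integrity_check yields a single row containing "ok" on a
// healthy file, or a list of problems otherwise.
for (const [result] of db.query("PRAGMA integrity_check")) {
  console.log(result);
}

// Rebuild all indexes; a REINDEX repaired the index corruption in some runs.
db.query("REINDEX");
db.close();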

dyedgreen commented 4 years ago

Thank you for submitting the issue, this sounds like a serious problem! Could you provide more details on the situations in which the error occurs? (E.g. a minimal example script that reproduces the error when run? Does the error only occur on Windows, or can we reproduce it on macOS/Linux using your script, etc.?)

dyedgreen commented 4 years ago

A hunch regarding what this might be: the error seems to occur in denoRead, in the memset call. That probably has to do with the file sizes/indices exceeding the integer types there.

There is a read_bytes, which is an int; maybe it should be an sqlite_int64? (And another question is whether there are issues caused by the max JS integer size here, although I think int is exhausted first.)
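For illustration, this is the kind of wrap-around being hypothesized (a sketch of the suspected failure mode, not a claim about where the bug actually is):

// In the wasm32 ABI a C `int` is a signed 32-bit value, so a 64-bit
// length or offset forced through it wraps around. `| 0` reproduces
// that truncation in JavaScript.
const asCInt = (n: number) => n | 0;

console.log(asCInt(2 ** 31 - 1)); //  2147483647 (largest value that fits)
console.log(asCInt(2 ** 31));     // -2147483648 (wraps negative)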

s-i-e-v-e commented 4 years ago

> Could you provide more details on the situations in which the error occurs? (E.g. a minimal example script that reproduces the error when run? Does the error only occur on Windows, or can we reproduce it on macOS/Linux using your script, etc.?)

I have to see if I can write a script that reproduces my workload. The thing is, the first phase of the workload involves reading thousands of encrypted files, decrypting them, computing hashes, etc. As soon as that is done, the decrypted data held in memory is dumped into 7-8 different tables across approximately 200,000 rows. Let me check.

Also, while I have tried this across three different OS installations (Windows 10, WSL+Ubuntu 20.04, Ubuntu 18.04), the hardware was the same. I will check to see if this can be reproduced on other hardware.

dyedgreen commented 4 years ago

Yeah, I feel like the issue probably has to do with the size of the database file, so I think we should try to just generate a lot of random text or similar, until we hit the issue 😄

The fact that the issue turns up under different OSes also seems to point in that direction.
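Something along these lines, perhaps (a sketch, untested; stress.db, the table name, and the batch size are made up, and the ~25 MB threshold comes from the report above):

// Sketch: keep inserting random text inside large transactions until
// the database file passes ~25 MB. Run with --allow-read and --allow-write.
import { DB } from "https://deno.land/x/sqlite/mod.ts";

const db = new DB("stress.db");
db.query("CREATE TABLE IF NOT EXISTS blobs (id INTEGER PRIMARY KEY, txt TEXT)");

const randomText = () =>
  Array.from({ length: 64 }, () => Math.random().toString(36).slice(2)).join("");

while ((await Deno.stat("stress.db")).size < 25 * 1024 * 1024) {
  db.query("begin");
  for (let i = 0; i < 10_000; i++) {
    db.query("INSERT INTO blobs (txt) VALUES (?)", [randomText()]);
  }
  db.query("commit");
}
db.close();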

s-i-e-v-e commented 4 years ago

> There is a read_bytes, which is an int; maybe it should be an sqlite_int64?

Do queries on physical as well as in-memory databases flow through the same pathways? Because the :memory: database works perfectly.

If there were a way to store the in-memory database to disk, that might work as well. Is it possible to expose the sqlite3_backup_xxxx API? (Edit: the VACUUM INTO command might do the same thing, according to the documentation.)

That said, it won't solve the actual underlying issue, assuming I can reproduce it on different hardware.

s-i-e-v-e commented 4 years ago

> Yeah, I feel like the issue probably has to do with the size of the database file, so I think we should try to just generate a lot of random text or similar, until we hit the issue

See if this triggers the issue for you:

deno run --allow-read=. --allow-write=. https://raw.githubusercontent.com/s-i-e-v-e/docs/9c29953f0e4b392f44d3787c41097d30ae08ee01/test/Stress.ts

I got this error:

error: Uncaught RuntimeError: memory access out of bounds
    at memset (:0:509349)
    at denoRead (:0:3853)
    at syncJournal (:0:33455)
    at sqlite3PagerCommitPhaseOne (:0:26649)
    at sqlite3BtreeCommitPhaseOne (:0:27833)
    at sqlite3VdbeHalt (:0:43073)
    at sqlite3VdbeExec (:0:62650)
    at sqlite3_step (:0:50777)
    at step (:0:3270)
    at DB.query (db.ts:204:31)

dyedgreen commented 4 years ago

Yeah, that seems to reproduce the error. I'm pretty sure at this point that it's a problem with overflowing the int in denoWrite.

> Do queries on physical as well as in-memory databases flow through the same pathways? Because the :memory: database works perfectly.

Only file-based databases make use of the VFS code, where I suspect the problem is.

I'll try to fix this asap, but please bear with me, as I'm a little busy at the moment 😅

s-i-e-v-e commented 4 years ago

> I'll try to fix this asap, but please bear with me, as I'm a little busy at the moment

Take your time. My workload involves writing to the DB exactly once, and I was able to use VACUUM INTO to work around this bug. Those who want to read from and write to the database should be able to use a combination of ATTACH DATABASE and VACUUM INTO to make it work.
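The workaround looks roughly like this (a sketch; the table name and file paths are placeholders):

// Sketch: do all the heavy writing against an in-memory database, which
// is unaffected by the bug, then persist it to disk in one step.
import { DB } from "https://deno.land/x/sqlite/mod.ts";

const db = new DB(); // defaults to :memory:
db.query("CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT)");
db.query("begin");
for (let i = 0; i < 200_000; i++) {
  db.query("INSERT INTO nodes (name) VALUES (?)", [`node-${i}`]);
}
db.query("commit");

// Write the whole in-memory database out to a file at once.
db.query("VACUUM INTO 'analysis.db'");
db.close();

For a later read-write session, one could ATTACH the file, copy its tables into the in-memory database, work there, and VACUUM INTO a fresh file when done.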