Closed rogerbinns closed 6 months ago
I have to admit that I have never experimented with the SQLITE_CONFIG_MMAP_SIZE
config parameter.
CorruptError is returned executing the select. Leaving out the hexkey also does not result in error.
Leaving out calling PRAGMA hexkey
results in an unencrypted database file. That is, the encryption code is not called.
What happens, if you use PRAGMA key
instead? Same error? Most likely yes.
At the moment I have no idea, what may cause the error. I'll have to try to reproduce the error in my development environment to find out what is going wrong.
In the meantime I made a few experiments using the SQLite shell coming with the SQLite3 Multiple Ciphers releases.
The parameter MMAP_SIZE
can also be set via PRAGMA mmap_size
.
First, I used an older version of the shell I had at hand from other tests. If I used a value of 8192 I got the error "database disk image is malformed". With smaller values the error went away, although I did not reopen the database, but just reissued the SELECT
command.
Then I used the shell from the latest release. But I was unable to reproduce the error. I could set mmap_size
to any value.
This makes it extremely difficult to track down the cause of the error.
It doesn't matter whether key or hexkey is used. If I catch the CorruptError, turn mmap back off, and do a read again then all is fine. I also double checked the database is encrypted, and that calling the vfs xRead method does give the right results.
The issue in your code is almost certainly how VFS xFetch is handled.
I put a break point where I can get the sqlite3_io_methods and it does look correct:
(gdb) p *fp.pMethods
$6 = {iVersion = 3, xClose = 0x7fffe9833de0 <mcIoClose>,
xRead = 0x7fffe9845070 <mcIoRead>,
xWrite = 0x7fffe9844dc0 <mcIoWrite>,
xTruncate = 0x7fffe9826bb0 <mcIoTruncate>,
xSync = 0x7fffe9826bc0 <mcIoSync>,
xFileSize = 0x7fffe9826bd0 <mcIoFileSize>,
xLock = 0x7fffe9826be0 <mcIoLock>,
xUnlock = 0x7fffe9826bf0 <mcIoUnlock>,
xCheckReservedLock = 0x7fffe9826c00 <mcIoCheckReservedLock>,
xFileControl = 0x7fffe98c0970 <mcIoFileControl>,
xSectorSize = 0x7fffe9826c10 <mcIoSectorSize>,
xDeviceCharacteristics = 0x7fffe9826c40 <mcIoDeviceCharacteristics>,
xShmMap = 0x7fffe9826c50 <mcIoShmMap>,
xShmLock = 0x7fffe9826c60 <mcIoShmLock>,
xShmBarrier = 0x7fffe9826c70 <mcIoShmBarrier>,
xShmUnmap = 0x7fffe9826c80 <mcIoShmUnmap>,
xFetch = 0x7fffe9826ca0 <mcIoFetch>,
xUnfetch = 0x7fffe9826cc0 <mcIoUnfetch>}
The changes APSW makes to SQLite defaults are in src/apsw.c and only if statically compiling in the amalgamation which I am doing in this case.
#define SQLITE_OMIT_DEPRECATED
#ifndef SQLITE_MAX_ATTACHED
#define SQLITE_MAX_ATTACHED 125
#endif
#ifndef SQLITE_MAX_MMAP_SIZE
#define SQLITE_MAX_MMAP_SIZE 0x1000000000000LL
#endif
#ifndef APSW_NO_NDEBUG
#define SQLITE_API static
#define SQLITE_EXTERN static
#endif
#ifdef SQLITE_DEBUG
#define SQLITE_ENABLE_API_ARMOR 1
#endif
Builds for PyPI also add SQLITE_ENABLE_COLUMN_METADATA configured in tools/setup-pypi.cfg
It doesn't matter whether key or hexkey is used.
Any other answer would have surprised me.
If I catch the CorruptError, turn mmap back off, and do a read again then all is fine. I also double checked the database is encrypted, and that calling the vfs xRead method does give the right results.
Maybe the SQLite compile time options used to compile APSW differ from those I typically use for the SQLite shell.
The issue in your code is almost certainly how VFS xFetch is handled.
xFetch
is simply forwarded to the xFetch
method of the underlying real VFS.
xFetch is simply forwarded to the xFetch method of the underlying real VFS.
That can't be right for mcIoFetch. mcIoRead
calls mcReadMainDb
which does decryption. So mcIoFetch
also needs to do decryption as does mcIoUnfetch
need to do encryption. I'll claim the result of fetch being used because of memory mapping means that the raw encrypted data is coming back, hence the corruption error.
A simple fix is to make your interposer sqlite3_io_methods cap iVersion
at 2.
xFetch is simply forwarded to the xFetch method of the underlying real VFS.
That can't be right for mcIoFetch.
mcIoRead
callsmcReadMainDb
which does decryption. SomcIoFetch
also needs to do decryption as doesmcIoUnfetch
need to do encryption. I'll claim the result of fetch being used because of memory mapping means that the raw encrypted data is coming back, hence the corruption error.
I will further analyze what SQLite does when using xFetch/xUnFetch. It is a bit irritating that writing encrypted content seems to work properly, in contrast to reading.
A simple fix is to make your interposer sqlite3_io_methods cap
iVersion
at 2.
As a quick fix that would probably work.
I did try setting the xFetch entry in mcIoMethodsGlobal3
to 0 but then there is a SEGV in sqlite3OsFetch
which isn't checking for a non-NULL pointer.
Setting iVersion
to 2 in the same structure did make the issue go away.
I did try setting the xFetch entry in
mcIoMethodsGlobal3
to 0 but then there is a SEGV insqlite3OsFetch
which isn't checking for a non-NULL pointer.
Setting method pointer to 0 is not a good idea. SQLite doesn't check for null pointer at some critical places (as I found out when I tried to add SQLite3 Multiple Ciphers to the WASM module).
Setting
iVersion
to 2 in the same structure did make the issue go away.
Setting iVersion
to 2 will prevent that xFetch and xUnfetch will be called.
That is, your analysis is absolutely right that those 2 methods do not what they should, if encryption is enabled.
Ok, in the meantime I inspected the SQLite code. IMHO the methods xFetch
/xUnfetch
must not be called, if the database is encrypted. xFetch
returns pointers to memory managed by the OS, and this means the content has to be the same as it is on disk. That is, unencrypted for a plain database or encrypted for an encrypted database. Decrypting the page content in xFetch
would most likely cause database corruption.
The real question is, why method xFetch
is called at all? The internal SQLite function setGetterMethod
has been patched to check whether the database is encrypted or not. If memory mapping is supported and the database is not encrypted, the getter method getPageMMap
will be selected; if the database is encrypted, the getter method getPageNormal
will be selected. Thus, memory mapping should not be used, and xFetch
should not be called. Why does it happen in the APSW environment?
@rogerbinns: Could you please find out which page getter method is set and whether it is correctly detected that the database is encrypted? The call stack when xFetch
is entered could possibly be useful, too.
There are wiki instructions which is all I am doing to run the code at the top. This is with the updated sqlite3.c in that repo putting a breakpoint ion mcIoFetch
#0 mcIoFetch (pFile=0xc94b20, iOfst=4096, iAmt=4096, pp=0x7fffffffcc60)
at /space/mc/sqlite3/sqlite3.c:307830
#1 0x00007fffe9949a7d in sqlite3OsFetch (pp=0x7fffffffcc60,
iAmt=<optimised out>, iOff=<optimised out>, id=<optimised out>)
at /space/mc/sqlite3/sqlite3.c:26372
#2 getPageMMap (pPager=0xc94998, pgno=2, ppPage=0x7fffffffccb0, flags=2)
at /space/mc/sqlite3/sqlite3.c:62913
#3 0x00007fffe9925225 in sqlite3PagerGet (flags=2,
ppPage=0x7fffffffccb0, pgno=2, pPager=<optimised out>)
at /space/mc/sqlite3/sqlite3.c:62975
#4 getAndInitPage (pBt=0xc0fa08, pgno=2, ppPage=ppPage@entry=0xc9e978,
bReadOnly=2) at /space/mc/sqlite3/sqlite3.c:73078
#5 0x00007fffe9925a1c in moveToRoot (pCur=pCur@entry=0xc9e8f0)
at /space/mc/sqlite3/sqlite3.c:76207
#6 0x00007fffe9925d5f in sqlite3BtreeFirst (pCur=0xc9e8f0,
pRes=0x7fffffffce60) at /space/mc/sqlite3/sqlite3.c:76314
#7 0x00007fffe99bb67b in sqlite3VdbeExec (p=p@entry=0xc9f678)
at /space/mc/sqlite3/sqlite3.c:99562
#8 0x00007fffe99cd440 in sqlite3Step (p=<optimised out>)
at /space/mc/sqlite3/sqlite3.c:91371
#9 sqlite3_step (pStmt=<optimised out>)
at /space/mc/sqlite3/sqlite3.c:91432
#10 sqlite3_step (pStmt=<optimised out>)
at /space/mc/sqlite3/sqlite3.c:91421
#11 0x00007fffe99ddf94 in APSWCursor_step (
self=self@entry=0x7fffe9d85bc0) at src/cursor.c:736
Hopefully this is enough.
pPager in frame 2
(gdb) p *pPager
$3 = {pVfs = 0xc77738, exclusiveMode = 0 '\000', journalMode = 0 '\000',
useJournal = 1 '\001', noSync = 0 '\000', fullSync = 1 '\001',
extraSync = 0 '\000', syncFlags = 2 '\002', walSyncFlags = 10 '\n',
tempFile = 0 '\000', noLock = 0 '\000', readOnly = 0 '\000',
memDb = 0 '\000', memVfs = 0 '\000', eState = 1 '\001',
eLock = 1 '\001', changeCountDone = 0 '\000', setSuper = 0 '\000',
doNotSpill = 0 '\000', subjInMemory = 0 '\000', bUseFetch = 1 '\001',
hasHeldSharedLock = 1 '\001', dbSize = 21, dbOrigSize = 2,
dbFileSize = 21, dbHintSize = 21, errCode = 0, nRec = 0,
cksumInit = 4228306223, nSubRec = 0, pInJournal = 0x0, fd = 0xc94b20,
jfd = 0xc94cb0, sjfd = 0xc94be8, journalOff = 0, journalHdr = 0,
pBackup = 0x0, aSavepoint = 0x0, nSavepoint = 0, iDataVersion = 4,
dbFileVers = "\000\000\000\002\000\000\000\025\000\000\000\000\000\000\000", nMmapOut = 0, szMmap = 281474976710656, pMmapFreelist = 0x0,
nExtra = 136, nReserve = 32, vfsFlags = 33554694, sectorSize = 512,
mxPgno = 4294967294, lckPgno = 262145, pageSize = 4096,
journalSizeLimit = -1, zFilename = 0xc94d84 "/space/mc/testdb",
zJournal = 0xc94d96 "/space/mc/testdb-journal",
xBusyHandler = 0x7fffe987d680 <btreeInvokeBusyHandler>,
pBusyHandlerArg = 0xc0fa08, aStat = {9, 0, 23, 0},
xReiniter = 0x7fffe9925130 <pageReinit>,
xGet = 0x7fffe99498a0 <getPageMMap>, pTmpSpace = 0xc94dd8 "",
pPCache = 0xc94ad0, pWal = 0x0, zWal = 0xc94daf "/space/mc/testdb-wal"}
(gdb) p *pPager->pVfs
$5 = {iVersion = 3, szOsFile = 200, mxPathname = 512,
pNext = 0x7fffe9ae61a0 <aVfs.2599>,
zName = 0xc777f0 "multipleciphers-unix",
pAppData = 0x7fffe9ae61a0 <aVfs.2599>,
xOpen = 0x7fffe984ee50 <mcVfsOpen>,
xDelete = 0x7fffe982b190 <mcVfsDelete>,
xAccess = 0x7fffe982b1a0 <mcVfsAccess>,
xFullPathname = 0x7fffe982b1b0 <mcVfsFullPathname>,
xDlOpen = 0x7fffe982b1c0 <mcVfsDlOpen>,
xDlError = 0x7fffe982b1d0 <mcVfsDlError>,
xDlSym = 0x7fffe982b1e0 <mcVfsDlSym>,
xDlClose = 0x7fffe982b1f0 <mcVfsDlClose>,
xRandomness = 0x7fffe982b200 <mcVfsRandomness>,
xSleep = 0x7fffe982b210 <mcVfsSleep>,
xCurrentTime = 0x7fffe982b220 <mcVfsCurrentTime>,
xGetLastError = 0x7fffe982b230 <mcVfsGetLastError>,
xCurrentTimeInt64 = 0x7fffe982b240 <mcVfsCurrentTimeInt64>,
xSetSystemCall = 0x7fffe982b250 <mcVfsSetSystemCall>,
xGetSystemCall = 0x7fffe982b260 <mcVfsGetSystemCall>,
xNextSystemCall = 0x7fffe982b270 <mcVfsNextSystemCall>}
There are wiki instructions which is all I am doing to run the code at the top.
I will set up the environment in my Linux environment and will try to further investigate the issue.
This is with the updated sqlite3.c in that repo putting a breakpoint ion mcIoFetch
#0 mcIoFetch (pFile=0xc94b20, iOfst=4096, iAmt=4096, pp=0x7fffffffcc60) at /space/mc/sqlite3/sqlite3.c:307830 #1 0x00007fffe9949a7d in sqlite3OsFetch (pp=0x7fffffffcc60, iAmt=<optimised out>, iOff=<optimised out>, id=<optimised out>) at /space/mc/sqlite3/sqlite3.c:26372 #2 getPageMMap (pPager=0xc94998, pgno=2, ppPage=0x7fffffffccb0, flags=2) at /space/mc/sqlite3/sqlite3.c:62913 #3 0x00007fffe9925225 in sqlite3PagerGet (flags=2, ppPage=0x7fffffffccb0, pgno=2, pPager=<optimised out>) at /space/mc/sqlite3/sqlite3.c:62975 #4 getAndInitPage (pBt=0xc0fa08, pgno=2, ppPage=ppPage@entry=0xc9e978, bReadOnly=2) at /space/mc/sqlite3/sqlite3.c:73078 #5 0x00007fffe9925a1c in moveToRoot (pCur=pCur@entry=0xc9e8f0) at /space/mc/sqlite3/sqlite3.c:76207 #6 0x00007fffe9925d5f in sqlite3BtreeFirst (pCur=0xc9e8f0, pRes=0x7fffffffce60) at /space/mc/sqlite3/sqlite3.c:76314 #7 0x00007fffe99bb67b in sqlite3VdbeExec (p=p@entry=0xc9f678) at /space/mc/sqlite3/sqlite3.c:99562 #8 0x00007fffe99cd440 in sqlite3Step (p=<optimised out>) at /space/mc/sqlite3/sqlite3.c:91371 #9 sqlite3_step (pStmt=<optimised out>) at /space/mc/sqlite3/sqlite3.c:91432 #10 sqlite3_step (pStmt=<optimised out>) at /space/mc/sqlite3/sqlite3.c:91421 #11 0x00007fffe99ddf94 in APSWCursor_step ( self=self@entry=0x7fffe9d85bc0) at src/cursor.c:736
As can be seen the page getter method getPageMMap
is called (# 3). This should not happen for an encrypted database. No idea, why it happens nevertheless.
Ok, I tested under Linux based on the wiki instructions. Here are the test results:
APSW debug build: missing sys.apsw_fault_inject_control
Python /home/ulrich/Development/GitHub/apsw-sqlite3mc/.venv/bin/python3 sys.version_info(major=3, minor=10, micro=12, releaselevel='final', serial=0) 64bit ELF
Testing with APSW file /home/ulrich/Development/GitHub/apsw-sqlite3mc/apsw/__init__.cpython-310-x86_64-linux-gnu.so
APSW version 3.46.0.0
SQLite lib version 3.46.0
SQLite headers version 3046000
Using amalgamation True
.............................................................A message due to RecursionError is possible, and what is being tested
.....................................................................
----------------------------------------------------------------------
Ran 130 tests in 31.732s
OK
No CorruptError
. Hm???
You only need the 6 lines of code in this issue report at the top to reproduce. The wiki has instructions at the bottom on how to set a breakpoint in code of your choice.
In commit https://github.com/utelle/apsw-sqlite3mc/commit/2c623e865b26a44751fa12254675912c3dd81ae2 I temporarily worked around the issue which is why it doesn't show up now. That was to allow a build for test pypi to complete.
I've now removed the workaround and added another test that explicitly checks memory mapping.
You only need the 6 lines of code in this issue report at the top to reproduce. The wiki has instructions at the bottom on how to set a breakpoint in code of your choice.
Thanks. In the meantime I have set up the project in my Linux environment, so that I can continue to test there.
In commit utelle/apsw-sqlite3mc@2c623e8 I temporarily worked around the issue which is why it doesn't show up now. That was to allow a build for test pypi to complete.
I've now removed the workaround and added another test that explicitly checks memory mapping.
I see now the CorruptError
and I will try to track down, what's causing the problem. This may take some time.
Actually, it was only a small glitch. The codec pointer was set in the SQLite file structure a bit too late, so that the SQLite pager got the impression that no codec is in place and selected the page getter method for memory-mapped database files.
Commit c95bd21f4fa20328b521a33a57017607ad17f073 should fix the issue.
I committed an updated amalgamation to apsw-sqlite3mc.
I could run all tests without problems. Therefore closing this issue. Reopen if necessary.
The following program causes CorruptError:
CorruptError is returned executing the select. Leaving out the hexkey also does not result in error.
If the mmap value is 8447 or less then the error does not occur. I was setting it to 2 **63 -1.