Closed duvifn closed 9 months ago
Hello @duvifn,
I think I have tried to disable almost everything that is related to threading/concurrency to avoid any headaches for the compilation. And I am afraid I am not up-to-date with all the developments in emscripten etc to support threads and so forth. I am not sure if in your use case having a SharedWorker necessarily involves a shared db connection? I think a starting point could be to look into SQLite's own WebAssembly port or SQL.js to see if it is possible in SQLite. If so then it may well be possible with SpatiaLite.
Hi, thanks for your reply.
Hope I correctly understood you.
I'm not looking for a shared connection/threading in the WebAssembly level.
Instead I'm looking for something equivalent to an sqlite
single process running on a server which accept requests (similar to what is done now in spl.js
with a dedicated worker).
So, there is only one process that actually access the DB, but it accepts requests from many clients, instead of one.
For my use case it's fine that there would be only one connection per db.
I have to confess I am not familiar with the SharedWorker API. So if all reads/writes are sequential then from the point of view of the db worker it should not be a big difference. Did you try to just replace Worker with a SharedWorker instance (I'd guess it requires a few adjustments). If you would create a test branch I could maybe help you a bit.
Did you try to just replace Worker with a SharedWorker instance (I'd guess it requires a few adjustments). If you would create a test branch I could maybe help you a bit.
Thanks.
I created this branch with experimental SharedWorker
support. It avoids dealing with any complex concurrency problems, by simply creating a db connection per user port. This is a WIP and I wrote only a few tests for it.
Note that by default sqlite
supports any number of concurrent read transactions but only one write transaction (in the default mode). In case of more than one writing transaction sqlite
would fail with SQLITE_BUSY
. The user can provide a parameter for busy timeout
to make sqlite
waiting for a given time before failing with that error.
By default, if the user doesn't use explicit transaction (denoted with BEGIN...
and COMMIT
), this is not a problem at all since JS execution model guarantees to have only one transaction at time. In my implementation, in a case of explicit transaction it's up to the user to manage write transactions ~or to supply a reasonable busyTimeout
~ (Edit: I suspect supporting timeout would force me to add ASYNCIFY
handling, so I will leave it for now) and to handle errors caused by concurrent write accesses to a db.
Another limitation is that currently, extensions should be defined on worker creation only, and can't be updated after that (I didn't want to change current code model, but it could be changed relatively easily). That means that only the first page to initialize the shared worker can create extensions. If you have any comments I would be glad to hear.
Thank you very much!
Thanks for sharing the code @duvifn. Please give me a bit of time to read it and go through the SharedWorker
docu essentials.
Hi @duvifn,
I read the SharedWorker documentation and looked at a few examples. My first impression was that this quite interesting and should not be too difficult to just replace the dedicated Worker with the SharedWorker (maybe supported by an optional setting like shared: true
) but then I realized that it may be impossible without a major rework because the ObjectURL for the worker script can not be shared (meaning every context creates it's own URL). Is this "magic" code related to the problem? Is it a work around and did you test if it works a well with multiple browser taps that share the worker/database?
@jvail Thanks for reviewing this.
You're right about the ObjectURL
which its guid
is unique, it is not predictable, and its lifetime is tied to the lifetime of the page that creates it. That why I'm using base64 data uri
here.
The MDN documentation about the name
attribute is a bit missing and it is also a bit misleading.
The spec says:
If all of the following conditions are true:
- worker storage key equals outside storage key;
- scope's closing flag is false;
- scope's constructor url equals urlRecord; and
- scope's name equals the value of option's name member then: Set worker global scope to scope.
So if the worker URL
and the name
attribute are equal then the same worker would be used.
The magic
you linked above is used to calculate the name
attribute if this is not given by a user. It's an optional attribute and in my implementation it serves as a guard against usage of different versions of spl.js
by different pages (in such a case, an undebuggable problems would probably occur). The digest is unique for a binary version since it's computed on the worker source and the wasm binary.
I have tried this with pages of the same origin and it works. I also added today an option to load the worker script from a custom URL if the user prefer this for whatever reason.
Thank you @duvifn,
You're right about the
ObjectURL
which itsguid
is unique, it is not predictable, and its lifetime is tied to the lifetime of the page that creates it. That why I'm usingbase64 data uri
here.
I gave it a try with a simple example and indeed with a data uri it seems to work although on stack overflow someone complained it would set the secure context to false... It seems it's not the case.
This is all very interesting but a bit hard to digest. But the reason may well be that I have not read my own code for quite some time ;)
However a few questions:
@jvail
although on stack overflow someone complained it would set the secure context to false
Can you share a link? I'm not sure what does this mean in the context of a SharedWorker.
You have the WeakMap per MessagePort. Does this mean that every "tab" has its own set of databases and they can not access the dbs of a another "tab"?
No. This means that every browsing context (windows, iframes, workers, etc...) has its own set of database connections. Different browsing context can access any db managed by this worker, according to the default behavior of sqlite
(detailed in my comment above).
Do we need something else as message id, something random rather than a number?
No. Every MessagePort
uses its own set of message id which defined by the browsing context that post a message to the worker.
Would it be possible to demonstrate it in a simple example (maybe just a slightly modified copy of an existing one) with opening multiple taps to share a db - if that is intended (1.)
A very simple example (put the two files on a server, and import spl
according to your server configuration):
index.html
:<!DOCTYPE html>
<html>
<head>
<title>Test</title>
</head>
<body>
<script type="module">
import SPL...;
const spl = await SPL(undefined, {sharedWorker: true});
const db = spl.db("/test");
let script = `
SELECT InitSpatialMetaData(1);
CREATE TABLE tbl (id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT);
SELECT AddGeometryColumn('tbl', 'geometry', 4326, 'GEOMETRY', 'XY');
`;
const geom = `{"type":"Polygon", "coordinates":[[[35.021251,48.4648823],[35.0212088,48.4648072],[35.0220692,48.4645943],[35.0221524,48.4647422],[35.022051,48.4647672],[35.0220102,48.4646946],[35.0216271,48.4647893],[35.021251,48.4648823]]]}`;
script += `\nINSERT INTO tbl (geometry) VALUES (SetSRID(GeomFromGeoJSON('${geom}'), 4326));`;
await db.read(script);
</script>
<iframe src="iframe.html"></iframe>
</body>
</html>
iframe.html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>New Frame</title>
</head>
<body>
<script type="module">
import SPL...;
(async () => {
// Wait 5 seconds
await new Promise((resolve)=> setTimeout(resolve, 5000));
const spl = await SPL(undefined, {sharedWorker: true});
const db = spl.db("/test");
console.log(await db.exec(`SELECT * FROM tbl;`).get.first);
})();
</script>
</body>
</html>
although on stack overflow someone complained it would set the secure context to false
Can you share a link? I'm not sure what does this mean in the context of a SharedWorker.
https://stackoverflow.com/questions/73098559/data-url-web-worker-loses-security-context
You have the WeakMap per MessagePort. Does this mean that every "tab" has its own set of databases and they can not access the dbs of a another "tab"?
No. This means that every browsing context (windows, iframes, workers, etc...) has its own set of database connections. Different browsing context can access any db managed by this worker, according to the default behavior of
sqlite
(detailed in my comment above).
Ok, I need to play a bit with your example. I thought we could maybe make all databases available to all tabs/contexts without actually knowing the path.
A very simple example (put the two files on a server, and import
spl
according to your server configuration):
Thank you. I'll continue from there. I always need examples to make up my mind.
Hi @duvifn,
apologies for the late reply: I think this is a very interesting feature and thank you for sharing your code. But I came to the conclusion that I would like to wait a little bit for various reasons:
P.S.: Forgot to mention that as part of the little refactoring I could also split the single file into client, worker and wasm file. I'd guess this would also make implementing a SharedWoker abit easier. If you have other ideas then let me know.
@jvail thanks.
But I came to the conclusion that I would like to wait a little bit for various reasons
I respect that.
I am not entirely sure if the feature will work reliable in any case with a shared database especially with multiple writes, open transactions etc. This might be a relevant post at emscripten
In my implementation multiple writes should work with implicit transactions because JS event driven architecture guarantees that only one transaction happens at a given time.
With explicit transaction it's up to the user to manage this but in case of mistake sqilte
should throw an error. I have a test that ensures exactly this.
Could you maintain your feature within a fork
I continue to push commits with improvements as I need (as you can see in my branch). I actually use this branch so any issue is fixed in this branch.
Salut @duvifn,
not sure if you are still interested in working on the sharedworker idea... However, I have finally managed to put together a release so I can start thinking about the next ideas towards a version 1.0:
Let me know what the status of your project is and if you are still working on it. Unfortunately I can not promise to move on much quicker... it will all take time.
Closing for now ...
Hello @jvail, This is a feature request. I have a use case where it would be useful to access a db from a different iframe/window.
Is it possible to make
spl.js
running from aSharedWorker
rather than a dedicated worker?Thank you very much!