Closed jameslan closed 1 day ago
From the beginning of the libxml2-wasm design, we tried to avoid direct file access, although WebAssembly provides a way to bridge standard C file IO to its JavaScript counterpart. The reason is simple: JS engines don't have a file system API common to all platforms. Web browsers have FileSystemAPI while nodejs doesn't; nodejs has the fs module while web browsers don't. This complicates local file system support, and we might have to release multiple editions to support different environments. So we decided to keep libxml2-wasm IO-free and leave all of that to the client.
For the same reason, libxml's virtual IO seems to be the right solution: libxml2-wasm registers callbacks with libxml, and these libxml2-wasm-defined callbacks forward the calls to client-defined callbacks (most likely organized in an interface/class), which do the actual IO (with files, an HTTP server, etc.).
It would look like this.
The client registers the InputProvider with
function xmlRegisterInputProvider(provider: XmlInputProvider)
where XmlInputProvider is defined as
interface XmlInputProvider {
    match(...): boolean;
    open(...): any;
    read(...): number;
    close(...): boolean;
}
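To make the shape more concrete, here is a minimal sketch of a provider that serves documents from memory. The parameter lists are assumptions, since the interface above leaves them open, and InMemoryProvider with its url-to-bytes map is purely hypothetical.

```typescript
// Hypothetical parameter lists; the proposal above leaves them unspecified.
interface XmlInputProvider<H = unknown> {
  match(url: string): boolean;
  open(url: string): H | undefined;
  // Returns the number of bytes written into buffer, 0 at end of input.
  read(handle: H, buffer: Uint8Array): number;
  close(handle: H): boolean;
}

interface MemoryHandle {
  data: Uint8Array;
  pos: number;
}

// Serves documents from a url -> bytes map, so no file system is involved.
class InMemoryProvider implements XmlInputProvider<MemoryHandle> {
  private readonly docs: Map<string, Uint8Array>;

  constructor(docs: Map<string, Uint8Array>) {
    this.docs = docs;
  }

  match(url: string): boolean {
    return this.docs.has(url);
  }

  open(url: string): MemoryHandle | undefined {
    const data = this.docs.get(url);
    return data === undefined ? undefined : { data, pos: 0 };
  }

  read(handle: MemoryHandle, buffer: Uint8Array): number {
    // Copy as much as fits into the caller's buffer and advance the cursor.
    const n = Math.min(buffer.length, handle.data.length - handle.pos);
    buffer.set(handle.data.subarray(handle.pos, handle.pos + n));
    handle.pos += n;
    return n;
  }

  close(_handle: MemoryHandle): boolean {
    return true;
  }
}
```

A file-based or HTTP-based provider would follow the same four-method shape, with the platform-specific IO kept entirely on the client side.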
binding/exported-functions.txt and run npm run link
src/libxml2raw.d.ts
src/libxml2.mts directly or with string parameter conversion
src/libxml2.mts, with a static function extracting the field of the struct
libxml2.addFunction to convert a JavaScript function into a wasm function callback; see emscripten's doc
There are still some issues on the libxml side:
The error handler has a userData parameter in the callback's prototype as well as in the callback registration function:
void xmlCtxtSetErrorHandler (xmlParserCtxtPtr ctxt,
                             xmlStructuredErrorFunc handler,
                             void *userData);
typedef void (*xmlStructuredErrorFunc) (void *userData, xmlErrorPtr error);
We use it as an index into the storage that holds the stateful context data (the errors). As a result, we can register one callback and reuse it with different userData.
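The idea of using userData as a storage index can be sketched in plain TypeScript, without any wasm involved. ContextStorage and collectError are hypothetical names, and the error is simplified to a message string instead of an xmlErrorPtr.

```typescript
// Hypothetical storage: maps an integer key (what C would receive as userData)
// to state that lives on the JavaScript side.
class ContextStorage<T> {
  private readonly items = new Map<number, T>();
  private nextKey = 1; // 0 is skipped so a key is never mistaken for NULL

  allocate(value: T): number {
    const key = this.nextKey++;
    this.items.set(key, value);
    return key;
  }

  get(key: number): T | undefined {
    return this.items.get(key);
  }

  free(key: number): void {
    this.items.delete(key);
  }
}

const errorStorage = new ContextStorage<string[]>();

// Stand-in for the single callback that would be registered once;
// userData selects which error list the message belongs to.
function collectError(userData: number, message: string): void {
  const errors = errorStorage.get(userData);
  if (errors !== undefined) errors.push(message);
}

// Two independent parses share the callback but keep separate error lists.
const first = errorStorage.allocate([]);
const second = errorStorage.allocate([]);
collectError(first, 'error in first document');
collectError(second, 'error in second document');
```

Because the state is looked up by key on every call, the callback itself stays stateless and only needs to be converted to a function pointer once.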
In contrast, the registration function for IO callbacks is
int xmlRegisterInputCallbacks (xmlInputMatchCallback matchFunc,
                               xmlInputOpenCallback openFunc,
                               xmlInputReadCallback readFunc,
                               xmlInputCloseCallback closeFunc);
Without userData, to support multiple client callbacks, we have to manage them within the libxml2-wasm callback functions.
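One possible way to do that managing, sketched here in plain TypeScript: a single set of callbacks dispatches to whichever registered provider matches, and the value returned from open serves as the context that read and close use to find the provider again. The Provider shape and all function names are hypothetical; the return conventions (non-zero match, -1 on read error, 0 on successful close) follow libxml's IO callbacks.

```typescript
// Hypothetical provider shape; the parameter lists are assumptions.
interface Provider {
  match(url: string): boolean;
  open(url: string): unknown;
  read(handle: unknown, buffer: Uint8Array): number;
  close(handle: unknown): boolean;
}

const providers: Provider[] = [];
// Open streams, keyed by the integer handed back to libxml as the context pointer.
const openStreams = new Map<number, { provider: Provider; handle: unknown }>();
let nextContext = 1; // 0 would read as NULL, i.e. "open failed"

function registerProvider(provider: Provider): void {
  providers.push(provider);
}

// The four functions below stand in for the one set of callbacks
// that would be passed to xmlRegisterInputCallbacks.
function matchCallback(url: string): number {
  return providers.some((p) => p.match(url)) ? 1 : 0;
}

function openCallback(url: string): number {
  const provider = providers.find((p) => p.match(url));
  if (provider === undefined) return 0;
  const handle = provider.open(url);
  if (handle === undefined || handle === null) return 0;
  const context = nextContext++;
  openStreams.set(context, { provider, handle });
  return context;
}

function readCallback(context: number, buffer: Uint8Array): number {
  const stream = openStreams.get(context);
  return stream ? stream.provider.read(stream.handle, buffer) : -1;
}

function closeCallback(context: number): number {
  const stream = openStreams.get(context);
  if (stream === undefined) return -1;
  openStreams.delete(context);
  return stream.provider.close(stream.handle) ? 0 : -1;
}
```

With only one provider registered, the same code degenerates to a direct forward, so starting with a single provider does not rule out this design later.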
We could start by supporting only one synchronous provider.
@jameslan I gave this a try in the context of #21.
Changes attached: fshooks.patch
I realize now that we should not add the dependency on Node in libxml2.mts, but delegate it to the client. Also, the provided implementations are probably not of the best quality. These things we can change easily.
OTOH, the real issue I'm having is that I cannot get libxml2 to call these callbacks. It just reports "No such file or directory", without calling my callbacks.
Maybe you can figure out what's going wrong.
The C code won't directly call javascript code. It needs a wrapper to convert the javascript function into a C function pointer.
Emscripten has an addFunction to do that: https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html#calling-javascript-functions-as-function-pointers-from-c
You can refer to the use of xmlCtxtSetErrorHandler and the creation of errorCollector.
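As a rough sketch of what that wrapping could look like for the four IO callbacks: the WasmModule interface below lists only the Emscripten runtime pieces used here (addFunction, UTF8ToString, HEAPU8), and whether they are available depends on the build settings. createInputCallbacks and the single in-memory document are hypothetical; the signature strings assume 32-bit wasm, where pointers and int are both 'i'.

```typescript
// Subset of an Emscripten module that this sketch relies on.
interface WasmModule {
  addFunction(fn: (...args: number[]) => number, signature: string): number;
  UTF8ToString(ptr: number): string;
  HEAPU8: Uint8Array;
}

// Hypothetical helper: serve one in-memory document under a single url and
// return the four function pointers that C code can call.
function createInputCallbacks(wasm: WasmModule, url: string, content: Uint8Array) {
  let position = 0;

  // int match(const char *filename)
  const matchPtr = wasm.addFunction(
    (namePtr) => (wasm.UTF8ToString(namePtr) === url ? 1 : 0),
    'ii',
  );

  // void *open(const char *filename): any non-zero value can act as the context
  const openPtr = wasm.addFunction((namePtr) => {
    if (wasm.UTF8ToString(namePtr) !== url) return 0;
    position = 0;
    return 1;
  }, 'ii');

  // int read(void *context, char *buffer, int len): copy bytes into wasm memory
  const readPtr = wasm.addFunction((_context, bufferPtr, length) => {
    const chunk = content.subarray(position, position + length);
    wasm.HEAPU8.set(chunk, bufferPtr);
    position += chunk.length;
    return chunk.length;
  }, 'iiii');

  // int close(void *context)
  const closePtr = wasm.addFunction(() => 0, 'ii');

  return { matchPtr, openPtr, readPtr, closePtr };
}
```

The returned pointers are what would be handed to xmlRegisterInputCallbacks through whatever binding the module exposes for it. Passing the plain JavaScript functions instead of these pointers would not work, since C can only call entries in the wasm function table.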
I gave this a try in #21
Still I don't see libxml2 calling my callbacks. The error message remains the same: "Error: failed to load "./test/crossplatform/testfiles/book.xsd": No such file or directory"
Sorry, silly mistake. I put the before function in the wrong describe block. 🤦♂️
Now I see that the callbacks are called. Can continue.
PR #21 demonstrates the XSD inclusion.
Libxml uses callbacks for virtual IO, which provide the content of an XML file when libxml needs a particular file.
See
stackoverflow question: https://stackoverflow.com/questions/13470166/add-additional-xsd-schemas-with-libxml2
libxml example: http://www.xmlsoft.org/examples/#InputOutput