nikku / node-xsd-schema-validator

A schema (XSD) validator for NodeJS
https://www.npmjs.com/package/xsd-schema-validator
MIT License

WIP: Experimental version of using GraalVM to statically compile the XMLValidator to native binaries for macOS, Linux and Windows #40

Open Edgar-P-yan opened 7 months ago

Edgar-P-yan commented 7 months ago

Which issue does this PR address?

Closes #25 #20

Hi! Great library. Some time ago I really needed this kind of tool but couldn't find one.

Right now this library requires a fully working Java runtime, which is inconvenient. I think a better solution would be to statically compile the Java part using GraalVM Native Image. I am not a Java developer, but the experimental version that I made passes all the unit tests.

I don't know whether GraalVM can compile for FreeBSD, OpenBSD, or other less popular OSes, but it seems to compile for macOS, Linux, and Windows without problems. It might be a good idea to include both versions in the library: the ahead-of-time compiled binaries (no JVM required, better startup times) and the JVM version, in case GraalVM does not cover a specific platform.

Furthermore, GraalVM plans to support WASM as a compilation target. Once that is implemented, it could resolve any cross-platform issues, but for now I think static binaries are a good way to go.

My fork includes a slightly modified lib/validator.js and a GitHub Action that compiles for these three platforms and pushes the binaries to support/dist. I have tested the binaries only on my macOS machine; in the future, other platforms could be tested in CI as well.

Please let me know what you think about this. Thanks.

nikku commented 6 months ago

@Edgar-P-yan Thanks for your work on this topic.

I'm happy to accept a contribution that fetches and executes the binary from a previously defined (cross-compiled?) path, if it exists. I'm not OK with bloating the installation size of this library to 100MB for all users, or with having no fallback in case a binary does not exist for your platform (ARM vs. other Macs, ...?).
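The "use the binary if it exists, otherwise fall back" behaviour described above could look roughly like the following. This is only a sketch, not the code from the PR: the `support/dist/<platform>-<arch>/` layout, the `validator` file name, the function names, and the exact `java` arguments are all assumptions made for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical layout: <baseDir>/support/dist/<platform>-<arch>/validator[.exe]
function nativeBinaryPath(baseDir, platform = os.platform(), arch = os.arch()) {
  const exe = platform === 'win32' ? 'validator.exe' : 'validator';
  return path.join(baseDir, 'support', 'dist', `${platform}-${arch}`, exe);
}

// Returns the command to spawn: the native binary if one is present for
// this platform, otherwise the JVM-based validator.
function resolveValidatorCommand(baseDir, platform, arch) {
  const binary = nativeBinaryPath(baseDir, platform, arch);

  if (fs.existsSync(binary)) {
    return { command: binary, args: [], native: true };
  }

  // Fallback: run the Java class on whatever `java` is found on the PATH.
  // The classpath and class name here are assumed, not taken from the PR.
  return {
    command: 'java',
    args: ['-cp', path.join(baseDir, 'support'), 'XMLValidator'],
    native: false
  };
}
```

The returned `command` and `args` could then be handed to `child_process.spawn`, so the rest of the validator does not need to know which variant is running.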

> [...] GraalVM plans on supporting WASM as a target for compilation. When that gets implemented it would potentially resolve any cross-platform issues, but for now i think static binaries are a good way to go.

This will indeed be interesting, if we can get portable binaries of smaller sizes.

nikku commented 6 months ago

I believe in all environments there exists the option to make Java available, too.

Edgar-P-yan commented 6 months ago

@nikku Hi! I agree that including all the binaries in the resulting package is not a great idea. We could write a post-installation script that downloads the binary for the platform it runs on (as puppeteer does, for example). If the platform is not supported, we'd fall back to the JVM version, which would always be included in the package, as it is just a single small .java file.
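A minimal sketch of such a post-installation step, assuming the binaries were published under a per-version URL. The base URL, the file naming scheme, the list of supported targets, and all function names below are hypothetical placeholders, and redirects are not followed.

```javascript
const fs = require('fs');
const https = require('https');
const os = require('os');
const path = require('path');

// Hypothetical release location; not a real download URL.
const BASE_URL = 'https://example.com/xsd-schema-validator/releases';

// Targets for which a prebuilt binary is assumed to exist.
const SUPPORTED = new Set(['darwin-x64', 'darwin-arm64', 'linux-x64', 'win32-x64']);

// Builds the download URL for a target, or returns null if unsupported.
function binaryUrl(version, platform = os.platform(), arch = os.arch()) {
  const target = `${platform}-${arch}`;
  if (!SUPPORTED.has(target)) {
    return null; // caller keeps using the JVM version
  }
  const ext = platform === 'win32' ? '.exe' : '';
  return `${BASE_URL}/v${version}/validator-${target}${ext}`;
}

// Streams the response body to `destination` and marks it executable.
function download(url, destination) {
  return new Promise((resolve, reject) => {
    https.get(url, (response) => {
      if (response.statusCode !== 200) {
        response.resume(); // discard the body
        reject(new Error(`Unexpected status ${response.statusCode}`));
        return;
      }
      const file = fs.createWriteStream(destination, { mode: 0o755 });
      response.pipe(file);
      file.on('finish', resolve);
      file.on('error', reject);
    }).on('error', reject);
  });
}

// Resolves to true if a binary was installed, false if Java is still needed.
async function postinstall(version, targetDir) {
  const url = binaryUrl(version);
  if (!url) {
    console.log('No prebuilt binary for this platform, falling back to Java.');
    return false;
  }
  try {
    fs.mkdirSync(targetDir, { recursive: true });
    await download(url, path.join(targetDir, path.basename(url)));
    return true;
  } catch (err) {
    // A failed download should not break `npm install`.
    console.warn(`Binary download failed (${err.message}), falling back to Java.`);
    return false;
  }
}
```

A script like this could be wired up through the `scripts.postinstall` entry in package.json. Because every failure path returns `false` instead of throwing, installation would still succeed on unsupported platforms or without network access.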

> I believe in all environments there exists the option to make Java available, too.

I agree, but ideally installing a library that just does validation should be as easy as installing any JSON-validation library. Also, installing Java into a Docker image alongside Node.js may require some digging, depending on which base image is being used.

There is also a third option as an alternative to WASM: TeaVM. It compiles Java to JavaScript (it also has experimental compilation to WASM, but that is unstable). The problem with TeaVM is that it does not fully implement the standard library; in particular, it does not implement the modules required for working with XML. There might be a way to take the standard library from OpenJDK and compile it using TeaVM, but I am not sure.

That way we would only need to include a single, completely cross-platform JavaScript file that would be performant and have almost zero startup time. I'll try to make it work when I have some spare time.

nikku commented 6 months ago

@Edgar-P-yan Just as food for thought: If you use NodeJS, you need to have Python installed too, or GCC or a related compiler suite; otherwise some native optimizations don't work. I think it is best to keep the general setup simple enough, especially if it gets the job done really well.

If you find a Java-free alternative I'm happy to also link it prominently in the repository readme.