FontoXML / fontoxpath

A minimalistic XPath 3.1 implementation in pure JavaScript
MIT License
access the AST #217

Closed JoernT closed 4 years ago

JoernT commented 4 years ago

Sorry about my humble question - new to fontoxpath and unfortunately also not a TypeScript guy, so please forgive me if that's an obvious one:

How can i access the AST for a given XPath expression? I see that there are functions for that in the source, but these do not seem to be present in the dist (fontoxpath.js).

Not sure what it takes to get at that functionality.

Thanks

JosVerburg commented 4 years ago

Hey,

The public API of fontoxpath does indeed not contain any feature to get the AST of an expression. We do print the AST in the "demo" page we use when working on fontoxpath.

You can see here that we get the AST through an internal API (and then format it a bit). You can run this page locally yourself.

Do you need the AST for something specific? So far we've not seen any need to expose the AST through the public API.

And feel free to ask if you have any more questions or need some more help :)

JoernT commented 4 years ago

Thanks for the reply. Sure, usually you don't need access to the AST.

I'm just starting off with fontoxpath and use it for implementing a forms framework. Here you have expressions like this that attach constraints to some XML nodes:

<bind ref="foo" required="../bar = 'baz'">

Of course i can just iterate all my bindings and (re)evaluate them. However for more complex cases this will become very expensive. To circumvent that I'd like to know which nodes my given expression depends on (the above bind depends on node 'bar' which is a sibling of 'foo') and build a dependency graph that can be recalculated when nodes change. This way i can only reevaluate changed nodes and their dependencies instead of all (which makes a huge difference).
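The idea can be sketched in a few lines of JavaScript. This is only an illustration under assumptions: the `dependsOn` lists are written by hand here, whereas a real engine would derive them from each XPath expression, the `total` and `note` binds are made up, and `buildDependents` and `affectedBy` are hypothetical helper names, not fontoxpath API.

```javascript
// Hypothetical binds: each entry says which nodes the expression on `ref` reads.
// In a real engine these lists would be derived from the XPath expressions.
const binds = [
  { ref: 'foo', dependsOn: ['bar'] },   // required="../bar = 'baz'"
  { ref: 'total', dependsOn: ['foo'] }, // made-up second bind that reads 'foo'
  { ref: 'note', dependsOn: [] },       // independent of everything else
];

// Invert the lists: node name -> refs whose expressions read that node.
function buildDependents(bindList) {
  const dependents = new Map();
  for (const { ref, dependsOn } of bindList) {
    for (const dep of dependsOn) {
      if (!dependents.has(dep)) dependents.set(dep, new Set());
      dependents.get(dep).add(ref);
    }
  }
  return dependents;
}

// Collect everything that transitively depends on the changed nodes.
function affectedBy(changed, dependents) {
  const affected = new Set();
  const queue = [...changed];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const ref of dependents.get(node) || []) {
      if (!affected.has(ref)) {
        affected.add(ref);
        queue.push(ref);
      }
    }
  }
  return affected;
}

const dependents = buildDependents(binds);
const toReevaluate = affectedBy(['bar'], dependents);
console.log([...toReevaluate]); // only these binds need re-evaluation
```

A change to 'bar' marks 'foo' and, through it, 'total' for re-evaluation, while 'note' is left alone.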

So, it would be super useful for me to have a way to access it.

DrRataplan commented 4 years ago

Hi, cool! I like what you are doing, reminds me a lot of something I presented at the XML Prague conference a couple of years ago. We were building a schematron engine that could do live updates without recomputing everything all the time. Soft validation in an editor environment in https://archive.xmlprague.cz/2017/files/xmlprague-2017-proceedings.pdf. We went for a dynamic dependency tracking approach instead of a static one; you could consider that?

To answer your question, I would not mind exposing this API, we are currently parsing XPaths to XQueryX (in JsonML) internally. It would be trivial to implement a public function that accepts the query and a documentWriter and nodesFactory that can parse the query, convert the JsonML AST to an actual DOM and output that.

JoernT commented 4 years ago

Martin,

On Fri, Feb 14, 2020 at 12:03 PM Martin Middel notifications@github.com wrote:

Hi, cool! I like what you are doing, reminds me a lot of something I presented at the XML Prague conference a couple of years ago. We were building a schematron engine that could do live updates without recomputing everything all the time. Soft validation in an editor environment in https://archive.xmlprague.cz/2017/files/xmlprague-2017-proceedings.pdf. We went for a dynamic dependency tracking approach instead of a static one; you could consider that?

thanks for the pointer to the paper - i'll give it a read. I don't fully get the meaning of 'dynamic' versus 'static' here, but i guess i'll need to read it first.

Actually my new attempt isn't that new anymore. Back in the day i implemented betterFORM, a highly XForms 1.0 conformant implementation in Java (running on the server) that at its heart already used a graph-based recalculation/revalidation engine based upon Saxon. Btw, Saxon also didn't expose the AST, so i had to override some internal APIs, which in the end led to deprecating the whole project due to the high effort required to move from one Saxon version to another.

To answer your question, I would not mind exposing this API, we are currently parsing XPaths to XQueryX (in JsonML) internally. It would be trivial to implement a public function that accepts the query and a documentWriter and nodesFactory that can parse the query, convert the JsonML AST to an actual DOM and output that.

i would absolutely love to have that function without tweaking the fontoxpath codebase to roll my own version - besides, i'm not at all able to do that right now as my understanding of the internals of fontoxpath and its build process is rather rudimentary. But i'm very pleased with the results i could get with it in just a few hours.

Do you also use the Schematron approach on the server? For a forms framework a second-level (server-side) validation is of course essential - otherwise it's not worth a penny (though big parts of the world seem to ignore that). That's why i was still working on a half-object implementation that does the hard work on the server and exchanges state with the client. Of course the other reason was that i simply did not know about fontoxpath and there's no real alternative for XPath 3 on the client. Nevertheless server-side validation would still need a solution. Up to now i have thought about generated XQueryX for that purpose.

Thanks for all the insights you guys already gave me.

I'm going on with my Friday hacking and will see what i can accomplish. There's still a lot to check and do before i get to the advanced validation, but again, it would be an immense help to have a build with the AST exposed. Btw, the whole thing will be in Web Components (lit-element style) and use ES6 imports.

Thanks Joern

DrRataplan commented 4 years ago

Hi Joern,

About the static versus dynamic part: In Fonto, we have a dependency tracking/indexing framework that we use for our invalidations. It works by listening in on all of the DOM accesses that are happening during an expression. That's also why we allow the injection of a 'domFacade'; it's the place where we can intercept the 'getParentNode' call for instance. We dynamically compute the dependencies of an expression. This is in contrast with another approach where you would use the AST to 'scan' which changes could affect the expression, statically. I see the betterFORM implementation used the second variant, which is very cool.
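A rough sketch of the dynamic variant, under stated assumptions: the tree below is a stand-in built from plain objects rather than real DOM nodes, the expression `../bar = 'baz'` is evaluated by a hand-written function instead of an XPath engine, and apart from `getParentNode` the facade method names are illustrative, not a statement about the actual domFacade interface.

```javascript
// Tiny stand-in tree; a real application would use actual DOM nodes.
const dataNode = { name: 'data', parent: null, children: [] };
const fooNode = { name: 'foo', parent: dataNode, children: [], text: '' };
const barNode = { name: 'bar', parent: dataNode, children: [], text: 'baz' };
dataNode.children.push(fooNode, barNode);

// A facade that records every node whose parent, children or text is read.
// Method names other than getParentNode are illustrative assumptions.
function createTrackingFacade() {
  const touched = new Set();
  return {
    touched,
    getParentNode(node) { touched.add(node); return node.parent; },
    getChildNodes(node) { touched.add(node); return node.children; },
    getData(node) { touched.add(node); return node.text; },
  };
}

// Hand-written equivalent of ../bar = 'baz'; traversal and text reads
// go through the facade so that they can be observed.
function isRequired(contextNode, facade) {
  const parent = facade.getParentNode(contextNode);
  return facade
    .getChildNodes(parent)
    .filter((child) => child.name === 'bar')
    .some((child) => facade.getData(child) === 'baz');
}

const facade = createTrackingFacade();
const required = isRequired(fooNode, facade);
const touchedNames = [...facade.touched].map((node) => node.name);
console.log(required, touchedNames);
```

The recorded set is what the expression actually depended on during this one evaluation; it has to be recorded again on every re-evaluation because a different run may touch different nodes.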

At this moment, Fonto only includes client-side validation, both Schematron and XSD. Fonto usually does not provide its own CMS, so customers are strongly advised to validate on the server themselves; how they do that when Fonto saves the document is up to them. But agreed, not everyone does that... We do see that while validating client-side is not the easiest task, it does allow for way more responsive and helpful UIs. There is no need for a 'validate' button when you do client-side validation (as long as you are smart about recomputing only the XPaths whose results may have changed).

JoernT commented 4 years ago

Martin,

thanks for your insights and the clarification of static versus dynamic - i just didn't get it from the context.

Anyway - i spent some more hours with fontoxpath and i'm very impressed with it. It works very well for the use case. I still have to convince my fellows (though it's mainly my domain) but i think with fontoxpath it's a real option to implement an XForms engine in the client. That's not a project to be done in 2 weeks (you know yourself what it means to implement W3C specs ;)

Regarding client-side validation: XForms already defines a lifecycle for when this happens. Essentially it defines a state engine that makes sure that data and validations cannot get out of sync. Nevertheless IMO it's absolutely necessary for a forms solution to have second-level validation on the server. You can't rely on users to do that by themselves - that simply won't happen, as this task is neither fun nor something people think about a lot. I guess your use case is a bit different so this probably does not apply in the same way.

For server-side validation i currently think of generated XQueryX code. As XForms is declarative it's rather easy to iterate the binding expressions and generate the corresponding validations. We already did that before as a proof of concept. The fact that this validation only has to happen once (when the user wants to submit the form data) simplifies this.

Yes, i can just emphasize it again - it would be super useful to get the AST via an API function. Would love it, as i'm not a fan of hacking your source to get it. Ran into trouble with such an approach with Saxon before ;)

Joern

DrRataplan commented 4 years ago

Hi Joern, I just merged the new parseScript feature. Is this something that you can use for your cause? Please keep us posted on your progress. If we can help you in any way, please reach out!

JoernT commented 4 years ago

parseScript is for parsing XQuery, right? What i would actually need is something that gives me the AST for an XPath expression.

Consider this example descriptive binding.

<xf-bind ref="aNode" constraint=". < 10" required="../b ='foo'" relevant="../c/d ='yes'"></xf-bind>

A typical form will contain n of these to establish 'the validity contract' of the form.

What my engine is supposed to do: it will parse all the xf-bind elements and evaluate the above 'constraint', 'required' and 'relevant' properties in context of 'ref'. To efficiently recalculate those properties (there are some more of them possible) i'd like to know the dependencies of a given expression, e.g. for the 'required' property above, which obviously depends on node 'b'. When i know all these deps for all props i can build a dependency graph and, on recalculation, re-process only those nodes that have changed since the last graph creation, in the right order.

That's just for further explanation of what i'm trying to achieve.
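To illustrate what such a static scan could look like, here is a minimal sketch that walks a JsonML-style tree and collects the element names tested by path steps. The tree for `../b` is hand-written and simplified: its element names only loosely follow XQueryX, and the AST an actual parser produces may well be shaped differently. `childrenOf` and `collectNameTests` are hypothetical helpers.

```javascript
// Hand-written, simplified JsonML-style tree for the path ../b
// (element names loosely follow XQueryX; real parser output may differ).
const ast = [
  'pathExpr',
  ['stepExpr', ['xpathAxis', 'parent'], ['anyKindTest']],
  ['stepExpr', ['xpathAxis', 'child'], ['nameTest', 'b']],
];

// JsonML: [name, optional attributes object, ...children].
function childrenOf(node) {
  const rest = node.slice(1);
  const first = rest[0];
  const hasAttributes =
    typeof first === 'object' && first !== null && !Array.isArray(first);
  return hasAttributes ? rest.slice(1) : rest;
}

// Walk the tree and collect every element name tested by a step.
function collectNameTests(node, found = []) {
  if (!Array.isArray(node)) return found; // text content, nothing to scan
  const children = childrenOf(node);
  if (node[0] === 'nameTest') found.push(children[0]);
  for (const child of children) collectNameTests(child, found);
  return found;
}

const names = collectNameTests(ast);
console.log(names);
```

Collecting names alone over-approximates: a usable analysis would also have to take the axes into account to know which 'b' relative to the context node is meant.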

In the meantime i hacked a bit further and managed to set up the basic XForms model processing. Works great so far. Now i'm challenged with writing my first function 'instance', which returns the root node of one of the instances in a form.

Btw - if anybody likes to have a look - my efforts are on our public gitlab here - fore - hello sample (https://gitlab.existsolutions.com/eXistdbElements/exform/blob/fonto/demo/hello-fonto.html). This is where i put my humble efforts. It's still kind of a mess so be warned. There are just a few components that are actually new (src dir) like xf-form, xf-model, xf-bind and xf-instance. The demo can be run with npm run start if you have that stack running.

Thanks a bunch,

Joern

DrRataplan commented 4 years ago

You can also use parseScript to parse XPath: just set the options.language property to the correct language (see the example in the code). XQuery and XPath are similar; the only substantial differences (apart from function declarations etc.) are in character encoding. FontoXPath parses both.

Thanks for that link! I have to admit that my Web Components knowledge is rusty, but I'll be sure to give it a look!

Quick question: would the approach you are describing not run into problems when you introduce more complex queries, like ones that use conditions? For example, take the query if (@useParent) then ../@value else @value. In this case, you only need to recompute the whole query when the useParent attribute changes, and then, depending on its value, when either the value attribute of the context item changes, or when the child list or the value attribute of the parent changes. Or does XForms not allow those constructs? We at Fonto use another approach where we intercept DOM access to circumvent those problems. This even allows JavaScript interop using custom functions that can still be dependency tracked. I am highly curious how you approach that issue, because I think we can learn a lot from such a solution!
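The conditional example can be made concrete with a small sketch. It is an illustration only: the two elements are plain objects rather than DOM nodes, the query is hand-translated into JavaScript, and for brevity only attribute reads are recorded (not the step to the parent). `evaluateTracked` is a hypothetical name.

```javascript
// Stand-in elements with attributes; not real DOM nodes.
const parentElement = { name: 'parent', attributes: { value: 'from-parent' }, parent: null };
const itemElement = { name: 'item', attributes: { value: 'own' }, parent: parentElement };

// Hand translation of: if (@useParent) then ../@value else @value
// `read` is the single place where attribute access is observed.
function evaluateTracked(contextNode) {
  const reads = [];
  const read = (node, attribute) => {
    reads.push(`${node.name}/@${attribute}`);
    return node.attributes[attribute];
  };
  // An attribute that exists makes the condition true, as in XPath.
  const result = read(contextNode, 'useParent') !== undefined
    ? read(contextNode.parent, 'value')
    : read(contextNode, 'value');
  return { result, reads };
}

const before = evaluateTracked(itemElement);
itemElement.attributes.useParent = 'true';
const after = evaluateTracked(itemElement);
console.log(before, after);
```

The two runs record different reads: without useParent the result depends on the item's own value attribute, with it on the parent's. A static scan has to assume both branches, while the recorded set follows the branch that was actually taken.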

Tbh, I have not worked with XForms that often, or at all for that matter, so forgive my ignorance. It is still on my list though.

Thanks! Martin

JoernT commented 4 years ago

Ah, thanks for the clarification. I just quickly looked at the source and got the impression that it mainly deals with XQuery constructs. I'll try that out.

We use Web Components in all of our products. The admin tools of eXist-db (dashboard) and all of its parts are Web Components and our other main product TEI-Publisher relies heavily on them. The latest generation is lit-element, which wraps a tiny templating layer around the standard Web Components API. As we try to provide long-term solutions we use Web standards wherever we can. Web Components are a W3C/WHATWG standard and are part of HTML5. Can you be more 'platform'? Our experiences are very positive once you have mastered some specifics, and it allows us to offer a set of components that users can arrange for their specific needs (layout and rendering of document views).

Trying to answer your dependency question, but first some remarks about XForms maybe: this standard has by no means been a success in terms of market adoption. It simply hasn't got the political support it would have needed from the major browser vendors, so it has stayed kind of invisible. However, there is simply no forms framework around that is as powerful as XForms when it comes to serious data management, especially with XML. If you'd like to get a grasp of what it can do, please see Steven Pemberton's homepage. He has been the main promoter of XForms over the years and has some impressive material and demos.

Regarding dependency tracking: XForms also does this kind of intercepting of DOM mutations. As there are defined, descriptive actions (setvalue, insert, delete) to change the value of nodes or to insert and delete them, the respective changes are picked up and put into a list of changes. Those actions set flags that signal that a recalculation/revalidation needs to take place. This doesn't happen immediately but only when the block of actions (DOM mutations) has completed.

But there's even more to it. Each node has a 'ModelItem' object attached which carries the whole state of the node. Besides the value, these are the 'facets' readonly, required, relevant, calculate, valid and datatype. The modelitem reflects the current state of a node in XForms. You can even add your own modelitem properties if you need to (a very rare case, however). The properties use XPath expressions that are resolved in the context of the referenced node(s) (see the bind examples further up in the thread).

A block of actions might look like this:

<xf-action>
  <xf-setvalue ref="myElement" value="'foobar'"/>
  <xf-insert ref="items/item" at="last()" origin="templates/item"/>
</xf-action>

The actions will change the state of the modelitems and when the graph recomputes it will use the list of changed nodes to determine the subgraph to be recalculated. Hope this sheds a bit of light on it ;)
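The deferred behaviour described above could be sketched roughly as follows. This is not how any particular XForms processor is implemented: the `Model` class, its method names and the `recalculations` log are assumptions made for illustration, and the actual recalculation is reduced to recording which nodes were flagged.

```javascript
// Hypothetical model: actions only flag changes, recalculation runs once per block.
class Model {
  constructor() {
    this.values = new Map();
    this.changed = new Set();   // nodes touched since the last recalculation
    this.recalculations = [];   // log of each recalculation, for illustration
  }

  setValue(ref, value) {
    this.values.set(ref, value);
    this.changed.add(ref);      // flag only; nothing is recomputed yet
  }

  insert(ref) {
    this.changed.add(ref);      // structural change, also only flagged
  }

  // Run a block of actions, then recalculate once if anything changed.
  runActions(actions) {
    for (const action of actions) action(this);
    if (this.changed.size > 0) this.recalculate();
  }

  recalculate() {
    // A real engine would walk the dependency subgraph of the changed nodes.
    this.recalculations.push([...this.changed].sort());
    this.changed.clear();
  }
}

const model = new Model();
model.runActions([
  (m) => m.setValue('myElement', 'foobar'),
  (m) => m.insert('items/item'),
]);
console.log(model.recalculations);
```

Both actions of the block end up in a single recalculation, which is the point of deferring: intermediate states inside the block are never revalidated.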

Btw, the whole thing is not my invention but a non-normative addendum of the XForms spec. I just put it into Java code back then and it will certainly be quite a package of work to redo that in JS. To be honest it's not highest on my list right now as i'm still at the beginning - you can actually get quite far even without being perfect ;) (just bulk validation) until you reach the more demanding cases. But from my experience the fully optimized version is only needed in maybe 20% of the cases, where you do significant logic for validation/calculation and your forms have a certain level of complexity.

Primary problem for me is to master extension functions with fontoXPath as there are quite a lot to be done for XForms.

Thanks a lot and sorry for the long read.

Joern