dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

System.Xml.XPath to support XPath 2, XPath 3 and their XSLT variants #14819

Closed am11 closed 4 years ago

am11 commented 9 years ago

Motivation

System.Xml.XPath currently conforms to the XPath 1.0 [W3C-xpath-1] and XSLT 1.0 [W3C-xslt-1] standards, but not to XPath 2.0 [W3C-xpath-2], XPath 3.0 [W3C-xpath-3], XPath 3.1 [W3C-xpath-3.1], XSLT 2.0 [W3C-xslt-2] or XSLT 3.0 [W3C-xslt-3].

The standards missing from the BCL are required by many consumer scenarios, for which .NET applications currently rely on third-party libraries. One of the chief scenarios is the Content Query Web Part (CQWP) in SharePoint, where users' XSLT code could be drastically reduced if v2 were supported by System.Xml.XPath. For the most part, backward-compatible fallbacks are available; that is, code written concisely in XSLT 2 can be expressed more verbosely in XSLT 1, and so forth.

Pitfalls

Unfortunately, (besides the existing third-party libraries' APIs) I do not have an off-hand -- concrete -- method list to propose, as it requires further brain-storming on whether to auto-select processor based on the input or to explicitly separate the namespaces (System.Xml.XPath2 and System.Xml.XPath3).

The point to ponder is this: since the sub-languages XPath 2 and XPath 3 intrinsically provide backward-compatibility modes (see XPath 2: J.1.3 Backwards Compatibility Behavior and XPath 3: 3.10 Backwards Compatible Processing), should the API differ from the existing one and let consumers select the standard mode?
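
One way to picture the "auto-select processor based on the input" option is to read the version a stylesheet declares and dispatch on it. The sketch below is written in Python purely for illustration; `declared_xslt_version`, `pick_processor` and the processor labels are hypothetical names, not part of System.Xml or any existing library.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

def declared_xslt_version(stylesheet_text):
    """Return the version string declared by a stylesheet, or None."""
    root = ET.fromstring(stylesheet_text)
    # Regular stylesheet: <xsl:stylesheet version="..."> or <xsl:transform version="...">
    if root.tag in (f"{{{XSL_NS}}}stylesheet", f"{{{XSL_NS}}}transform"):
        return root.get("version")
    # Simplified stylesheet: a literal result element carrying an xsl:version attribute
    return root.get(f"{{{XSL_NS}}}version")

def pick_processor(stylesheet_text):
    """Hypothetical dispatch: map the declared version to a processor label."""
    version = declared_xslt_version(stylesheet_text)
    if version is None:
        raise ValueError("not an XSLT stylesheet: no version declared")
    major = int(float(version))
    return {1: "xslt1", 2: "xslt2", 3: "xslt3"}.get(major, "unsupported")

v1 = '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>'
v3 = '<xsl:transform version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>'
print(pick_processor(v1), pick_processor(v3))
```

Separate namespaces would skip this detection step, at the cost of a larger API surface for consumers to choose from.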

maxtoroq commented 7 years ago

FYI, I've created a list of XML libraries, frameworks and tools for .NET

stephen-lim commented 7 years ago

@maxtoroq Thanks for sharing. Exselt is likely dead because it has been kept in beta since 2013. I tried contacting them several times and they never replied. XmlPrime is probably out of reach for many because their redistribution license is around USD 6,000 per annum. As of now, none of the libraries in the list has announced .NET Core compatibility or has a clear path to it.

nverwer commented 7 years ago

In the beginning of 2016, Exselt was still alive. I used it and spoke with its creator, Abel Braaksma at XML Prague. However, I think that the lack of (paid) interest has lowered its priority for Abel. Exselt was used in a workshop at XML Prague, and I thought it was pretty good.

alirobe commented 7 years ago

@karelz Silverlight/XAML issues, InfoPath issues, Classic -> Modern SharePoint issues, and BizTalk issues can all be traced back to the lack of this functionality. Again, all Office documents are XML-based. Think of the developer hours wasted creating new experiences and not updating old ones, because of inadequate transformation tooling. The lack of this functionality is not just bad for third-party developers. It's fundamentally hampering the competitiveness of existing Microsoft products.

ghost commented 6 years ago

The interesting part is that, product-wise, Microsoft is the largest consumer of XML serialization in the world. Just search for how many *proj and *.config files exist for MSBuild execution alone. On top of that, every enterprise product by Microsoft relies on or primarily supports XML data. If IBM and Oracle have invested heavily in XML technologies over the past two decades to keep implementing new standards, Microsoft should too.

VS validates every single project file against an XML schema, yet .NET doesn't support the six-year-old XSD 1.1 standard or any of the other X-technologies beyond their 1.0 versions. If you are curious what XSD 1.1 + XPath 3 can achieve that 1.0 can't, take a look at the biggest feature we miss every day and night in .NET -> "Assertions": https://blogs.infosupport.com/exploring-cool-new-features-of-xsd-1-1/.

The investment is pretty high

Wouldn't it always be the case? Either it will never happen, or it has to start at some point. And if it has to happen at some point, then I think every team in Microsoft that uses XML-based techs can contribute / share the cost for the effort to implement latest recommendations in CoreFX:

https://www.w3.org/standards/techs/xpath https://www.w3.org/standards/techs/xmlschema

stephen-lim commented 6 years ago

@karelz can you reconsider this request? You have a large number of people requesting this feature. Please help us push this through.

karelz commented 6 years ago

@stephen-lim we are aware that this is in top 2-3 most upvoted issues on CoreFX repo and we repeatedly take it into consideration when planning. If/when we decide to invest in the space, we will update the issue.

remcoros commented 6 years ago

Another use case I haven't found mentioned in this issue: with XPath 2 / XSLT 2 support, we would be able to use Schematron (http://schematron.com/) for XML data / business rules validation.

There are a few .NET projects that try to fill this gap, but they lack full Schematron support or are outdated and no longer maintained.

With X* 2+ support, we automatically get support for Schematron file transformations.

XML / XSD / Schematron is heavily used by standards like UBL (Universal Business Language) and derivatives.

Understandably, implementing this is a major undertaking, but having native support in .NET would be a major win in this space for businesses, who now need to rely on something like Saxon and/or create a bridge between Java tools and libraries.

hmobius commented 5 years ago

Adding another comment here because there is still no sign of an XPath 3 \ XSLT 3 implementation post .NET Standard 2.0 release. Any further progress, @karelz @danmosemsft?

danmoseley commented 5 years ago

@hmobius we have no work planned here at this time. The libraries team are working on other things this release such as support for winforms/WPF apps, IoT, ML, JSON, UTF8, updated networking stack, lower allocations, etc. I realize this isn't what you want to hear but we are being transparent about priorities.

CarlosACepeda commented 5 years ago

Any updates on this implementation of XPath 3.0?

karelz commented 5 years ago

Nope, the above still holds. We currently do not have any plans to invest in this area. It may change post-3.0 or later.

esbenbach commented 4 years ago

@stephen-lim we are aware that this is in top 2-3 most upvoted issues on CoreFX repo and we repeatedly take it into consideration when planning. If/when we decide to invest in the space, we will update the issue.

If this is among the top 2-3 most upvoted issues in the corefx repo, then why is it not being prioritized ahead of all the other stuff that is OUTSIDE the top voted items? It's a bit weird, to say the least. We have been asking for this for years, and I guess we all manage without it (we resort to specific Java apps to solve our needs most of the time), but it is a bit annoying to have to work around.

karelz commented 4 years ago

Being top voted != guarantee it will be invested into. SW development is more complicated than that and votes are just one angle how to get info about customer needs and prioritize them. Also look above for my explanation of associated costs (super high), security risks and ongoing security maintenance cost - in https://github.com/dotnet/corefx/issues/2295#issuecomment-336193617

It shouldn't be a surprise that similar passion and frustration "why is it not fixed yet" is expressed on almost every high profile issue and on quite a few 2-3 upvoted issues.

Incidentally, we reopened the funding discussion internally a couple of days ago (no guarantee how it will end!) ... just demonstrating that we are not ignoring feedback/upvotes; it is just sometimes more involved than one would think.

AbhishekTripathi commented 4 years ago

Given all these fairly high costs and the fact 3rd party solutions exist (which seems to be more than reasonable workaround), I think it is more valuable for BCL team to invest into areas which do not have any existing alternatives yet. At least for now.

The only alternative we could use was Saxon. It is built in Java and uses IKVM to interop with the .NET library. Not only is it slow, but I also can't use .NET Standard/Core for my applications. It is not a showstopper, but certainly not the desired state to be in.

RobK410 commented 4 years ago

Seriously this got booted to next year/release? SMH

danmoseley commented 4 years ago

I reached out to XmlPrime (https://www.xmlprime.com/xmlprime/) and they confirmed that they have completed .NET Core support now. This is a commercial offering, so this isn't a solution for everyone. If you try this - it would be great to post your results back here to help others in the community.

TsengSR commented 4 years ago

I reached out to XmlPrime (https://www.xmlprime.com/xmlprime/) and they confirmed that they have completed .NET Core support now. This is a commercial offering, so this isn't a solution for everyone. If you try this - it would be great to post your results back here to help others in the community.

Slightly off-topic, but now I'm curious (since we're in urgent need of such a thing).

The website doesn't seem to have been updated yet with any new information (or downloads) about that one, but I'd also be interested in whether it has an async API, what the performance is, and whether it utilizes the new .NET Core APIs such as Span<T>/Memory<T> and/or pipelines. Especially compared to Saxon.NET (via IKVM on the full .NET Framework) and the .NET XSLT processor.

And when's that one supposed to be released to the public?

P.S. How about a proposal to acquire these guys? :P

JDziurlaj commented 4 years ago

It seems Microsoft does not invest in APIs for which third parties already provide products (see SFTP). However, I think this is different. XML is a core component used throughout the Microsoft ecosystem, and should be treated as such.

LokiMidgard commented 4 years ago

I think XPath 3 (as of 3.1) also supports JSON, which would be a good addition to the new JSON APIs.

I would also like to see Visual Studio tools supporting higher versions of XSLT. In the past I had used a third-party library for .NET Framework, but Visual Studio constantly complained about the XSLT files in my project since it only understood XSLT 1.1 (I think).

stephen-lim commented 4 years ago

XmlPrime is not usable for many projects. Their licensing is restrictive and expensive to a degree that is unreasonable for many open source projects and small businesses.

Please consider adding support for XSLT 3 in .NET Core. This is a much-requested feature. It's long overdue.

RobK410 commented 4 years ago

My concern with XmlPrime is that their website has not been updated since what appears to be 2018 🤷‍♂️ and direct email to their sales address has gone unanswered so far. If their responsiveness to a potential sale and their attention to detail in regard to their website content is any indication of their product quality, we should all have some reservations about paying for that product.

michaelhkay commented 4 years ago

Actually, XmlPrime's pricing reflects the cost of producing an advanced piece of technology. Be careful what you ask for: Microsoft's reluctance to implement these standards is strongly affected by (some) users' reluctance to pay for them.

RobK410 commented 4 years ago

What about a cost proposal to work out the code and a "gofundme" campaign to pay for it? I think there's enough demand for it, we all could throw in $100 and this would get done within the year.

JDziurlaj commented 4 years ago

All the XML specifications (save for a few by OASIS) were developed by the W3C and were meant to be part of modern web infrastructure such as web browsers. The shift from XHTML probably hampered that effort, but nonetheless people expected this to be core infrastructure (i.e. part of platforms).

stephen-lim commented 4 years ago

What about a cost proposal to work out the code and a "gofundme" campaign to pay for it? I think there's enough demand for it, we all could throw in $100 and this would get done within the year.

I'm happy to fund the $100 but how do you know it's enough to get developed? I think XSL is not a simple implementation. It takes a lot of hard work to build.

hmobius commented 4 years ago

It seems pretty clear, given how long this issue has been around, that it really isn't a priority for Microsoft and that this needs to be an open source effort. It's also clear that implementing XSLT is not a trivial thing. There is a list of projects here but the only one we might be interested in is a form of XPath2.net. Saxon is open source but only in Java, so maybe there is scope for a port to .NET rather than the transpiled .NET version currently available. The plus side at least is that the test suite is available, as XSLT, XPath (and XQuery) are clearly defined standards.

RobK410 commented 4 years ago

What about a cost proposal to work out the code and a "gofundme" campaign to pay for it? I think there's enough demand for it, we all could throw in $100 and this would get done within the year.

I'm happy to fund the $100 but how do you know it's enough to get developed? I think XSL is not a simple implementation. It takes a lot of hard work to build.

It most certainly would be an effort to get public support for this. I would think you would start with the individuals who up-voted this issue on Microsoft's user voice site. From there, spreading the initiative among .NET user groups, etc. I would think 1,000 devs/companies offering $100 each would do the trick to get the effort underway and to a working beta release. 🤷‍♂️

terrajobst commented 4 years ago

@michaelhkay

Be careful what you ask for: Microsoft's reluctance to implement these standards is strongly affected by (some) users' reluctance to pay for them.

I don't think that's true. Our primary motivations for doing platform features are:

  1. Is this a core concern for many users?
  2. Would adding it to the platform benefit the feature?
  3. Is this a feature that we likely need as a building block for other platform features?

I'm not aware of cases where pricing of external components have influenced our decision; however, the availability of widely used external libraries (commercial or not) does influence our assessment of how beneficial/harmful our involvement would be.

In the case of XSLT 3, I think our interest (or lack of thereof) is informed by the direction of the web/client industry as a whole. Right now, I can't see a world where supporting it would likely become a priority for us.

RobK410 commented 4 years ago

Thanks, Immo. So that sums it up, gang: it's likely not going to happen. An OSS initiative will be the only solution here. Getting technical expertise and developers to dedicate the effort to implement something similar to Saxon or XmlPrime is a relatively large undertaking.

terrajobst commented 4 years ago

Pretty much, which is why I'm closing this.

jeffska commented 4 years ago

@terrajobst

In the case of XSLT 3, I think our interest (or lack of thereof) is informed by the direction of the web/client industry as a whole. Right now, I can't see a world where supporting it would likely become a priority for us.

I think that's been my frustration for a long time. In my view XSLT is much less useful for the "traditional" web/client activities than it is as a more generalized standard data transformation framework. I've used XSLT in several projects in that type of role, to good effect. However, the restriction of only having XSLT 1.0 as part of the standard environment limits capabilities and further adoption for those other applications. It's a catch-22.

I've been waiting for XSLT > 1.0 for over 10 years now. Sounds like it's still not going to happen in standard libraries.

stephen-lim commented 4 years ago

@michaelhkay

Be careful what you ask for: Microsoft's reluctance to implement these standards is strongly affected by (some) users' reluctance to pay for them.

I don't think that's true. Our primary motivations for doing platform features are:

  1. Is this a core concern for many users?
  2. Would adding it to the platform benefit the feature?
  3. Is this a feature that we likely need as a building block for other platform features?

I'm not aware of cases where pricing of external components have influenced our decision; however, the availability of widely used external libraries (commercial or not) does influence our assessment of how beneficial/harmful our involvement would be.

In the case of XSLT 3, I think our interest (or lack of thereof) is informed by the direction of the web/client industry as a whole. Right now, I can't see a world where supporting it would likely become a priority for us.

If this is the determining factor, then we can argue the case:

  1. Is this a core concern for many users? Yes, XSLT 2+ was one of the top 3 most requested features back when it was voted on through the Visual Studio UserVoice. See the archive link, which shows 2817 votes for "Implement XSLT 3.0 for .NET".

  2. Would adding it to the platform benefit the feature? Absolutely, there is no third-party solution available that is open source, free or otherwise affordable for open source projects and small businesses. XSLT 2 or 3 brings a wealth of improvements that fix the shortcomings of XSLT 1.0, increasing productivity.

  3. Is this a feature that we likely need as a building block for other platform features? Yes, XSLT is a standard. It is widely used in:

danmoseley commented 4 years ago

@stephen-lim your examples show that XML and XSLT are widely used but not v3 specifically.

stephen-lim commented 4 years ago

@stephen-lim your examples show that XML and XSLT are widely used but not v3 specifically.

SQL Server partially supports XPath v2. There isn't wide support for v3 because Windows/ASP.NET software like SharePoint, DNN, Umbraco ultimately rely on the .NET libraries, which only support XSLT v1. On the other hand, you can find many more examples of v2 and v3 support in Java apps.

The short story is that thousands of developers have been asking Microsoft to support v2 for the last 10 years. At one point, Microsoft said they would strongly consider implementing XSLT 2, but that stopped as soon as they started working on LINQ and XQuery. Fast forward to today: the v3 spec is out and the hope is that Microsoft will add support for v3, if not v2.

TsengSR commented 4 years ago

@stephen-lim your examples show that XML and XSLT are widely used but not v3 specifically.

Well, to be honest, XSLT 2.0 and XPath 2.0 would be a huge improvement already. XSLT 1.0 is very, very limiting (the major blocker being the lack of user-defined functions: you just have templates, but these can't be used as part of XPath expressions). The same applies to XPath 1.0 (lots of functions missing, no wildcard for namespaces, i.e. no `/*:elementName`).

Sure, XPath 3.0 and XSLT 3 would be awesome (e.g. exception throwing and try/catch from XSLT). But XSLT 1.0 is just seriously lacking too many features to really consider it.

I'm rather tempted to extract the whole XSLT processor as a Java-based microservice, rather than falling back to XSLT 1.0/XPath 1.0 (Saxon.NET via IKVM.NET on .NET Framework is not an option).
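
To make the namespace-wildcard point concrete: XPath 2.0 allows a name test such as `*:title`, whereas XPath 1.0 needs the longer `*[local-name()='title']`. The Python sketch below only emulates that matching rule with the standard library (it does not run an XPath 2.0 engine); the sample document and the helper `by_local_name` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample: the same local name appears in two different namespaces.
DOC = """
<catalog xmlns:a="urn:example:a" xmlns:b="urn:example:b">
  <a:title>First</a:title>
  <b:title>Second</b:title>
  <b:author>Someone</b:author>
</catalog>
"""

def local_name(tag):
    """Strip ElementTree's '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]

def by_local_name(root, name):
    """Emulate //*:name, i.e. match elements by local name in any namespace."""
    return [el for el in root.iter() if local_name(el.tag) == name]

root = ET.fromstring(DOC)
titles = [el.text for el in by_local_name(root, "title")]
print(titles)
```

Both `title` elements are selected even though they live in different namespaces, which is what the wildcard form expresses in a single step.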

michaelhkay commented 4 years ago

As far as Saxonica is concerned, we are eagerly awaiting technical details of what Microsoft is proposing to offer under the "Java interoperability" feature promised in the .NET 5 announcement; that will determine our forwards path for Saxon on .NET. If anyone knows of any details that have been published since the May 2019 announcement, please share!

RobK410 commented 4 years ago

Regarding Saxon and .NET Core, IKVM is obviously shelved. Why not take the runtimes, decompile them to C# using something like DotPeek, refactor them to .NET Core, and then implement Saxon to use those libs? I'm sure the IKVM folks wouldn't mind, considering they've abandoned ship?

michaelhkay commented 4 years ago

We're looking at a number of options (which is why we really want to know what .NET 5 will offer), but obviously we're very keen to avoid forking the source code.

TsengSR commented 4 years ago

Regarding Saxon and .NET Core, IKVM is obviously shelved. Why not take the runtimes, decompile them to C# using something like DotPeek, and refactor them to .NET Core

Not sure what you mean. IKVM.NET is open source... there is just no one to take it over. IKVM.NET author already offered others to take over the project under the condition it's renamed to something else.

But not sure how much sense that makes anyways, since (as far as I know) it required a lot of changes for each new JRE version, which now ship bi-annually rather than once every 3-5 years

hmobius commented 4 years ago

@michaelhkay Completely understand about forking the source code but I for one would be very interested in working on a port around XSLT\XPath in .NET using the new features we have in C#. I'm curious if rather than forking the source code, we could fork \ port the code for the test suite and work from there.

michaelhkay commented 4 years ago

There are good test suites for XSLT 3.0, XPath 3.1, and XQuery 3.1 on GitHub, and we're happy to share our test drivers. The bulk of the test material is in XML files and is 100% portable; creating a test driver to run the tests on a particular platform is a fairly trivial exercise. The only other requirement is API testing, which is specific to each platform/API/language-binding. But the source code for the product itself is 600K lines of Java so that's a major undertaking.

JDziurlaj commented 4 years ago

I would assume the problem with Saxon isn't so much the source code being in Java as its dependencies on third-party Java libraries, which may be hard to decouple.

michaelhkay commented 4 years ago

No, that's not the case. Saxon's dependencies on third party (non-JDK) libraries are very easily isolated and decoupled. Where such dependencies exist (e.g on the ICU-J library) you can either port the third party code as if it were part of Saxon, or you can make do without it.

kant2002 commented 4 years ago

There are good test suites for XSLT 3.0, XPath 3.1, and XQuery 3.1 on GitHub, and we're happy to share our test drivers. The bulk of the test material is in XML files and is 100% portable; creating a test driver to run the tests on a particular platform is a fairly trivial exercise. The only other requirement is API testing, which is specific to each platform/API/language-binding. But the source code for the product itself is 600K lines of Java so that's a major undertaking.

Where are the test cases? Can you give a link?

michaelhkay commented 4 years ago

https://github.com/w3c/xslt30-test (XSLT 3.0) https://github.com/w3c/qt3tests (XQuery 3.1, XPath 3.1) https://github.com/w3c/xsdtests (XSD 1.1)

In each case the test suites also include tests for earlier versions, labelled as such in the test metadata.
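
A test driver of the kind described above can be quite small. The following Python sketch assumes a simplified, made-up catalog format (the real W3C catalogs use their own schema and richer assertions) and a pluggable `evaluate` callable standing in for whatever processor is under test; here it is backed by ElementTree's limited XPath subset just so the example runs. The `spec` labels are likewise hypothetical and only illustrate filtering tests by the version recorded in their metadata.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified catalog; not the actual W3C test-suite schema.
CATALOG = """
<test-set>
  <source><r><item/><item/><other/></r></source>
  <test-case name="count-items" spec="XP10+">
    <test>./item</test>
    <expected-count>2</expected-count>
  </test-case>
  <test-case name="count-other" spec="XP10+">
    <test>./other</test>
    <expected-count>1</expected-count>
  </test-case>
  <test-case name="needs-newer-spec" spec="XP30+">
    <test>./item</test>
    <expected-count>2</expected-count>
  </test-case>
</test-set>
"""

def evaluate(context, expression):
    """Stand-in processor: ElementTree's XPath subset, returning a node count."""
    return len(context.findall(expression))

def run_suite(catalog_text, supported_specs, evaluate):
    """Run every test case whose 'spec' label the processor claims to support."""
    catalog = ET.fromstring(catalog_text)
    context = catalog.find("source")[0]  # first child of <source> is the context node
    results = {}
    for case in catalog.iter("test-case"):
        if case.get("spec") not in supported_specs:
            results[case.get("name")] = "skipped"
            continue
        actual = evaluate(context, case.findtext("test"))
        expected = int(case.findtext("expected-count"))
        results[case.get("name")] = "pass" if actual == expected else "fail"
    return results

results = run_suite(CATALOG, {"XP10+"}, evaluate)
print(results)
```

Swapping `evaluate` for a call into a different engine leaves the catalog and the loop untouched, which is the portability argument made above.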

am11 commented 4 years ago

+1, spec suites are the way to go for realistic and reliable conformance testing. I have some experience with writing spec suite adapters for the .NET implementations of Sass and YAML. If the porting effort happens in the open, I am willing to contribute. :)

MicahEdwards commented 4 years ago

I reached out to XmlPrime (https://www.xmlprime.com/xmlprime/) and they confirmed that they have completed .NET Core support now. This is a commercial offering, so this isn't a solution for everyone. If you try this - it would be great to post your results back here to help others in the community.

A .NET Core trial version of XmlPrime 4.1.3 is now available as a signed NuGet package.

Just send me a message or drop us an email ( info@xmlprime.com ) saying what area you would like to test it in and we will send you a download link.

Micah Edwards. XmlPrime.

DanAtkinson commented 4 years ago

@MicahEdwards What are the costs for the full product? You don't display them online. Some pages say that I can purchase licenses online but then when I go to those pages, I'm told that I can't purchase it online.

So I guess the simple question is where can I see a breakdown of your prices? I shouldn't need to contact you to get these - they should just be available on your website.