eXist-db / expath-crypto-module

expath
GNU Lesser General Public License v2.1

Investigate which spec this package implements and identify deviations #41

Open joewiz opened 3 years ago

joewiz commented 3 years ago

Is your feature request related to a problem? Please describe.

The EXPath Crypto spec has at least two significant versions: v1.0, dated 14 Feb 2015, and an unnumbered version, dated 20 Mar 2017. Users and maintainers need a clear sense of which spec this package implements and what deviations, if any, exist in eXist's implementation. The README references both versions of the spec but refers to the second as "the latest version of this specification for this module" and says, "The implementation follows this specification." (For the spec's sources, see https://github.com/expath/expath-cg/tree/master/specs/crypto.)

However, there appear to be deviations between the spec and eXist's implementation. The test suite references error codes that appear in neither version of the spec. It also contains unexplained fragments concerning keystores, which the latest spec does not mention. The README lists "currently implemented functions," but the limitations it lists do not clearly align with the function documentation.

To disentangle these issues and clarify what users can use, we should investigate which spec is currently implemented and what, if any, actions might be needed to align the package and the specification. (Perhaps we might even identify improvements needed in the specification and ways to better align with the BaseX implementation—itself "based on an early draft" of the EXPath spec. See the latest discussion at https://github.com/expath/expath-cg/issues/132.)

Describe the solution you'd like

joewiz commented 3 years ago

In this post I look at each of the functions that appear in the latest spec and compare the function signatures to the corresponding functions in eXist's implementation, as well as the README's remarks, the test suite, and the BaseX implementation.
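The signature comparison described above can be partly mechanized. The following is a minimal Python sketch, not part of the package or its tooling: the helper names `parse_signature` and `diff_signatures` are hypothetical, and the parser only handles the simple `name($param as type, ...) as type` shape used in this post. The two sample signatures are the two-argument `crypto:hash` forms quoted below.

```python
import re


def parse_signature(sig):
    """Split 'name($p as T, ...) as R' into (name, [(param, type), ...], return type)."""
    text = " ".join(sig.split())  # collapse line breaks and alignment padding
    match = re.fullmatch(r"([\w:-]+)\((.*)\) as (.+)", text)
    if match is None:
        raise ValueError(f"unrecognised signature: {sig!r}")
    name, params, returns = match.groups()
    parsed = []
    # Splitting on ', $' leaves commas inside types such as map(xs:string, item()) alone.
    for param in params.split(", $"):
        pname, ptype = param.split(" as ", 1)
        parsed.append((pname.lstrip("$"), ptype.strip()))
    return name, parsed, returns.strip()


def diff_signatures(spec, impl):
    """List the differences between a spec signature and an implementation signature."""
    _, spec_params, spec_ret = parse_signature(spec)
    _, impl_params, impl_ret = parse_signature(impl)
    notes = []
    if len(spec_params) != len(impl_params):
        notes.append(f"arity: spec {len(spec_params)}, implementation {len(impl_params)}")
    for (s_name, s_type), (i_name, i_type) in zip(spec_params, impl_params):
        if s_name != i_name:
            notes.append(f"name: ${s_name} vs ${i_name}")
        if s_type != i_type:
            notes.append(f"${s_name}: {s_type} vs {i_type}")
    if spec_ret != impl_ret:
        notes.append(f"return: {spec_ret} vs {impl_ret}")
    return notes


SPEC_HASH = """hash($data      as xs:anyAtomicType,
     $algorithm as xs:string) as xs:string"""

EXIST_HASH = """hash($data      as xs:anyType,
     $algorithm as xs:string) as xs:byte*"""

for note in diff_signatures(SPEC_HASH, EXIST_HASH):
    print(note)
```

For this pair the sketch reports a type difference on `$data` and a different return type. It compares types as plain strings, so it cannot judge whether a difference is significant.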

crypto:hash

latest spec

hash($data      as xs:anyAtomicType,
     $algorithm as xs:string) as xs:string

hash($data      as xs:anyAtomicType,
     $algorithm as xs:string,
     $encoding  as xs:string) as xs:string

eXist function documentation

hash($data      as xs:anyType, 
     $algorithm as xs:string) as xs:byte*

hash($data      as xs:anyType, 
     $algorithm as xs:string, 
     $encoding  as xs:string) as xs:byte*

divergences

crypto:hmac

latest spec

hmac($data       as xs:anyAtomicType,
     $key        as xs:anyAtomicType,
     $algorithm  as xs:string) as xs:byte*

hmac($data       as xs:anyAtomicType,
     $key        as xs:anyAtomicType,
     $algorithm  as xs:string,
     $encoding   as xs:string) as xs:string

eXist function documentation

hmac($data      as xs:anyAtomicType*, 
     $key       as xs:anyAtomicType*, 
     $algorithm as xs:string) as xs:byte*

hmac($data      as xs:anyAtomicType*, 
     $key       as xs:anyAtomicType*, 
     $algorithm as xs:string, 
     $encoding  as xs:string) as xs:byte*

divergences

crypto:generate-signature

latest spec

crypto:generate-signature($data       as document()?,
                          $parameters as map(xs:string, item()+)?) as document()* 

eXist function documentation

generate-signature($data                       as item(), 
                   $canonicalization-algorithm as xs:string, 
                   $digest-algorithm           as xs:string, 
                   $signature-algorithm        as xs:string, 
                   $signature-namespace-prefix as xs:string, 
                   $signature-type             as xs:string) as item()

generate-signature($data                       as item(),
                   $canonicalization-algorithm as xs:string, 
                   $digest-algorithm           as xs:string, 
                   $signature-algorithm        as xs:string, 
                   $signature-namespace-prefix as xs:string, 
                   $signature-type             as xs:string, 
                   $xpath-expression           as xs:anyType) as item()

generate-signature($data                       as item(),
                   $canonicalization-algorithm as xs:string, 
                   $digest-algorithm           as xs:string, 
                   $signature-algorithm        as xs:string, 
                   $signature-namespace-prefix as xs:string, 
                   $signature-type             as xs:string, 
                   $xpath-expression           as xs:anyType, 
                   $digital-certificate        as xs:anyType) as item()

generate-signature($data                       as item(),
                   $private-key                as xs:string, 
                   $signature-algorithm        as xs:string) as item()

divergences

crypto:validate-signature

latest spec

crypto:validate-signature($data as document()) as xs:boolean

eXist function documentation

validate-signature($data as node()) as xs:boolean

divergences

crypto:encrypt

latest spec

encrypt($data       as xs:anyAtomicType,
        $type       as xs:string,
        $parameters as map(xs:string, item())?) as xs:base64Binary

eXist function documentation

encrypt($data            as xs:anyAtomicType, 
        $encryption-type as xs:string, 
        $secret-key      as xs:string, 
        $algorithm       as xs:string, 
        $iv              as xs:string?, 
        $provider        as xs:string?) as xs:string

divergences

crypto:decrypt

latest spec

decrypt($data       as xs:anyAtomicType,
        $type       as xs:string,
        $parameters as map(xs:string, item())?) as xs:string

eXist function documentation

decrypt($data            as xs:anyAtomicType,
        $decryption-type as xs:string,
        $secret-key      as xs:string,
        $algorithm       as xs:string,
        $iv              as xs:string?,
        $provider        as xs:string?) as xs:string

divergences

other notes

joewiz commented 3 years ago

In conclusion, there are numerous minor divergences between the latest spec and the eXist implementation.

There is a surprising number of differences in parameter types and cardinalities in the function signatures. Questions for further investigation: Which of these, if any, are significant? (The xs:string vs. xs:byte* return types for crypto:hash and crypto:hmac seem significant; cf. BaseX's use of xs:base64Binary for the return type of its hashing functions.) And would updating eXist's implementation to match the spec break tests or fix them (and, with them, code using these functions)?
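To make the return-type question concrete, here is a small Python sketch (Python standard library only, not the XQuery module) showing one SHA-256 digest in three representations: a hex string, a base64 string, and a sequence of signed bytes in the xs:byte value range. How eXist actually populates its `xs:byte*` result is an assumption here; the sketch only shows that callers written for one representation cannot consume another without a conversion.

```python
import base64
import hashlib

digest = hashlib.sha256(b"abc").digest()  # raw 32-byte digest

# String representations, as an xs:string return type would carry them.
as_hex = digest.hex()
as_base64 = base64.b64encode(digest).decode("ascii")


def to_xs_bytes(raw):
    """Map unsigned octets (0..255) onto the signed xs:byte range (-128..127)."""
    return [b - 256 if b > 127 else b for b in raw]


# Sequence representation, as an xs:byte* return type would carry it (assumed mapping).
as_bytes = to_xs_bytes(digest)

print(len(as_hex), len(as_base64), len(as_bytes))
```

The three values encode the same 32 octets but have different lengths and types, so switching the return type would be a breaking change for existing callers.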

The biggest difference between the latest spec and the eXist implementation concerns the functions for which the spec takes a map of parameters: eXist (and BaseX) do not appear to support this and instead use multi-parameter function signatures. How to reconcile this difference between the spec and the implementations is perhaps the most significant open issue.
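One possible way to reconcile the two styles is a thin adapter that accepts the spec's map and forwards to the existing positional signature. The Python sketch below illustrates only the shape of such an adapter. The map keys (`secret-key`, `algorithm`, `iv`, `provider`) are hypothetical and simply mirror eXist's parameter names; the spec's actual key names would need to be checked. `encrypt_positional` is a stand-in that echoes its arguments and performs no encryption.

```python
def encrypt_positional(data, encryption_type, secret_key, algorithm, iv=None, provider=None):
    """Stand-in for a six-argument implementation; it only echoes its arguments."""
    return (data, encryption_type, secret_key, algorithm, iv, provider)


# Hypothetical map keys, named after eXist's positional parameters.
REQUIRED_KEYS = ("secret-key", "algorithm")
OPTIONAL_KEYS = ("iv", "provider")


def encrypt_with_map(data, encryption_type, parameters=None):
    """Accept a spec-style parameter map and forward to the positional function."""
    params = dict(parameters or {})
    missing = [key for key in REQUIRED_KEYS if key not in params]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    unknown = sorted(set(params) - set(REQUIRED_KEYS) - set(OPTIONAL_KEYS))
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")
    return encrypt_positional(
        data,
        encryption_type,
        params["secret-key"],
        params["algorithm"],
        params.get("iv"),
        params.get("provider"),
    )


print(encrypt_with_map("text", "symmetric", {"secret-key": "k", "algorithm": "AES"}))
```

An adapter like this would let the map-based and multi-parameter signatures coexist during a transition, at the cost of maintaining both.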

The test suite is another window into what eXist supports (alongside the function signatures). Without knowing which tests previously passed, it's hard to say which test failures represent a regression with the new 6.0.0-SNAPSHOT version. A lot of work appears to remain in investigating the failing tests.

The low-hanging fruit is the error messages, which still use a pre-1.0 draft set of error codes (seen in the BaseX documentation). Updating the error codes in the 3 tests annotated with %test:assertError should fix those 3 failures. That would leave only 10 failures.
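As a starting point for the error-code cleanup, the expected codes can be inventoried directly from the test sources. This Python sketch scans XQuery text for `%test:assertError("...")` annotations and counts each expected code. The sample test module and its codes (`ex:OLD-CODE-1`, `ex:OLD-CODE-2`) are invented placeholders, not codes from the package, the spec, or BaseX.

```python
import re
from collections import Counter

# Matches the XQSuite annotation %test:assertError("code") and captures the code.
ASSERT_ERROR = re.compile(r'%test:assertError\(\s*"([^"]*)"\s*\)')


def error_codes(xquery_source):
    """Count the error codes expected by %test:assertError annotations."""
    return Counter(ASSERT_ERROR.findall(xquery_source))


# Placeholder test module; function names and codes are hypothetical.
SAMPLE = '''
declare
    %test:assertError("ex:OLD-CODE-1")
function t:first() { () };

declare
    %test:assertEquals("ok")
function t:second() { "ok" };

declare
    %test:assertError("ex:OLD-CODE-1")
function t:third() { () };

declare
    %test:assertError("ex:OLD-CODE-2")
function t:fourth() { () };
'''

for code, count in sorted(error_codes(SAMPLE).items()):
    print(code, count)
```

Running the same function over the real test files would list every code the suite expects, which can then be checked against the codes in each version of the spec.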