erasmus-without-paper / ewp-specs-api-iias

Specifications of EWP's Interinstitutional Agreements API.
MIT License
4 stars 13 forks source link

XSLT transformation produces identical results for two different inputs #134

Closed jiripetrzelka closed 10 months ago

jiripetrzelka commented 1 year ago

Referrring to: https://github.com/erasmus-without-paper/ewp-specs-api-iias/blob/v7.0.0/resources/xsltKit/transform_version_7.xsl

First snippet:

<recommended-language-skill>
  <language>en</language>
  <cefr-level>B1</cefr-level>                   
  <subject-area>
    <isced-f-code>0314</isced-f-code>
  </subject-area>                     
</recommended-language-skill>

Second snippet:

<recommended-language-skill>
  <language>en</language>
  <cefr-level>B1</cefr-level>                                      
</recommended-language-skill>

<subject-area>
  <isced-f-code>0314</isced-f-code>
</subject-area> 

Both snippets produce _en__B1__0314_


First snippet:

<mobilities-per-year not-yet-defined="true">2</mobilities-per-year>
<total-months-per-year>5</total-months-per-year>

Second snippet: <mobilities-per-year>5</mobilities-per-year>

Both produce _5_

This should not be the case.

mkurzydlowski commented 1 year ago

You are right! We will need to update the XSLT.

demilatof commented 12 months ago

@jiripetrzelka You're right, these cases were not noticed and the XSLT was a preliminary version. This XSLT (for version 7) should do the trick, adding the terminated-as-a-whole and introducing the node/attribute names in the text to hash up to three levels above (this should be enough). The version 6 can be adjusted in a similar way.

Edit: I updated this message with the XSLT that should solve even the problem of sub-elements of a not-yet-defined element.

Obviously I would really appreciate if someone else will participate in writing corrections or in rewriting of a better XSLT.

Please check the following proposal of a new XSLT version:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:func="http://exslt.org/functions"
                xmlns:ewp="http://example.org"
                exclude-result-prefixes="ewp"
                extension-element-prefixes="func"
>

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Enclose the value of an element between two markers (_ and _) -->
  <xsl:function name="ewp:dumpElement">
    <xsl:param name="elValue"/>
    <xsl:param name="elName"/>
    <xsl:value-of select="concat('_', $elName, '=', $elValue,'_')"/>
  </xsl:function>

  <!-- Enclose the value of an attribute between two markers (_@ and @_) -->
  <xsl:function name="ewp:dumpAttribute">
    <xsl:param name="attValue"/>
    <xsl:param name="attName"/>
    <xsl:value-of select="concat('_@', $attName, '=', $attValue,'@_')"/>
  </xsl:function>

  <!-- Return the terminated-as-a-whole value only if its value is true -->
  <xsl:function name="ewp:dumpTerminated">
    <xsl:param name="attValue"/>
    <xsl:if test="$attValue and ($attValue='true' or $attValue='1')">_@terminated-as-a-whole@_</xsl:if>
  </xsl:function>

  <xsl:template match="/">

    <xsl:for-each select="//*[local-name()='iia']">
      <iia>
        <iia-id>
          <xsl:value-of select="*[local-name()='partner'][1]/*[local-name()='iia-id']"/>
        </iia-id>

        <text-to-hash>
          <xsl:value-of select="ewp:dumpTerminated(//*[local-name()='cooperation-conditions']/@*[local-name()='terminated-as-a-whole'])"/>
          <xsl:for-each select="*[local-name()='partner']">
            <xsl:value-of select="ewp:dumpElement(*[local-name()='iia-id'], concat('iia-id', '_', position()))"/>
          </xsl:for-each>

          <xsl:for-each select="*[local-name()='cooperation-conditions']/*"> <!-- for each mobilities -->
            <xsl:for-each select=".//*">  <!-- iterate over the inside elements -->
              <!-- but excluding receiving/sending contacts (and their children) and receiving-academic-year-id -->
              <xsl:variable name="notYetDefinedFound" select="count(ancestor-or-self::*[local-name()][@not-yet-defined='true' or @not-yet-defined='1'])" />
              <xsl:if test="local-name() != 'sending-contact' and 
                                local-name() != 'receiving-contact' and
                                local-name(..) != 'sending-contact' and 
                                local-name(..) != 'receiving-contact' and
                                local-name() != 'receiving-first-academic-year-id' and
                                local-name() != 'receiving-last-academic-year-id' and
                                $notYetDefinedFound =0 "
              > 
                <xsl:for-each select="@* except @not-yet-defined"> <!-- the value of all the attributes -->
                    <xsl:value-of select="ewp:dumpAttribute(., concat(local-name(../../..), '.',
                                            local-name(../..), '.', 
                                            local-name(..), '.', 
                                            local-name(.)))"/>
                </xsl:for-each>
                <xsl:if test="count(*)=0"> <!-- process the element only if it has no children -->                  
                  <xsl:value-of select="ewp:dumpElement(., concat(local-name(../..), '.',
                                    local-name(..), '.', 
                                    local-name(.)))"/> <!-- the element value -->
                </xsl:if>
              </xsl:if>
            </xsl:for-each>
            <!-- finally the first and the last academic year, as per version 7 -->
            <xsl:value-of select="ewp:dumpElement(*[local-name()='receiving-first-academic-year-id'], 'receiving-first-academic-year-id')"/>
            <xsl:value-of select="ewp:dumpElement(*[local-name()='receiving-last-academic-year-id'],'receiving-last-academic-year-id')"/>

          </xsl:for-each>
        </text-to-hash>

        <!-- If it is present at least an attribute not-yet-defined=true, the IIA is not valid for an approval -->
        <xsl:for-each select="//*[local-name()][@not-yet-defined='true']">
            <xsl:choose>
                <xsl:when test="position() = 1">
                    <valid-for-approval>false</valid-for-approval>
                </xsl:when>
            </xsl:choose>
        </xsl:for-each>

      </iia>
    </xsl:for-each>    
  </xsl:template>

</xsl:stylesheet>
mkurzydlowski commented 11 months ago

Please review proposed changes to the templates: https://github.com/erasmus-without-paper/ewp-specs-api-iias/compare/xslt-fix

jiripetrzelka commented 11 months ago

It will still not work if there is the other-format inside phone-number because the "if" only looks one level up: local-name(..) != 'sending-contact'

<iias:cooperation-conditions>
  <iias:student-studies-mobility-spec>
    <iias:sending-contact>
      <ns3:contact-name>XYZ</ns3:contact-name>
      <ns2:phone-number>
        <ns2:other-format>123456789</ns2:other-format>
      </ns2:phone-number>

Result: _sending-contact.phone-number.other-format=123456789_

mkurzydlowski commented 11 months ago

@jiripetrzelka, you are right. I will fix this.

@demilatof, can you point be to an XML attribute inside cooperation-conditions element (apart from the terminated-as-a-whole). I couldn't not find any, so maybe we might simplify the XSLT template?

PS. I will also simplify the XSLT to only handle true for boolean attributes. 1 should not be used, in my opinion, so handling it in the XSLT would be misleading.

demilatof commented 11 months ago

It will still not work if there is the other-format inside phone-number because the "if" only looks one level up: local-name(..) != 'sending-contact'

<iias:cooperation-conditions>
  <iias:student-studies-mobility-spec>
    <iias:sending-contact>
      <ns3:contact-name>XYZ</ns3:contact-name>
      <ns2:phone-number>
        <ns2:other-format>123456789</ns2:other-format>
      </ns2:phone-number>

Result: _sending-contact.phone-number.other-format=123456789_

Hi @jiripetrzelka , many thanks for your feedback; maybe that it could work replacing this block


local-name() != 'sending-contact' and 
local-name() != 'receiving-contact' and
local-name(..) != 'sending-contact' and 
local-name(..) != 'receiving-contact' and

with this one?

count(ancestor::*[local-name() = 'sending-contact'])=0 and 
count(ancestor::*[local-name() = 'receiving-contact'])=0 and

So now, we should have:


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:func="http://exslt.org/functions"
                xmlns:ewp="http://example.org"
                exclude-result-prefixes="ewp"
                extension-element-prefixes="func"
>

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Enclose the value of an element between two markers (_ and _) -->
  <xsl:function name="ewp:dumpElement">
    <xsl:param name="elValue"/>
    <xsl:param name="elName"/>
    <xsl:value-of select="concat('_', $elName, '=', $elValue,'_')"/>
  </xsl:function>

  <!-- Enclose the value of an attribute between two markers (_@ and @_) -->
  <xsl:function name="ewp:dumpAttribute">
    <xsl:param name="attValue"/>
    <xsl:param name="attName"/>
    <xsl:value-of select="concat('_@', $attName, '=', $attValue,'@_')"/>
  </xsl:function>

  <!-- Return the terminated-as-a-whole value only if its value is true -->
  <xsl:function name="ewp:dumpTerminated">
    <xsl:param name="attValue"/>
    <xsl:if test="$attValue and ($attValue='true' or $attValue='1')">_@terminated-as-a-whole@_</xsl:if>
  </xsl:function>

  <xsl:template match="/">

    <xsl:for-each select="//*[local-name()='iia']">
      <iia>
        <iia-id>
          <xsl:value-of select="*[local-name()='partner'][1]/*[local-name()='iia-id']"/>
        </iia-id>

        <text-to-hash>
          <xsl:value-of select="ewp:dumpTerminated(//*[local-name()='cooperation-conditions']/@*[local-name()='terminated-as-a-whole'])"/>
          <xsl:for-each select="*[local-name()='partner']">
            <xsl:value-of select="ewp:dumpElement(*[local-name()='iia-id'], concat('iia-id', '_', position()))"/>
          </xsl:for-each>

          <xsl:for-each select="*[local-name()='cooperation-conditions']/*"> <!-- for each mobilities -->
            <xsl:for-each select=".//*">  <!-- iterate over the inside elements -->
              <!-- but excluding receiving/sending contacts (and their children) and receiving-academic-year-id -->
              <xsl:variable name="notYetDefinedFound" select="count(ancestor-or-self::*[local-name()][@not-yet-defined='true' or @not-yet-defined='1'])" />
              <xsl:if test="count(ancestor::*[local-name() = 'sending-contact'])=0 and 
                                count(ancestor::*[local-name() = 'receiving-contact'])=0 and
                                local-name() != 'receiving-first-academic-year-id' and
                                local-name() != 'receiving-last-academic-year-id' and
                                $notYetDefinedFound =0 "
              > 
                <xsl:for-each select="@* except @not-yet-defined"> <!-- the value of all the attributes -->
                    <xsl:value-of select="ewp:dumpAttribute(., concat(local-name(../../..), '.',
                                            local-name(../..), '.', 
                                            local-name(..), '.', 
                                            local-name(.)))"/>
                </xsl:for-each>
                <xsl:if test="count(*)=0"> <!-- process the element only if it has no children -->                  
                  <xsl:value-of select="ewp:dumpElement(., concat(local-name(../..), '.',
                                    local-name(..), '.', 
                                    local-name(.)))"/> <!-- the element value -->
                </xsl:if>
              </xsl:if>
            </xsl:for-each>
            <!-- finally the first and the last academic year, as per version 7 -->
            <xsl:value-of select="ewp:dumpElement(*[local-name()='receiving-first-academic-year-id'], 'receiving-first-academic-year-id')"/>
            <xsl:value-of select="ewp:dumpElement(*[local-name()='receiving-last-academic-year-id'],'receiving-last-academic-year-id')"/>

          </xsl:for-each>
        </text-to-hash>

        <!-- If it is present at least an attribute not-yet-defined=true, the IIA is not valid for an approval -->
        <xsl:for-each select="//*[local-name()][@not-yet-defined='true']">
            <xsl:choose>
                <xsl:when test="position() = 1">
                    <valid-for-approval>false</valid-for-approval>
                </xsl:when>
            </xsl:choose>
        </xsl:for-each>

      </iia>
    </xsl:for-each>    
  </xsl:template>

</xsl:stylesheet>
demilatof commented 11 months ago

@jiripetrzelka, you are right. I will fix this.

@demilatof, can you point be to an XML attribute inside cooperation-conditions element (apart from the terminated-as-a-whole). I couldn't not find any, so maybe we might simplify the XSLT template?

I don't know why I've not submitted yesterday the above answer. Maybe it is the solution are you looking for? Could you better explain what do you mean? I'm not sure I've understood well.

PS. I will also simplify the XSLT to only handle true for boolean attributes. 1 should not be used, in my opinion, so handling it in the XSLT would be misleading.

My first release didn't consider '1' but I read that terminated-as-a-whole, for example, is declared as "xs:boolean". And by W3C specification:

Note: Legal values for boolean are true, false, 1 (which indicates true), and 0 (which indicates false).

Therefore, it would be wrong not consider 1 as legal value for "true". I use "true" in my code too, but to respect the XML specification we have to accept that someone could use '1'.

mkurzydlowski commented 11 months ago

Could you better explain what do you mean? I'm not sure I've understood well.

In short I'm asking if this fragment is needed:

<xsl:for-each select="@* except @not-yet-defined"> <!-- the value of all the attributes -->
    <xsl:value-of select="ewp:dumpAttribute(., concat(local-name(../../..), '.',
          local-name(../..), '.', 
          local-name(..), '.', 
          local-name(.)))"/>
</xsl:for-each>

Are there attributes in the cooperation-conditions element that are used in the hash?

Note: Legal values for boolean are true, false, 1 (which indicates true), and 0 (which indicates false).

Therefore, it would be wrong not consider 1 as legal value for "true". I use "true" in my code too, but to respect the XML specification we have to accept that someone could use '1'.

You are right, that we are not entitled to exclude '1' and '0'. Thanks for pointing this out!

https://www.w3.org/TR/xmlschema-2/#boolean

demilatof commented 11 months ago

In short I'm asking if this fragment is needed:

<xsl:for-each select="@* except @not-yet-defined"> <!-- the value of all the attributes -->
    <xsl:value-of select="ewp:dumpAttribute(., concat(local-name(../../..), '.',
          local-name(../..), '.', 
          local-name(..), '.', 
          local-name(.)))"/>
</xsl:for-each>

Are there attributes in the cooperation-conditions element that are used in the hash?

Generally speaking, it's better to have this fragment in case of an optional attribute will be added in future. Currently, I see we have v6-value for ISCED. I'm not sure that it's useful, and could even trigger a re-approval with this XSLT. But to remove completely this fragment you should be sure that we don't use any other attribute than "not-yet-defined" and "v6-value" both now and in the future, at least for v7. Even in complex type, elsewhere defined and used inside get-response.xsd

As concern v6-value, do we really need it? I know if my partner had previously a two digits ISCED code and as I understood this is a not acceptable value even if there is a valid mutual approval. This should be the only case where a mutual approval doesn't overcome an obsolete data.

Therefore, before migrating to v7, all the HEIs that use a 2 digits ISCED should correct it to full fill the v7 specifications. And a new approval should be triggered because the XSLT v7 sees a 4 digits ISCED whilst XSLT v6 sees a 2 digits ISCED. Am I wrong?

mkurzydlowski commented 11 months ago

OK. We might leave this attribute fragment. Please review the updated proposal: https://github.com/erasmus-without-paper/ewp-specs-api-iias/compare/xslt-fix

demilatof commented 11 months ago

I'll check it later. Please consider this scenario with v6-value:

  1. One partner (A) sends IIA with the new ISCED=0912 and v6-value=09 (old ISCED code); v6-value is included in hash code
  2. After that the two partners (A and B) mutually approve the IIA, v6-value must be removed from the IIA
  3. B fetches IIA from A, it's the same but now it has no more the v6-value; the hash code is changed and the IIA should be re-approved. But nothing has been really changed.
mkurzydlowski commented 11 months ago

Step 2 - removal of v6-value and mutual approval should only take place if the IIA has changed after version 6:

This attribute:
   * MUST be set only for mutually approved agreements that
     have been approved in the IIA version 6 and have not been modified since,
   * MUST be set only for agreements that had ISCED code with less than 4 digits,
   * MUST NOT be set otherwise.
demilatof commented 11 months ago

Step 2 - removal of v6-value and mutual approval should only take place if the IIA has changed after version 6:

Yes, this is the problem (not mine, we have been using a 4 digits code for years). I've just seen that you have extended ISCED, but I need an example of how we have to use that attribute inside the subject area, because if I write:

<subject-area>
   <isced-f-code v6-value="03">0314</isced-f-code>
</subject-area>

it's valid, whilst if I write

<subject-area>
  <isced-f-code v6-value="03">03</isced-f-code>
</subject-area>

it's invalid.

mkurzydlowski commented 11 months ago

What example do you want me to provide? The first fragment is OK, as you already stated and the second one is not valid.

demilatof commented 11 months ago

What example do you want me to provide? The first fragment is OK, as you already stated and the second one is not valid.

Well, it's as I understood, then. What I don't understand is

This attribute:

  • MUST be set only for mutually approved agreements that have been approved in the IIA version 6 and have not been modified since

If I fetch an IIA v7, the ISCED code MUST be modified to be compliant; so my partner has or not to use v6-value?

mkurzydlowski commented 11 months ago

The IIA is not modified from the business perspective and that is what this sentence was referring to. Of course, technically IIA v7 XML will be different than IIA v6 XML, but this is not what the "modified" word is referring to.

demilatof commented 11 months ago

The IIA is not modified from the business perspective and that is what this sentence was referring to. Of course, technically IIA v7 XML will be different than IIA v6 XML, but this is not what the "modified" word is referring to.

I think it should be better explained. Anyway, what does it mean from a practical point of view? Should I fill the isced-code with whatever value to have a 4 digit code, add v6-value attribute (with the old 2 digit code) and then exchange it?

mkurzydlowski commented 11 months ago

Exactly so.

demilatof commented 11 months ago

Exactly so.

Ok, then we can come back to my observation.

The XSLT doesn't consider the v6-value you have introduced, therefore the XSLT v6 produces a text to be hashed different from the one produced by XSLT v7, because:

We have to view the problem from the technical perspective more than from Business perspective. So what should have to do? The XSLT v7 finds the v6-value, ignore the attribute and use this value instead the real 4 digit ISCED code, so that it can be compared with the old one? And then flag as "not ready for the approval", because there is still the v6-value attribute?

Or we can approve an amendment even if there is the v6-value attribute? Because in this case we approve an IIA that has a hash code different from the next IIA that is identical but without the v6-value because approved at v7 level.

In my opinion, the v6-value only over complicates the problem, given that a 2 digit code is invalid and cannot be kept as it is.

mkurzydlowski commented 11 months ago

Exactly so.

Ok, then we can come back to my observation.

But I already answered that step 2 is wrong: https://github.com/erasmus-without-paper/ewp-specs-api-iias/issues/134#issuecomment-1738949888

The XSLT doesn't consider the v6-value you have introduced

What do you mean? It is used in XSLT v7.

therefore the XSLT v6 produces a text to be hashed different from the one produced by XSLT v7, because:

It doesn't, both XSLT v6 and XSLT v7 use the same ISCED code to calculate the hash.

  • in IIA v6 we have a 2 digit ISCED code whilst in IIA v7 we have a 4 digit ISCED code
  • in IIA v7 we have an extra attribute (v6-value) that is not present in IIA v6

That's all true but the XSLT manages this and calculates the same hash.

The XSLT v7 finds the v6-value, ignore the attribute and use this value instead the real 4 digit ISCED code, so that it can be compared with the old one? And then flag as "not ready for the approval", because there is still the v6-value attribute?

That's true.

Or we can approve an amendment even if there is the v6-value attribute? Because in this case we approve an IIA that has a hash code different from the next IIA that is identical but without the v6-value because approved at v7 level.

I'm lost here, but you need to approve an IIA v7 only if it is a new IIA or a modified IIA. In both cases it shoudln't contain v6-value.

demilatof commented 11 months ago

The XSLT doesn't consider the v6-value you have introduced

What do you mean? It is used in XSLT v7.

I mean that the v6-value is an attribute used in IIA v7 that isn't explicitly managed by XSLT.

It doesn't, both XSLT v6 and XSLT v7 use the same ISCED code to calculate the hash.

I don't think so... Once we have v7 in PROD, v6 API is switched off and we have to work with the snapshot, that has a 2 digit code and that we cannot edit. We apply the XSLT v6 to the snapshot (that has a 2 digit code) and we compare it with the result of the XSLT v7 applied to the IIA v7 received by the partner, to check if the hash code is changed or not. And we find that it is changed, because there is a 4 digit code, compared with the hash code we have computed for the snapshot, that has a 2 digit code.

Unless the XSLT doesn't replace in the IIA v7 the value of "isced-f-code" with the value of the attribute v6-value and then compute the text to be hashed. Only in this case, we can have the same hash.

The XSLT v7 finds the v6-value, ignore the attribute and use this value instead the real 4 digit ISCED code, so that it can be compared with the old one? And then flag as "not ready for the approval", because there is still the v6-value attribute?

That's true.

Have you implemented this step? My XSLT version didn't take care of it yet.

I'm lost here, but you need to approve an IIA v7 only if it is a new IIA or a modified IIA. In both cases it shoudln't contain v6-value.

Ok, if it's so, then the following last fragment must be changed to include even the v6-value:

       <!-- If it is present at least an attribute not-yet-defined=true, the IIA is not valid for an approval -->
        <xsl:for-each select="//*[local-name()][@not-yet-defined='true']">
            <xsl:choose>
                <xsl:when test="position() = 1">
                    <valid-for-approval>false</valid-for-approval>
                </xsl:when>
            </xsl:choose>
        </xsl:for-each>
mkurzydlowski commented 11 months ago

Unless the XSLT doesn't replace in the IIA v7 the value of "isced-f-code" with the value of the attribute v6-value and then compute the text to be hashed. Only in this case, we can have the same hash.

But it does! That's what this fragment is about:

  <!-- Correct the isced with its old value if the attribute v6-value is present-->
  <xsl:function name="func:iscedPatch">

Ok, if it's so, then the following last fragment must be changed to include even the v6-value:

       <!-- If it is present at least an attribute not-yet-defined=true, the IIA is not valid for an approval -->
        <xsl:for-each select="//*[local-name()][@not-yet-defined='true']">
          <xsl:choose>
              <xsl:when test="position() = 1">
                  <valid-for-approval>false</valid-for-approval>
              </xsl:when>
          </xsl:choose>
      </xsl:for-each>

That's true!

demilatof commented 11 months ago

Unless the XSLT doesn't replace in the IIA v7 the value of "isced-f-code" with the value of the attribute v6-value and then compute the text to be hashed. Only in this case, we can have the same hash.

But it does! That's what this fragment is about:

  <!-- Correct the isced with its old value if the attribute v6-value is present-->
  <xsl:function name="func:iscedPatch">

I apologize, too many projects at the same time, I forgot it; I was looking at the XSLT I posted above and I forgot the other one. And I wrote it by myself... shame on me, sorry :disappointed:

Then the value is correct, just update the last fragment, as you agree with:

Ok, if it's so, then the following last fragment must be changed to include even the v6-value:

       <!-- If it is present at least an attribute not-yet-defined=true, the IIA is not valid for an approval -->
        <xsl:for-each select="//*[local-name()][@not-yet-defined='true']">
            <xsl:choose>
                <xsl:when test="position() = 1">
                    <valid-for-approval>false</valid-for-approval>
                </xsl:when>
            </xsl:choose>
        </xsl:for-each>

That's true!

As soon as I've a moment, I rewrite the last fragment

mkurzydlowski commented 11 months ago

@demilatof, please tak a look at the GitHub version of the XSLT or even better at the updated one. The v6-value is already taken into account.

I suppose that the XSLT you are working on is based on an older version of the template.

janinamincer-daszkiewicz commented 10 months ago

https://github.com/erasmus-without-paper/ewp-specs-api-iias/compare/xslt-fix