gruninger / Common-Logic

Documents for the development of ISO 24707 Edition 2 (Common Logic)

Correction and completion of the XML syntax in 24707 Annex C #22

Open greenTara opened 11 years ago

greenTara commented 11 years ago

The XCL entries of the CL Defect Report [1] http://cl.tamu.edu/docs/cl/24707-defect-report.pdf give a very incomplete picture of the problems with this first version of XCL (XCL1).

The additional serious XCL1 defects may be divided into two groups:

A. serious defects in the schema design

B. serious defects in the syntax design

A. The XCL1 DTD as published in the CL specification is not modular, and so does not distinguish some subset of the syntax which could, in principle, be exactly semantically conformant. All schema languages allow such modularization, and we propose that the revised XCL schema be divided into multiple parts, a core schema and one or more standard extension modules. This is easily accomplished with Relax NG [2] http://www.ccil.org/%7Ecowan/relaxng.pdf, and is also possible in XSD and DTD.

There is in actuality no subset of XCL1 that is exactly semantically conformant, so additional modifications, as described in part B, will be necessary to create a core XCL2 syntax that is exactly semantically conformant.

There are certain features, such as typed literals, that are easily implemented (syntactically) in XML and that would be useful to most applications, but are not (yet) part of the abstract CL syntax. These features would be implemented in the standard extension modules (with semantics of typed literals according to the outcome of the parallel discussion on this topic.)
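As a rough sketch of the proposed modularization, a Relax NG (compact syntax) extension module could include a core schema and add one alternative to a pattern defined there. The file name `xcl2-core.rnc`, the pattern names `Term` and `TypedLiteral`, and the element and attribute names are assumptions made for illustration; none of them is taken from the CL specification.

```rnc
# xcl2-typed-literals.rnc: hypothetical extension module.
# Assumes a core schema file xcl2-core.rnc that defines a pattern named Term.
include "xcl2-core.rnc"

# Add one more alternative to the core Term pattern without editing the core.
Term |= TypedLiteral

# Illustrative element only; the semantics of typed literals are left to the
# parallel discussion on that topic.
TypedLiteral =
  element literal {
    attribute datatype { xsd:anyURI },
    text
  }
```

An application that needs only the exactly conformant subset would validate against the core file alone.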

Note: Let us emphasize the difference between such standard extensions of the XCL syntax and XCL extensions external to the CL standard that might introduce, e.g., non-monotonic logic.

B. We found a basic flaw in the design of XCL1: it introduces elements for higher-level syntactic categories that should instead be non-terminals that do not appear in XML-instance serializations (such 'invisible' non-terminals are implemented as entities in DTD, patterns in Relax NG and groups in XSD). This flaw is repeated in multiple places (`<text>`, `<phrase>`, `<term>`) and causes multiple problems.

<text> - This element serves as a root element, and also for the syntactic categories of named text and unnamed text. (We don't distinguish commented text as a syntactic category, as just about anything can have an attached comment.) The problems caused by this overloading of `<text>` are:

  1. Coupling of logical and physical structure. The root element carries semantics, identifying a text, rather than being purely a syntactic wrapper. Because only one root element is allowed in an XML file, this means a file is limited to holding one text. This restriction causes unnecessary coupling between the logical structure of texts and the physical structure of files, whereas it may be desirable to have multiple texts in a file. It would be better to have a purely syntactic root element, and let the elements that represent texts be children of this root.
  2. Lack of Conformance. Named texts and unnamed texts have different syntactic patterns in CL. Named texts cannot be nested within any other element, while unnamed texts can be nested within other texts and modules. For conformance, it is therefore necessary to have two different syntactic categories for texts, and it would be best to indicate these categories explicitly by the name of their elements. In XCL1, nesting of texts within other texts and within modules is not implemented at all, and so XCL1 is not exactly conformant as a CL dialect.
  3. Lack of Extensibility - languages which extend XCL to include things such as performatives for knowledge base management need clear syntactic categories at the upper level for encapsulating sets of phrases, which are not provided by XCL1.
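The separation argued for in this list might look roughly like the following Relax NG compact sketch. Every element name here (`xcl`, `namedText`, `text`, `sentence`) is a placeholder chosen for illustration, not a proposal from this report, and the content of `sentence` is deliberately left as plain text.

```rnc
start = Root

# Purely syntactic wrapper: it carries no CL semantics and may hold several texts.
Root = element xcl { (NamedText | UnnamedText)* }

# A named text is reachable only from the root, so it cannot be nested.
NamedText = element namedText { attribute name { text }, Phrase* }

# An unnamed text is one of the Phrase alternatives, so it can nest.
UnnamedText = element text { Phrase* }

Phrase = UnnamedText | Sentence

# Placeholder for the real sentence syntax.
Sentence = element sentence { text }
```

Because `NamedText` is not among the `Phrase` alternatives, a validator rejects a named text nested inside another text, while a file with several sibling texts under the root is accepted.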

<phrase> - This element is completely unnecessary. Instead, "phrase" should be a (serialization-)invisible non-terminal which is a choice among the non-terminals for sentence, unnamed text, module and importation. Problems caused by this include:

  1. Redundancy - Unnecessary elements make the syntax more verbose and cumbersome than necessary.
  2. Lack of Conformance. In XCL1, the content model of `<phrase>` is a sentence with optional comments. This is not in agreement with the CL abstract syntax, where a phrase can be other things besides a sentence, and there is no "commented phrase" specified in the abstract syntax.
  3. Lack of Extensibility - Extensions may want to introduce other kinds of phrases, e.g. defeasible rules, reaction rules, metalogical statements. Without a "phrase" non-terminal, it is not possible to extend the schema in this way.
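A minimal Relax NG compact sketch of "phrase" as an invisible non-terminal is given here. It assumes placeholder element names (`xcl`, `sentence`, `text`, `module`, `import`) and greatly simplified content models; only the shape of the `Phrase` pattern is the point.

```rnc
start = element xcl { Phrase* }

# "Phrase" is a named pattern, not an element: it never appears in instances.
Phrase = Sentence | UnnamedText | Module | Importation

# Simplified placeholder content models.
Sentence    = element sentence { text }
UnnamedText = element text { Phrase* }
Module      = element module { attribute name { text }, Phrase* }
Importation = element import { attribute href { xsd:anyURI } }

# An extension module could add a new kind of phrase with a single line, e.g.
#   Phrase |= DefeasibleRule
# where DefeasibleRule is a hypothetical pattern defined by that extension.
```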

<term> - In XCL1, this element is used to denote names of all kinds and also functional terms. Instead, "term" should be an invisible non-terminal which is a choice among non-terminals for the syntactic categories name and functional term. The syntactic category "name" could be further expanded in extensions to cover both interpretable and fixed-interpretation names. Problems caused by this overloading of `<term>` include:

  1. Schema Inadequacy - the exclusion list of a module should contain only a list of names, but in XCL1 may contain functional terms, e.g.

     ``` xml
     ...
     ```

  2. Syntactic Disconnect - for quantifier bindings, a new element was invented to avoid repeating the Schema Inadequacy problem of the previous item, that is, allowing a quantifier binding to be a functional term. This creates a syntactic disconnect between the binding and the terms in the quantified sentence. For example, we must make "x" both a binding element and a `<term>` when we state

     ``` xml
     ```

  3. Semantic Ambiguity - Few schema languages allow a choice between attributes and content (Relax NG is one, but not XSD or DTD), so in order for `<term>` to serve double-duty for names and functional terms, it is allowed to have _both_ a name attribute and the child element structure of functional terms. This introduces semantic ambiguity: when both occur, which takes precedence?

     ``` xml
     ```

  4. Lack of Extensibility - Fixed-interpretation names should not be allowed in quantifier bindings, and there is no clear mechanism for extending the syntax to implement this properly. Other sorts of terms may be desired (IKL's "that", reifications, frame terms). A proper implementation would give these terms their own elements, and add them to the choices for the invisible "term" non-terminal. Because there is no invisible "term" non-terminal in XCL1, this method of extension is not available. In comparison, the invisible "sentence" non-terminal introduces extensibility, so extensions can increase the choices for "sentence" to include, say, weak negation, modal sentences, fuzzy sentences, ...

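One way the four problems above could be avoided is sketched here in Relax NG compact syntax. The element names (`name`, `function`, `module`, `exclude`, `forall`, `bindings`, `atom`) and the simplified content models are assumptions for illustration only; sequence markers and comments are omitted.

```rnc
start = Module

# "Term" is a pattern, so no wrapper element for it appears in instances.
Term = Name | FunctionalTerm

# A name is always an element with text content: no attribute/content overlap.
Name = element name { text }

# Operator followed by zero or more arguments (nullary functions allowed).
FunctionalTerm = element function { Term, Term* }

# The exclusion list accepts names only; a functional term there is invalid.
Module =
  element module {
    Name,
    element exclude { Name* }?,
    Quantified*
  }

# Bindings reuse Name, so a bound "x" and its occurrences share one element.
Quantified = element forall { element bindings { Name+ }, Atom }

# Predicate followed by zero or more arguments.
Atom = element atom { Term, Term* }
```

An extension that adds another kind of term would define its own element and combine it into the choice with `Term |= ...`, leaving `Name` as the only thing allowed in bindings and exclusion lists.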
To correct these and other defects, modifications including, but not limited to, the following are proposed:

Elements to be deleted from the core XCL syntax without replacement

- ``
- ``
- ``
- ``
- ``
- ``
- ``

Attributes to be deleted from the core XCL syntax without replacement

- `@dialect` (may be implemented in external extensions, but processors would need to validate embedded syntax somehow)
- `@logicalFormOf` (may be implemented in external extensions)
- `@syntaxType` (may be implemented in external extensions)
- `@xml:id` (removes the ambiguity of the module name, also allows external extensions to add an attribute of type `xs:ID` with the name and semantics of their choice, including the convention whereby the value is used to construct an IRI. Note: contrary to the statement in Annex C, it is never acceptable for a base IRI to be "arbitrary". If relative references are used, then the base IRI should be explicitly specified. Otherwise the text violates the Common Logic "means the same everywhere on the network" principle.)

Elements that need to be split:

- ``
- ``

Elements to be added:

- a root element
- an element for sequence markers

Content model changes

- unless the abstract syntax is modified to allow multiple comments, restrict embedded comments to at most one per element, and disallow them on names except when they appear as terms
- content model of functional terms modified to allow nullary functions
- introduce a non-terminal for term sequence, to accommodate sequence markers
- only one exclude element allowed per module
- remove the erroneous `` choice from the "atom" non-terminal

Additional errors in the DTD

p. 54 There is inconsistency in the public identifiers declared for the DTD and the documentation.

This DTD has the following formal public identifiers:

``` dtd
"ISO/IEC 24707:2006//DTD XML Common Logic (XCL) 1.0//EN"
"-//purl.org/xcl//DTD XML Common Logic (XCL) 1.0//EN"
The DTD may be invoked by one of the following declarations:
"xcl1.dtd">
```

This allows any string for some attributes such as @href, but the attribute value should be limited to the lexical space of the IRI datatype. In Relax NG this would be corrected as follows:

``` dtd
URI.datatype = xsd:anyURI
```

Summary: It would not be feasible to make even a modest subset of the proposed changes to the specification by piecemeal corrections of the published DTD. We propose that the entire Annex C be rewritten. Nevertheless, we feel that the proposed modifications to schema and syntax design can be made without being overly burdensome to existing users for the following reasons.

- The first goal of the XCL syntax revision is to remove non-conformant constructs from the language. Existing implementations that are consistent with the CL abstract syntax should already have disallowed or ignored these.
- Further, we propose adding some constructs that were omitted in XCL1 in order to bring XCL2 up to full conformance. If existing implementations do not want to take advantage of these new constructs, they can use a similar subset of the syntax that they are using now by making simple deletions from the schema.