Closed fcbr closed 5 years ago
@fcbr said:
From http://www-ksl.stanford.edu/knowledge-sharing/papers/kif.ps
"KIF originated in a Lisp application and inherits its syntax from Lisp. The relationship between linear KIF and structured KIF is most easily specified by appeal to the Common Lisp reader. In particular a string of ascii characters forms a legal expression in linear KIF if and only if it is acceptable to the Common Lisp reader as defined in Steele's book and the structure produced by the Common Lisp reader is a legal expression of structured KIF as defined in the next section".
^ that to me indicates that KIF should be case-insensitive (as the default Lisp reader converts everything to upper case). If SUO-KIF is a variant of KIF, there's no reason why it should deviate from the same syntax convention...
But page 17 of https://github.com/ontologyportal/sigmakee/blob/master/suo-kif.pdf says:
SUO-KIF is intended as a language for knowledge authoring, unlike the original KIF, which was intended primarily as a language for knowledge interchange.
I believe our transformation must check for clashes between symbols when it runs in a case-insensitive mode, as in SBCL. If we are only losing legibility, that is acceptable for now and we can close #10. Later we can make the code more robust by adding an option to preserve the case of symbols.
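The case folding discussed above can be checked directly at the REPL. This is a minimal sketch, assuming an ANSI-mode implementation such as SBCL; the helper name read-with-standard-readtable is made up for this illustration and is not part of cl-krr.

```lisp
;; Hypothetical helper: read one form with a fresh copy of the standard
;; readtable, whose readtable-case is :UPCASE.
(defun read-with-standard-readtable (string)
  "Read one form from STRING using a copy of the standard readtable."
  (let ((*readtable* (copy-readtable nil)))
    (read-from-string string)))

;; Unescaped symbol names are upcased by the reader ...
(symbol-name (read-with-standard-readtable "musicGenre"))   ; => "MUSICGENRE"

;; ... so spellings that differ only in case denote the same symbol.
(eq (read-with-standard-readtable "musicGenre")
    (read-with-standard-readtable "MusicGenre"))            ; => T
```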
The following list of axioms shows that we are not only losing legibility.
domainEnglishFormat.kif
1935 (termFormat EnglishLanguage Attorney "attorney")
Law.kif
480 (termFormat EnglishLanguage attorney "attorney")
----
domainEnglishFormat.kif
2436 (termFormat EnglishLanguage Broker "broker")
UXExperimentalTerms.kif
1887 (termFormat EnglishLanguage broker "broker")
---
domainEnglishFormat.kif
3123 (termFormat EnglishLanguage Composer "composer")
Music.kif
139 (termFormat EnglishLanguage composer "composer")
---
Music.kif
65 (termFormat EnglishLanguage Discography "discography")
232 (termFormat EnglishLanguage discography "discography")
---
domainEnglishFormat.kif
5723 (termFormat EnglishLanguage Judge "judge")
Law.kif
233 (termFormat EnglishLanguage judge "judge")
----
Music.kif
348 (termFormat EnglishLanguage musicGenre "music genre")
511 (termFormat EnglishLanguage MusicGenre "music genre")
---
domainEnglishFormat.kif
6988 (termFormat EnglishLanguage Musician "musician")
Music.kif
189 (termFormat EnglishLanguage musician "musician")
I generated this list by modifying the read-kif function so that it does not remove duplicates, and then checked for duplicates using get-duplicates. After that, I compared the lists of duplicates that mlisp and sbcl produce.
(defun read-kif (files)
  (let ((res nil))
    (dolist (file files)
      (with-open-file (kb file)
        (do ((st (read kb nil nil)
                 (read kb nil nil)))
            ((null st) res)
          (push st res))))
    res))

(defun get-duplicates (list &optional test)
  (let ((ht (make-hash-table :test (or test #'equal)))
        ret)
    (dolist (x list)
      (incf (gethash x ht 0)))
    (maphash (lambda (key value)
               (when (> value 1)
                 (push key ret)))
             ht)
    ret))

(get-duplicates (read-kif *sumo*))
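A more direct check than comparing the mlisp and sbcl duplicate lists would be to group symbol names that are equal up to case. The following is only a sketch of that idea, not the method used above. It assumes the forms were read with their case preserved (for example under mlisp); map-symbols and case-clashes are hypothetical names.

```lisp
(defun map-symbols (fn tree)
  "Call FN on every non-NIL symbol that occurs anywhere in TREE."
  (cond ((consp tree)
         (map-symbols fn (car tree))
         (map-symbols fn (cdr tree)))
        ((and tree (symbolp tree))
         (funcall fn tree))))

(defun case-clashes (forms)
  "Return groups of distinct symbol names in FORMS that are equal ignoring case."
  ;; An EQUALP hash table compares string keys case-insensitively.
  (let ((groups (make-hash-table :test #'equalp)))
    (dolist (form forms)
      (map-symbols (lambda (sym)
                     (pushnew (symbol-name sym)
                              (gethash (symbol-name sym) groups)
                              :test #'string=))
                   form))
    ;; Keep only the groups that contain more than one spelling.
    (loop for names being the hash-values of groups
          when (rest names)
            collect names)))

;; Pipes keep the case of the symbols in this literal example.
(case-clashes '((|instance| |musicGenre| |BinaryPredicate|)
                (|subclass| |MusicGenre| |RelationalAttribute|)))
;; returns a single group holding "musicGenre" and "MusicGenre"
```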
Nicely done! Are (termFormat ...) axioms the only cases where this happens? If so, it may not affect the TPTP output in practice, because termFormat is one of the "ignored predicates": https://github.com/own-pt/cl-krr/blob/master/suo-kif.lisp#L19-L23
No @fcbr, @hmuniz used this list to further search for all occurrences of both versions of the symbols listed in the termFormat axioms above. For instance, we found occurrences of Attorney and attorney in other axioms. The same happens for the other symbols listed above.
Ah, I spoke too soon. It looks like we might indeed have problems:
(instance musicGenre BinaryPredicate)
(subclass MusicGenre RelationalAttribute)
So musicGenre is a predicate, whereas MusicGenre is a class.
So there are a couple of options that I can think of (just thinking out loud):
musicGenre the predicate would be renamed to something like MUSICGENRE1 or MUSICGENREPREDICATE, and MusicGenre would be renamed to MUSICGENRE2 or MUSICGENRECLASS.
musicGenrePredicate and MusicGenreClass, for example.
To solve this problem I used the first option, combined with piping the required symbols, to make the code work properly.
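To make the first option and the piping concrete, here is a rough sketch. It is not the code that was actually used; clash-renames is a hypothetical helper, and the numeric-suffix scheme simply follows the MUSICGENRE1 / MUSICGENRE2 example above.

```lisp
;; Symbols written between pipes keep their case even under the default
;; upcasing reader, so these two are distinct symbols.
(eq '|musicGenre| '|MusicGenre|)   ; => NIL

;; Hypothetical helper for the renaming option.
(defun clash-renames (names)
  "Pair each string in NAMES (spellings equal up to case) with an upcased
name carrying a numeric suffix."
  (loop for name in names
        for i from 1
        collect (cons name (format nil "~A~D" (string-upcase name) i))))

(clash-renames '("musicGenre" "MusicGenre"))
;; => (("musicGenre" . "MUSICGENRE1") ("MusicGenre" . "MUSICGENRE2"))
```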
SUO-KIF is a case-sensitive language, and thus we cannot have Lisp converting all symbols to upper case. Currently, to avoid this problem we need to use mlisp, but a general solution is required.
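One candidate for a general solution that stays within standard Common Lisp is to read the KIF files with a readtable whose case mode is :preserve. The sketch below only illustrates the idea and has not been tried against cl-krr; the function names are hypothetical. Note that under this approach any code that matches on the symbols read must spell them with pipes, for example |instance|.

```lisp
(defun read-forms-preserving-case (stream)
  "Read all forms from STREAM, keeping symbol case exactly as written."
  (let ((*readtable* (copy-readtable nil))   ; do not modify the global readtable
        (eof (gensym "EOF")))                ; unique end-of-file marker
    (setf (readtable-case *readtable*) :preserve)
    (loop for form = (read stream nil eof)
          until (eq form eof)
          collect form)))

(defun read-kif-preserving-case (files)
  "Read every form from each file in FILES, in order, preserving case."
  (loop for file in files
        append (with-open-file (kb file)
                 (read-forms-preserving-case kb))))

;; The two spellings now read as different symbols.
(with-input-from-string (in "(instance musicGenre BinaryPredicate)
(subclass MusicGenre RelationalAttribute)")
  (let ((forms (read-forms-preserving-case in)))
    (eq (second (first forms)) (second (second forms)))))   ; => NIL
```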