twosixlabs / icas-ontology

The unified ICAS ontology designed to describe information-security related information

Would there be a benefit to adding keywords to the ontology? #5

Open cwacekINV opened 9 years ago

cwacekINV commented 9 years ago

I had a conversation with Dan Olsher at DO about generating keywords for elements of the ontology, and noted that we had already explored doing that, with mixed results. A sample of the keywords we were able to generate for datatype properties is included below. We determined that the results were not good enough to include directly in the ontology without massive manual curation.

{
    "http://www.invincea.com/ontologies/icas/1.0/acl#hasOrder": [
        "control",
        "within",
        "postion",
        "list",
        "number",
        "access",
        "position",
        "entry",
        "order"
    ],
    "http://www.invincea.com/ontologies/icas/1.0/acl#isRecursive": [
        "recursive",
        "grant",
        "recursion",
        "recursively",
        "object",
        "iterative",
        "acl",
        "objects",
        "listed",
        "child",
        "entries",
        "permissions"
    ],
    "http://www.invincea.com/ontologies/icas/1.0/authentication#authStatus": [
        "status",
        "success",
        "etc",
        "could",
        "auth",
        "failure",
        "authentication",
        "inprogress",
        "event"
    ],
    "http://www.invincea.com/ontologies/icas/1.0/authentication#loginName": [
        "account",
        "via",
        "name",
        "specific",
        "used",
        "identify",
        "keyboard",
        "input",
        "login",
        "string"
    ]
}
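
Several of the generated terms above ("etc", "could", "via", "within") are function words rather than domain terms, which illustrates why manual curation would be needed. A minimal sketch of an automated first pass follows, assuming the JSON shape shown above. The `STOP_WORDS` set and the `filter_keywords` helper are hypothetical names chosen for illustration and are not part of the ICAS tooling.

```python
import json

# Hypothetical stop-word list, hand-picked from the noise visible in the
# sample above; a real pass would likely need a fuller list.
STOP_WORDS = {"etc", "could", "via", "within", "used", "specific"}


def filter_keywords(keyword_map, stop_words=STOP_WORDS):
    """Return a copy of keyword_map with stop words removed from each list."""
    return {
        prop: [kw for kw in keywords if kw not in stop_words]
        for prop, keywords in keyword_map.items()
    }


# One property copied from the sample above.
sample = json.loads("""
{
    "http://www.invincea.com/ontologies/icas/1.0/authentication#authStatus": [
        "status", "success", "etc", "could", "auth", "failure",
        "authentication", "inprogress", "event"
    ]
}
""")

filtered = filter_keywords(sample)
print(json.dumps(filtered, indent=4))
```

A pass like this only strips obvious filler; misspellings such as "postion" and loosely related terms would still need review.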
cwacekINV commented 9 years ago

Here's an example of some keywords we sourced using Stack Overflow (with scores):

{
    "http://www.invincea.com/ontologies/icas/1.0/acl#isRecursive": [ [ "nonrecursive", 0.7295090876893041 ], [ "recursively", 0.6426215854350211 ], [ "recursion", 0.6051441101541714 ], [ "closure", 0.6008478541261637 ], [ "depthfirst", 0.5915859774162056 ], [ "linqlike", 0.5882240895483247 ], [ "nontail", 0.5787895768031768 ], [ "tailrecursive", 0.5715128309970288 ], [ "bfs", 0.5646340785671211 ], [ "memoize", 0.5582108036513455 ] ],
    "http://www.invincea.com/ontologies/icas/1.0/authentication#authStatus": [ [ "inprogress", 0.527129777815559 ], [ "thisstatus", 0.5168390846249955 ], [ "tempstatus", 0.5021681552141701 ], [ "progress", 0.49858897355167664 ], [ "ifstatus", 0.4943460705706234 ], [ "fullclasssplit", 0.4933231927694189 ], [ "successfailure", 0.49327649102068655 ], [ "auth", 0.4902656257118948 ], [ "authentification", 0.4853285179216026 ], [ "bbpluginjad", 0.4843660204071121 ] ],
    "http://www.invincea.com/ontologies/icas/1.0/authentication#loginName": [ [ "fullyqualified", 0.5323924586138464 ], [ "fullname", 0.516713805478136 ], [ "displayname", 0.504951959441364 ], [ "surname", 0.48377735692393825 ], [ "tblxfirst", 0.4823843065232134 ], [ "names", 0.47685891780938294 ], [ "recordcontact", 0.4755596467421599 ], [ "distinguished", 0.4752969834714968 ], [ "age", 0.4734373151068471 ], [ "readhost", 0.47313880949569287 ] ],
    "http://www.invincea.com/ontologies/icas/1.0/authentication#loginPass": [ [ "username", 0.6598800147228843 ], [ "logout", 0.6052991313537446 ], [ "pasword", 0.5994572980488897 ], [ "userpassword", 0.5923182710965996 ], [ "logon", 0.5853781992678814 ], [ "password1", 0.5818161613639692 ], [ "userpass", 0.5814719352358263 ], [ "loginpassword", 0.5662747196640523 ], [ "rootpassword", 0.565575323914818 ], [ "signin", 0.5478209678539595 ] ],
    "http://www.invincea.com/ontologies/icas/1.0/authentication#sessionId": [ [ "sessions", 0.6471611242429031 ], [ "cookie", 0.617171460412521 ], [ "httpsession", 0.5576997877336048 ], [ "requestgetsessiontrue", 0.5495014059819578 ], [ "nonunique", 0.5482390339945872 ], [ "login", 0.5449650291920627 ], [ "aspnetsessionid", 0.5425812786171548 ], [ "sessioninvalidate", 0.5405081014766302 ], [ "sessionname", 0.5398087822352003 ], [ "cookies", 0.5350782750080436 ] ],
    "http://www.invincea.com/ontologies/icas/1.0/controls#ruleID": [ [ "rules", 0.7511101260456363 ], [ "namerewrite", 0.6522362762600327 ], [ "namereverseproxyinboundrule1", 0.6335301672022533 ], [ "thumb", 0.6315910634888391 ], [ "namewordpress", 0.6112576529545717 ], [ "outboundrules", 0.6105088258298639 ], [ "catchall", 0.6103948966936892 ], [ "stopprocessingtrue", 0.6082019036131368 ], [ "urlhttpshttphostrequesturi", 0.6076795829322585 ], [ "urlrejectedbyurlscan", 0.605313779006146 ] ]}
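
Scored lists like the ones above could be trimmed with a score cutoff before any manual review. The following is a minimal sketch, assuming the shape shown (a property IRI mapped to `[keyword, score]` pairs). The `top_keywords` helper and the 0.6 threshold are illustrative assumptions, not values from the original experiment.

```python
def top_keywords(scored, min_score=0.6):
    """Keep only keywords whose score is at least min_score."""
    return {
        prop: [kw for kw, score in pairs if score >= min_score]
        for prop, pairs in scored.items()
    }


# First five entries for one property, copied from the sample above.
scored = {
    "http://www.invincea.com/ontologies/icas/1.0/acl#isRecursive": [
        ["nonrecursive", 0.7295090876893041],
        ["recursively", 0.6426215854350211],
        ["recursion", 0.6051441101541714],
        ["closure", 0.6008478541261637],
        ["depthfirst", 0.5915859774162056],
    ],
}

kept = top_keywords(scored)
print(kept)
```

With this cutoff, "closure" still passes while arguably having little to do with ACL recursion, so a threshold reduces the curation effort but does not remove it.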
cwacekINV commented 9 years ago

@danielolsher-digitaloperatives Here are a bunch of keywords for datatype properties that we generated from Stack Overflow (which incidentally introduces a bunch of spelling errors and such). https://gist.github.com/cwacekINV/6b858e76b1f97d8d2741

You mentioned that you had come up with a pretty effective algorithm for generating keywords. What was your approach?