Closed Habush closed 4 years ago
usage of NotLink in this context is correct?
It isn't; you want AbsentLink
instead. A short explanation: GetLink
has the form:
Get
<variable decls>
<clauses>
The clauses are a list of things that are meant to evaluate to true/false. The single most important clause is PresentLink
which says "everything wrapped by PresentLink must be present in the atomspace". It is so important that using it is optional! Almost everything you've ever written for MOZI has used only these implicit PresentLink
s. If you want to make sure something isn't there, use AbsentLink
. I think you can say (NotLink (PresentLink xyz))
and it will do the same thing as (Absent xyz)
If instead, you say (Not xyz)
it will assume that xyz is evaluatable, (generates a truth value) and it will take that truth value, invert it, and make a decision based off of that. It does NOT automatically wrap xyz with a PresentLink! Typically, you would use Not
with grounded predicate nodes, or other things you want to compute (greater-than, equal, etc.)
One last note: you did not need to use the AndLink
. If you just list two or more clauses after the variable decls, they are implicitly anded together -- that is, they must all be true -- they must all be "satisfied" for the pattern to match.
Oh, and after looking at your code, more carefully, there is another, completely unrelated bug in it. You wrote
(Not
(Evaluation (Predicate "from_organism")
(List
gene
(ConceptNode "TaxonomyID:2697049")
)))
This has no variables in it!!! Thus it is a "constant clause", and the pattern matcher ignores it. (because it is always the same, it doesn't specify a search, and thus can be ripped out of the list of clauses.) But even worse -- you are corrupting the dataset -- for every gen passed in, you are creating (placing in the atomspace)
(Evaluation (Predicate "from_organism")
(List
gene
(ConceptNode "TaxonomyID:2697049")
))
and so every gene will be from this taxonomy. Whoops! I think you wanted:
(Absent
(Evaluation (Predicate "from_organism")
(List
(Variable "$a")
(ConceptNode "TaxonomyID:2697049")
)))
Finally, instead of passing covid as a true-false, instead write:
(define-public (find-output-interactors gene do-protein namespace parents
coding non-coding not-organism)
...
(Get
...
(Absent
(Evaluation (Predicate "from_organism")
(List
(Variable "$a")
not-organism
)))
Multiple Absent links are allowed, so you could even allow a list of absent-organisms:
(define-public (find-output-interactors gene do-protein namespace parents
coding non-coding list-of-not-organisms)
...
(Get
...
(map (lambda (one-org)
(Absent
(Evaluation (Predicate "from_organism")
(List
(Variable "$a")
one-org
))))
list-of-not-organisms)
and so if you pass in the empty list, there aren't any AbsentLinks. (because map
of an empty list is empty). Thus, you avoid the torturous cascade of (if
statements.
One last note: you did not need to use the AndLink. If you just list two or more clauses after the variable decls, they are implicitly anded together -- that is, they must all be true -- they must all be "satisfied" for the pattern to match.
This didn't work for me for the clauses in find-output-interactors
. The following PM function is giving me an error.
(run-query (Get
(VariableList
(TypedVariable (Variable "$a") (Type 'GeneNode))
(TypedVariable (Variable "$b") (Type 'GeneNode)))
(Evaluation (Predicate "interacts_with")
(SetLink gene (Variable "$a")))
(Evaluation (Predicate "interacts_with")
(SetLink (Variable "$a") (Variable "$b")))
(Evaluation (Predicate "interacts_with")
(SetLink gene (Variable "$b")))
(map (lambda (org)
(Absent
(Evaluation (Predicate "from_organism")
(List
(Variable "$a")
(ConceptNode (string-append "TaxonomyID:" org))
)
)
)
) orgs)
))
The error:
(C++-EXCEPTION "cog-new-link" "Expecting (optional) variable decls and a body; got (GetLink\n (VariableList\n (TypedVariableLink\n (VariableNode \"$a\") ; [6bd8475ba49b36bd][1]\n (TypeNode \"GeneNode\") ; [72afdbafe12c5cff][1]\n ) ; [da13d6fe6ebcfbcc][1]\n (TypedVariableLink\n (VariableNode \"$b\") ; [6968a478572d06e7][1]\n (TypeNode \"GeneNode\") ; [72afdbafe12c5cff][1]\n ) ; [aaab32f9debad47c][1]\n ) ; [f25b1ddc676be512][1]\n (EvaluationLink\n (PredicateNode \"interacts_with\") ; [3c90bc062a581882][1]\n (SetLink\n (GeneNode \"IGF1\") ; [37b88ff182564bf8][1]\n (VariableNode \"$a\") ; [6bd8475ba49b36bd][1]\n ) ; [931eabcc0da9970d][1]\n ) ; [8cef2465d2454afe][1]\n (EvaluationLink\n (PredicateNode \"interacts_with\") ; [3c90bc062a581882][1]\n (SetLink\n (VariableNode \"$b\") ; [6968a478572d06e7][1]\n (VariableNode \"$a\") ; [6bd8475ba49b36bd][1]\n ) ; [9772dd141fcd79e1][1]\n ) ; [c416e4709baaa262][1]\n (EvaluationLink\n (PredicateNode \"interacts_with\") ; [3c90bc062a581882][1]\n (SetLink\n (GeneNode \"IGF1\") ; [37b88ff182564bf8][1]\n (VariableNode \"$b\") ; [6968a478572d06e7][1]\n ) ; [dd8135591f06e4e8][1]\n ) ; [e45b9a76e3cd312b][1]\n) ; [bb4339611ea61d7b][-1] (/tmp/atomspace-master/opencog/atoms/pattern/PatternLink.cc:154)\nFunction args:\n((VariableList\n (TypedVariableLink\n (VariableNode \"$a\")\n (TypeNode \"GeneNode\")\n )\n (TypedVariableLink\n (VariableNode \"$b\")\n (TypeNode \"GeneNode\")\n )\n)\n (EvaluationLink\n (PredicateNode \"interacts_with\")\n (SetLink\n (GeneNode \"IGF1\")\n (VariableNode \"$a\")\n )\n)\n (EvaluationLink\n (PredicateNode \"interacts_with\")\n (SetLink\n (VariableNode \"$b\")\n (VariableNode \"$a\")\n )\n)\n (EvaluationLink\n (PredicateNode \"interacts_with\")\n (SetLink\n (GeneNode \"IGF1\")\n (VariableNode \"$b\")\n )\n)\n ())")
Note: orgs
is a list of Taxonomy IDs (which are strings).
This didn't work for me
Huh. That's odd. It should have, but maybe that's only for BindLink. I will double-check and fix if needed. In the meanwhile, there is nothing wrong with using the AndLink.
Minor notes: you may want to rename orgs
to exclude-orgs
and taxonomies
to exclude-taxonoimies
this makes the code clearer.
You may want to explicitly use PresentLink
from now on. Its not really needed ... but it could make the patterns slightly easier to read.
FYI: you should know that there exists something called AlwaysLink
which is like PresentLink
, except that it requires the clause to be present in each and every result: it must "always" be present. If it's not, the search will fail (no results will be returned). If this sounds interesting but confusing, take a look at the atomspace examples directory, at the always.scm
example.
This didn't work for me
Huh. That's odd.
OK, I was wrong/mis-remembering things. What I was remembering incorrectly was this: you can write BindLinks in the form:
BindLink
<var-decls>
<body-to-match>
<rewrite-1>
...
<rewrite-n>
That is, you can trigger multiple rewrite rules in one pass. You don't have to wrap them in a ListLink
or anything like that.
... and one last minor remark: if you have only one variable, you do not need to put it in a VariableList
. Again, this can make things slightly easier to read, in a few cases.
Okay. I will update the code with your suggestions tomorrow.
@linas can you please check?
This PR adds a filter that specifies whether to include COVID-19 genes in the biogrid annotation. @linas can you please check if my usage of
NotLink
in this context is correct?