Closed Habush closed 4 years ago
This has been merged quite some time ago. Could you please add this case to the tests?
FWIW, I see 4 test failures right now, all of them in test/main-tests.scm where hash tables are unexpectedly returned.
There we zero test failures when my code was merged.
I cannot reproduce this. There's an earlier error:
scheme@(guile-user)> ,use(opencog)
scheme@(guile-user)> ,use (opencog bioscience)
scheme@(guile-user)> (primitive-load "tests/sample_dataset.scm")
$1 = (EvaluationLink
(PredicateNode "has_pubmedID")
(ListLink
(EvaluationLink
(PredicateNode "interacts_with")
(ListLink
(GeneNode "ALDOA")
(GeneNode "ACTN4")
)
)
(ListLink
(ConceptNode "https://www.ncbi.nlm.nih.gov/pubmed/?term=26344197")
)
)
)
scheme@(guile-user)> ,use (annotation main)
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /gnu/store/v59226wpnd7q22i1x8491b0hhj0zny58-profile/share/guile/site/2.2/opencog/exec.scm
;;; compiled /home/rekado/.cache/guile/ccache/2.2-LE-8-3.A/gnu/store/69ah320q8sbivnzxnf8sim9wv1g10v4k-atomspace-5.0.3-1.86c848d/share/guile/site/2.2/opencog/exec.scm.go
scheme@(guile-user)> (annotate-genes (list "IGF1") "test-res" "[{\"function_name\": \"gene-pathway-annotation\", \"filters\": [{\"filter\": \"pathway\", \"value\": \"smpdb reactome\"},{\"filter\": \"include_prot\", \"value\": \"True\"}, {\"filter\": \"include_sm\", \"value\": \"False\"},{\"filter\": \"coding\", \"value\": \"True\"},{\"filter\": \"noncoding\", \"value\": \"True\"}, {\"filter\": \"biogrid\", \"value\": \"0\"}]}, {\"function_name\": \"gene-go-annotation\", \"filters\": [{\"filter\": \"namespace\", \"value\": \"biological_process cellular_component molecular_function\"}, {\"filter\": \"parents\", \"value\": \"0\"}, {\"filter\": \"protein\", \"value\": \"True\"}]}, {\"function_name\": \"include-rna\", \"filters\": [{\"filter\": \"coding\", \"value\": \"True\"},{\"filter\": \"noncoding\", \"value\": \"True\"},{\"filter\": \"protein\", \"value\": \"1\"}]},{\"function_name\": \"biogrid-interaction-annotation\", \"filters\": [{\"filter\": \"interaction\", \"value\": \"Proteins\"}]}]")
ERROR: In procedure scm-error:
vector-map: expected vector, got (#<hash-table 7ff9da35da60 2/31> #<hash-table 7ff9da3651a0 2/31> #<hash-table 7ff9da36ca00 2/31> #<hash-table 7ff9da374360 2/31>)
Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]> ,q
scheme@(guile-user)>
I don't know if I did this right. Would be better to have this as an executable test.
@rekado Are you using the upstream guile-json
library?
yes, of course. I'm doing this in a Guix environment.
It is the only reason that I can think of. B/c the earlier version of guile-json uses hash-table
and the latest one uses vector
.
ah, you switched from version 1.x to the latest version. The guix.scm
file still refers to version 1.
Same for the Dockerfile. It still checks out version 1.2 of Guile JSON.
Still can't reproduce this:
$ [env] ./env guile
./env: line 2: cd: /opt: No such file or directory
./env: line 3: cd: /opt: No such file or directory
GNU Guile 2.2.6
Copyright (C) 1995-2019 Free Software Foundation, Inc.
Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.
Enter `,help' for help.
scheme@(guile-user)> ,use (annotation main)
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /gnu/store/imax7z47mbbf6vsfcbwmq5qrw5x7y4s3-profile/share/guile/site/2.2/opencog/exec.scm
;;; compiled /home/rekado/.cache/guile/ccache/2.2-LE-8-3.A/gnu/store/j32mq6w5q99mlmpldx4jh9l92hbkjwdb-atomspace-5.0.3-1.86c848d/share/guile/site/2.2/opencog/exec.scm.go
;;; note: source file /home/rekado/dev/annotation-scheme/annotation/gene-go.scm
;;; newer than compiled /home/rekado/dev/annotation-scheme/annotation/gene-go.go
;;; compiling /home/rekado/dev/annotation-scheme/annotation/gene-go.scm
;;; note: source file /home/rekado/dev/annotation-scheme/annotation/parser.scm
;;; newer than compiled /home/rekado/dev/annotation-scheme/annotation/parser.go
;;; compiling /home/rekado/dev/annotation-scheme/annotation/parser.scm
;;; compiled /home/rekado/.cache/guile/ccache/2.2-LE-8-3.A/home/rekado/dev/annotation-scheme/annotation/parser.scm.go
;;; compiled /home/rekado/.cache/guile/ccache/2.2-LE-8-3.A/home/rekado/dev/annotation-scheme/annotation/gene-go.scm.go
scheme@(guile-user)> ,use (opencog) (opencog bioscience)
scheme@(guile-user)> (primitive-load "tests/sample_dataset.scm")
$1 = (EvaluationLink
(PredicateNode "has_pubmedID")
(ListLink
(EvaluationLink
(PredicateNode "interacts_with")
(ListLink
(GeneNode "ALDOA")
(GeneNode "ACTN4")
)
)
(ListLink
(ConceptNode "https://www.ncbi.nlm.nih.gov/pubmed/?term=26344197")
)
)
)
scheme@(guile-user)> (annotate-genes (list "IGF1") "test-res" "[{\"function_name\": \"gene-pathway-annotation\", \"filters\": [{\"filter\": \"pathway\", \"value\": \"smpdb reactome\"},{\"filter\": \"include_prot\", \"value\": \"True\"}, {\"filter\": \"include_sm\", \"value\": \"False\"},{\"filter\": \"coding\", \"value\": \"True\"},{\"filter\": \"noncoding\", \"value\": \"True\"}, {\"filter\": \"biogrid\", \"value\": \"0\"}]}, {\"function_name\": \"gene-go-annotation\", \"filters\": [{\"filter\": \"namespace\", \"value\": \"biological_process cellular_component molecular_function\"}, {\"filter\": \"parents\", \"value\": \"0\"}, {\"filter\": \"protein\", \"value\": \"True\"}]}, {\"function_name\": \"include-rna\", \"filters\": [{\"filter\": \"coding\", \"value\": \"True\"},{\"filter\": \"noncoding\", \"value\": \"True\"},{\"filter\": \"protein\", \"value\": \"1\"}]},{\"function_name\": \"biogrid-interaction-annotation\", \"filters\": [{\"filter\": \"interaction\", \"value\": \"Proteins\"}]}]")
annotation/main.scm:95:22: In procedure variable-ref: Not a variable: #f
Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]> ,q
scheme@(guile-user)>
Please add executable tests for this. We won't get far like this.
@rekado please import this modules first,
(use-modules
(opencog) (opencog exec)
(annotation) (annotation main)
(annotation gene-go) (annotation biogrid) (annotation gene-pathway)
(annotation functions) (annotation util) (annotation rna)
(annotation parser)
)
I want fix the module-loading issue but currently I don't how to search for a symbol in a specific module so that's why I am resorting to current-module
Btw when running annotate-genes
it prints the atomese result into stdout before returning the JSON. And I can't seem to find where it is coming from
I can prepare a patch for looking up variables only in the specified module. If you want to take a stab at it yourself, please take a look at 6.20.8 Module System Reflection in the manual. resolve-interface
is a good starting point.
okay, thanks for the direction. I will work on PR and will ask you to review it.
I still can't reproduce this, but I do get a different kind of error. When I run this, the unknown
case of the expr->graph
procedure is triggered --- and then it spits out the unmatched expression.
The unmatched expression is a plain list of ListLink
elements. The parser does not support that. It expects the data structure to start with one of the Atomspace types such as Predicate
, Member
, Evaluation
, List
, ListLink
, etc.
So this would be okay:
'(List (ListLink ...) (ListLink ...))
But we get this:
'((ListLink ...) (ListLink ...) ...)
Why do we get a list of atomspace expressions and not just one atomspace expression?
Btw when running
annotate-genes
it prints the atomese result into stdout before returning the JSON. And I can't seem to find where it is coming from
This is the pk
call in the unknown
case, which has a comment above it that says: "This shouldn't happen".
It spits out the unmatched expression as a courtesy to the debugging person :) (I'll make it pretty print this in a PR.)
The reason it returns '((ListLink ...) (ListLink ...) ...)
is because the output comes from running map
on the request.
So the parser should handle this case.
I tested it with the following input and it still fails:
(ListLink
(ConceptNode "main")
(ListLink
(EvaluationLink
(PredicateNode "has_name")
(ListLink
(GeneNode "IGF1")
(ConceptNode "Insulin-like growth factor I")
)
)
(ListLink
(EvaluationLink
(PredicateNode "has_location")
(ListLink
(GeneNode "IGF1")
(ConceptNode "exocytic vesicle")
)
)
(EvaluationLink (stv 1 1)
(PredicateNode "has_location")
(ListLink
(GeneNode "IGF1")
(ConceptNode "platelet alpha granule lumen")
)
)
)
)
)
How can I reproduce this? To what procedure is this the input?
The reason it returns
'((ListLink ...) (ListLink ...) ...)
is because the output comes from runningmap
on the request.So the parser should handle this case.
Then you should also map
the parser. Is it known when the result is a single atomspace expression and when it is a list of atomspace expressions?
How can I reproduce this? To what procedure is this the input?
scheme@(guile-user)> (define input (ListLink
(ConceptNode "main")
(ListLink
(EvaluationLink
(PredicateNode "has_name")
(ListLink
(GeneNode "IGF1")
(ConceptNode "Insulin-like growth factor I")
)
)
(ListLink
(EvaluationLink
(PredicateNode "has_location")
(ListLink
(GeneNode "IGF1")
(ConceptNode "exocytic vesicle")
)
)
(EvaluationLink (stv 1 1)
(PredicateNode "has_location")
(ListLink
(GeneNode "IGF1")
(ConceptNode "platelet alpha granule lumen")
)
)
)
)
scheme@(guile-user)> (atomese-parser input)
atomese-parser
currently takes a string as its input, because it hasn't been changed to read directly from a port yet.
So it needs to be (atomese-parser (format #f "~a" input))
.
Oops, I should have mentioned that I have removed the string conversion and now the atomese-parser
accepts s-expr. This change is local and I haven't pushed it yet.
I think the pattern matching doesn't ignore whitespaces and newlines.
Can you show your diff?
Please check #171
After the latest improvements to the parser by @rekado, the
annotate-genes
function is returning empty list of nodes and edges. To reproduce this issue, load thesample_dataset.scm
and run the annotation forIGF1
as follows:The above returns the following json: