Closed Habush closed 4 years ago
When I try to run the above, I immediately get the error:
ice-9/boot-9.scm:1655:16: In procedure raise-exception:
In procedure variable-ref: Wrong type argument in position 1 (expecting box): #f
Reading the code, in main.scm line 95, I see
(func (variable-ref (module-variable (current-module) func-name)))
suggesting that `module-variable` returned `#f`, presumably because `func-name` is the empty string ... but it's not. Printing it out, I get `gene-pathway-annotation`
so I manually check:
(module-variable (current-module) 'gene-pathway-annotation)
$4 = #f
so then I manually say `(use-modules (annotation gene-pathway))` and then it pukes on `gene-go-annotation` ... and down the line.
It appears that the code is failing to include the modules that it needs to resolve all of these symbols.
My work-around is to manually load all 10 annotation modules.
This highlights what appears to be an unrelated bug: there's no need to have 10 different modules -- you could just have one, called `annotation` -- this would make it easier for users to use, and for developers to debug.
Re: `SetLink` -- are you doing anything to delete SetLinks (besides deleting the SetLinks in `run-query`)? Are you using child atomspaces? Yes, the caching will fail if the cache entry is deleted for some reason...
OK, after the above work-arounds (i.e. of manually loading all the needed annotation modules), it completes without error for me (in less than a second).
Please note that I am using the December 2019 version of the datasets; perhaps this bug depends on the datasets? Do I need to get newer datasets? (from where?)
The issue was with the version of datasets I was using. Now it is working.
Btw, regarding "OK, after the above work-arounds (i.e. of manually loading all the needed annotation modules)": I think this should be improved and I will work on a PR. Do you have suggestions for better ways of loading specific modules instead of using `current-module`? Maybe @rekado can answer this too.
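One possible direction for the module-loading question, as a minimal sketch rather than the project's actual fix: look the function up in a named module's public interface instead of in `(current-module)`, and fail with a readable message when the symbol is missing. `lookup-function` is a hypothetical helper, and `(srfi srfi-1)` with `fold` stands in for an annotation module and its exported function so that the snippet runs without the annotation code.

```scheme
;; Hypothetical helper: resolve FUNC-NAME (a symbol) in the public
;; interface of MODULE-NAME (a list such as '(annotation gene-pathway)).
(define (lookup-function module-name func-name)
  (let* ((iface (resolve-interface module-name)) ; loads the module if needed
         (var (module-variable iface func-name)))
    (if var
        (variable-ref var)
        ;; module-variable returned #f: report it instead of letting
        ;; variable-ref fail with "expecting box".
        (error "No such exported function:" func-name module-name))))

;; Stand-in example: fetch `fold` from (srfi srfi-1) and call it.
(define my-fold (lookup-function '(srfi srfi-1) 'fold))
(display (my-fold + 0 '(1 2 3)))
(newline)
```

With something like this, the caller would not depend on which modules happen to be loaded into the current module at the time of the call.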
Uh, wait.. I am trying to annotate the 800+ genes and after a few minutes it crashed with the same error. Here is the annotation function I'm calling: annotation.txt
EDIT: 681 genes not 800+
modules
I'm just suggesting that, instead of saying `(define-module ...)` in each distinct file, you just create one file, annotation.scm, and then just say
(define-module (annotation))
(include "util.scm")
(include "function.scm")
(include "biogrid.scm")
(include "main.scm")
... etc...
OK, yes it crashed after 45 minutes. Stack trace:
ice-9/boot-9.scm:1655:16: In procedure raise-exception:
In procedure cog-outgoing-set: Wrong type argument in position 1 (expecting opencog atom): #<Invalid handle>
scheme@(guile-user) [1]> ,bt
In annotation/main.scm:
119:19 2 (annotate-genes _ _ _)
In ice-9/threads.scm:
289:22 1 (loop _)
In ice-9/boot-9.scm:
1655:16 0 (raise-exception _ #:continuable? _)
line 119 of main.scm is [result (par-map (lambda (x) (x)) fns)]
My experience with guile parallel routines is that they are buggy. I don't know why. In this particular case, I was seeing guile run at 120% CPU, so it got almost no speedup at all (which is very typical). It is often the case that guile parallel routines run slower than single-threaded. There's some lock contention somewhere.
Here's the full stack trace from opencog.log
[2020-03-03 22:45:58:337] [ERROR] Backtrace:
In ice-9/threads.scm:
288:21 19 (loop _)
In annotation/biogrid.scm:
46:10 18 (biogrid-interaction-annotation _ "agingSymbols" #:interaction _ # _ …)
In srfi/srfi-1.scm:
673:15 17 (append-map _ _ . _)
586:29 16 (map1 _)
586:29 15 (map1 _)
586:17 14 (map1 ("RBM5" "SLC7A2" "NDUFAB1" "DVL2" "SKAP2" "DHX33" "MSL3" "…" …))
In annotation/biogrid.scm:
50:35 13 (_ _)
In annotation/util.scm:
188:8 12 (run-query _)
In unknown file:
11 (opencog-extension cog-execute! (#))
In ice-9/boot-9.scm:
1722:10 10 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In unknown file:
9 (apply-smob/0 #<thunk 7fcd3da69e40>)
In ice-9/boot-9.scm:
1722:10 8 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In unknown file:
7 (apply-smob/0 #<thunk 7fcd3da69de0>)
In annotation/functions.scm:
804:8 6 (generate-result (GeneNode "XRCC3")
(GeneNode "SNAI1")
_ # #)
In opencog/base/atom-cache.scm:
54:34 5 (_ (SetLink
(GeneNode "SNAI1")
(GeneNode "XRCC3")
)
)
In annotation/functions.scm:
772:11 4 (do-find-pubmed-id _)
In annotation/util.scm:
190:8 3 (run-query _)
In unknown file:
2 (cog-outgoing-set #<Invalid handle>)
In ice-9/boot-9.scm:
1655:16 1 (raise-exception _ #:continuable? _)
In unknown file:
0 (apply-smob/1 #<exception-handler 7fcd3da69dc0> #<&compound-excepti…>)
ERROR: In procedure apply-smob/1:
In procedure cog-outgoing-set: Wrong type argument in position 1 (expecting opencog atom): #<Invalid handle>
ABORT: wrong-type-arg
(/home/ubuntu/src/atomspace/opencog/guile/SchemeEval.cc:1067)
Stack Trace:
2: basic_string.h:211 ??()
3: Logger.cc:589 opencog::Logger::Error::operator()(char const*, ...)
4: exceptions.cc:55 opencog::StandardException::parse_error_message(char const*, __va_list_tag*, bool)
5: exceptions.cc:82 opencog::StandardException::parse_error_message(char const*, char const*, __va_list_tag*, bool)
6: exceptions.cc:142 opencog::RuntimeException::RuntimeException(char const*, char const*, ...)
7: SchemeEval.cc:1067 opencog::SchemeEval::apply_v(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, opencog::Handle)
8: shared_ptr_base.h:712 ??()
9: shared_ptr_base.h:683 ??()
10: shared_ptr_base.h:975 ??()
11: Instantiator.cc:622 opencog::Instantiator::instantiate(opencog::Handle const&, std::map<opencog::Handle, opencog::Handle, std::less<opencog::Handle>, std::allocator<std::pair<opencog::Handle const, opencog::Handle> > > const&, bool)
12: Implicator.cc:59 opencog::Implicator::grounding(std::map<opencog::Handle, opencog::Handle, std::less<opencog::Handle>, std::allocator<std::pair<opencog::Handle const, opencog::Handle> > > const&, std::map<opencog::Handle, opencog::Handle, std::less<opencog::Handle>, std::allocator<std::pair<opencog::Handle const, opencog::Handle> > > const&)
13: PatternMatchEngine.cc:2403 opencog::PatternMatchEngine::report_grounding(std::map<opencog::Handle, opencog::Handle, std::less<opencog::Handle>, std::allocator<std::pair<opencog::Handle const, opencog::Handle> > > const&, std::map<opencog::Handle, opencog::Handle, std::less<opencog::Handle>, std::allocator<std::pair<opencog::Handle const, opencog::Handle> > > const&)
14: PatternMatchEngine.cc:1904 opencog::PatternMatchEngine::do_next_clause()
15: PatternMatchEngine.cc:1888 opencog::PatternMatchEngine::clause_accept(opencog::Handle const&, opencog::Handle const&)
16: PatternMatchEngine.cc:1833 opencog::PatternMatchEngine::do_term_up(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
17: PatternMatchEngine.cc:1668 opencog::PatternMatchEngine::explore_single_branch(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
18: PatternMatchEngine.cc:1330 opencog::PatternMatchEngine::explore_upvar_branches(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
19: PatternMatchEngine.cc:1260 opencog::PatternMatchEngine::explore_up_branches(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
20: PatternMatchEngine.cc:1805 opencog::PatternMatchEngine::do_term_up(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
21: PatternMatchEngine.cc:1668 opencog::PatternMatchEngine::explore_single_branch(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
22: PatternMatchEngine.cc:1330 opencog::PatternMatchEngine::explore_upvar_branches(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
23: PatternMatchEngine.cc:1260 opencog::PatternMatchEngine::explore_up_branches(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
24: PatternMatchEngine.cc:1805 opencog::PatternMatchEngine::do_term_up(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
25: PatternMatchEngine.cc:1668 opencog::PatternMatchEngine::explore_single_branch(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
26: PatternMatchEngine.cc:1223 opencog::PatternMatchEngine::explore_term_branches(opencog::Handle const&, opencog::Handle const&, opencog::Handle const&)
27: PatternMatchEngine.cc:2509 opencog::PatternMatchEngine::explore_clause_direct(opencog::Handle const&, opencog::Handle const&, opencog::Handle const&)
28: PatternMatchEngine.cc:2622 opencog::PatternMatchEngine::explore_clause(opencog::Handle const&, opencog::Handle const&, opencog::Handle const&)
29: PatternMatchEngine.cc:1946 opencog::PatternMatchEngine::do_next_clause()
30: PatternMatchEngine.cc:1888 opencog::PatternMatchEngine::clause_accept(opencog::Handle const&, opencog::Handle const&)
31: PatternMatchEngine.cc:1833 opencog::PatternMatchEngine::do_term_up(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
32: PatternMatchEngine.cc:1668 opencog::PatternMatchEngine::explore_single_branch(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
33: PatternMatchEngine.cc:1330 opencog::PatternMatchEngine::explore_upvar_branches(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
34: PatternMatchEngine.cc:1260 opencog::PatternMatchEngine::explore_up_branches(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
35: PatternMatchEngine.cc:1805 opencog::PatternMatchEngine::do_term_up(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
36: PatternMatchEngine.cc:1668 opencog::PatternMatchEngine::explore_single_branch(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
37: PatternMatchEngine.cc:1357 opencog::PatternMatchEngine::explore_upvar_branches(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
38: PatternMatchEngine.cc:1260 opencog::PatternMatchEngine::explore_up_branches(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
39: PatternMatchEngine.cc:1805 opencog::PatternMatchEngine::do_term_up(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
40: PatternMatchEngine.cc:1668 opencog::PatternMatchEngine::explore_single_branch(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
41: PatternMatchEngine.cc:1223 opencog::PatternMatchEngine::explore_term_branches(opencog::Handle const&, opencog::Handle const&, opencog::Handle const&)
42: PatternMatchEngine.cc:2509 opencog::PatternMatchEngine::explore_clause_direct(opencog::Handle const&, opencog::Handle const&, opencog::Handle const&)
43: PatternMatchEngine.cc:2622 opencog::PatternMatchEngine::explore_clause(opencog::Handle const&, opencog::Handle const&, opencog::Handle const&)
44: PatternMatchEngine.cc:1946 opencog::PatternMatchEngine::do_next_clause()
45: PatternMatchEngine.cc:1888 opencog::PatternMatchEngine::clause_accept(opencog::Handle const&, opencog::Handle const&)
46: PatternMatchEngine.cc:1833 opencog::PatternMatchEngine::do_term_up(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
47: PatternMatchEngine.cc:1668 opencog::PatternMatchEngine::explore_single_branch(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
48: PatternMatchEngine.cc:1330 opencog::PatternMatchEngine::explore_upvar_branches(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
49: PatternMatchEngine.cc:1260 opencog::PatternMatchEngine::explore_up_branches(std::shared_ptr<opencog::PatternTerm> const&, opencog::Handle const&, opencog::Handle const&)
Without the par-map, it ran for a while, then crashed with
GC Warning: Repeated allocation of very large block (appr. size 4227862528):
May lead to memory leak and poor performance
Aborted
That's a 4-gigabyte block .... I'm assuming that this is json-related, because, without json, total RAM usage never got above 5GB, and guile usage never got above 200MBytes .. but I dunno, will try again, shortly.
I removed the `par-map` and the json code and it crashed with the following error after 37 minutes:
ERROR: In procedure concatenate:
In procedure append: Wrong type argument in position 1 (expecting empty list)
I think this is b/c I replaced `par-map` with `append-map`. But it looks like it has finished the annotation: the scheme files are written, and from their size (e.g. the pathway output file is 1.8GB) I guess it did the whole annotation.
Replace it with `map` instead. `append-map` expects the procedure to return a list for each element, so that all the result lists can be appended in the end.
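To make the difference concrete, here is a small sketch using made-up thunks (`list-result` and `string-result` are hypothetical stand-ins for the annotation functions in `fns`, not code from the project):

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical annotation thunks: one returns a list, one returns a string.
(define (list-result) (list "a" "b"))
(define (string-result) "written to file")

;; `map` keeps one result per thunk, whatever its type.
(define mapped
  (map (lambda (f) (f)) (list list-result string-result)))

;; `append-map` splices the per-element lists together, so every call
;; is expected to return a list.
(define spliced
  (append-map (lambda (f) (f)) (list list-result list-result)))

;; A non-list result in a non-final position makes the underlying
;; `append` raise a wrong-type error, as in the crash above.
(define failed?
  (catch #t
    (lambda ()
      (append-map (lambda (f) (f)) (list string-result list-result))
      #f)
    (lambda args #t)))
```

So `map` is the closer drop-in replacement for `par-map` when the thunks do not all return lists.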
I conclude that the caching/memoization functions are not thread-safe; I will look at them and try to fix shortly.
Re: `map` and file-writing: I would strongly recommend that this be done in some incremental fashion, i.e. rather than calling `append-map` in 101 different places in the code, and then passing around large return values, instead, create a function called `report-result` and call it whenever a result is obtained. From here, you could then either write to a file, or run it through json, or run it into some other database or API, or whatever, so that `report-result` is a callback that is called whenever some annotation is completed.
FWIW, last time I looked, about 20% of total CPU time was spent writing the file.
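A rough sketch of what such a callback could look like, under the assumption that each annotation step yields one printable result at a time. All names here (`make-port-reporter`, `annotate-each`, `fake-annotation`) are hypothetical and only illustrate the shape of the idea:

```scheme
;; Hypothetical reporter factory: returns a callback that writes each
;; result to PORT as soon as it is produced.
(define (make-port-reporter port)
  (lambda (result)
    (write result port)
    (newline port)))

;; Hypothetical driver: ANNOTATE-ONE computes the result for one gene and
;; REPORT-RESULT consumes it immediately; nothing is accumulated.
(define (annotate-each genes annotate-one report-result)
  (for-each (lambda (gene) (report-result (annotate-one gene)))
            genes))

;; Stand-in annotation function; the real one would query the atomspace.
(define (fake-annotation gene) (list 'annotation gene))

(annotate-each '("IGF1" "FAS") fake-annotation
               (make-port-reporter (current-output-port)))
```

Swapping the reporter (file port, JSON encoder, database writer) would then not require touching the annotation functions themselves.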
OK, if I stub out the json-printing at the end, everything completes. The file generation looks like so:
* `(gc-stats)` gives `(heap-size . 7639584768) (heap-free-size . 7435337728)`, so diff=200MBytes
* Then `(format #f "~a" results)` takes 270 seconds, blows up RAM: `(heap-size . 28142227456) (heap-free-size . 19253534720)` so guile is now using 10GB more, and I got error: `GC Warning: Repeated allocation of very large block (appr. size 1056968704):` also RSS blow up to 31.1GBytes from 9.4GBytes.
* `atomese-parser` on the above string results in the abort. I forgot to measure the length of the string ...

OK, after making numerous changes to the code, trying to improve performance (I've got it running maybe 4x faster now?), I did hit this, rather unexpectedly:
# …))
586:29 8 (map1 ((ListLink
(GeneNode "CDR2L")
(GeneNode "ELAVL1")
)
…))
586:17 7 (map1 ((ListLink
(GeneNode "JUP")
(GeneNode "DBN1")
)
# # …))
In annotation/functions.scm:
632:2 6 (_ _)
893:8 5 (xgenerate-result (GeneNode "JUP")
(GeneNode "DBN1")
#t ("") 0 #f …)
In opencog/base/atom-cache.scm:
55:32 4 (_ (SetLink
(GeneNode "JUP")
(GeneNode "DBN1")
)
)
In unknown file:
3 (hashx-set! #<procedure atom-hash (ATOM SZ)> #<procedure atom-ass…> …)
In opencog/base/atom-cache.scm:
47:44 2 (atom-hash #<Invalid handle> 224717)
In unknown file:
1 (cog-handle #<Invalid handle>)
In ice-9/boot-9.scm:
1655:16 0 (raise-exception _ #:continuable? _)
scheme@(guile-user) [1]>
What's weird about the above is that more than 1200 of these `map1` things were printed, scrolling off the top of the screen. Typing in `,up` indicates that each one is a stack frame. I'm flabbergasted that the guile stack is over 1200 frames deep at this point. I'm quite confused about this... this implies out-of-control recursion. Where's that recursion? Why isn't it tail-recursive?
Maybe this is how srfi-1 `map` actually works? There were 2861 frames, all but the last few were `map1` and looked exactly alike (except for different gene names in them)...
... Above crash happens in exactly the same place each time, exactly the same stack trace. Five out of five tries.
@rekado so -- any clue about guile's implementation of srfi-1 `map`? The above stack trace makes it look like it's not tail recursive. Maybe I'm misreading, or maybe it doesn't matter, or maybe there's actually just some bizarre stale bug in guile's version of srfi-1 ???
* Then `(format #f "~a" results)` takes 270 seconds, blows up RAM: `(heap-size . 28142227456) (heap-free-size . 19253534720)` so guile is now using 10GB more, and I got error: `GC Warning: Repeated allocation of very large block (appr. size 1056968704):` also RSS blow up to 31.1GBytes from 9.4GBytes.
Yes, the format call is not great. The new atomese-parser would work fine on a port, but the code converts things to strings and then reads from the string again, which is unnecessary. The code should use the port interface directly.
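As an illustration of reading from a port directly, here is a minimal sketch. The thread does not show the interface of `atomese-parser`, so Guile's built-in `read` is used as a stand-in, and `for-each-datum` is a hypothetical helper:

```scheme
;; Read every datum from PORT and hand it to CONSUME, one at a time,
;; without first collecting the whole input into a string.
(define (for-each-datum consume port)
  (let loop ((datum (read port)))
    (unless (eof-object? datum)
      (consume datum)
      (loop (read port)))))

;; Demo on an in-memory port; a file port from `open-input-file`
;; would be used the same way.
(define seen 0)
(call-with-input-string "(GeneNode \"IGF1\") (GeneNode \"FAS\")"
  (lambda (port)
    (for-each-datum (lambda (datum) (set! seen (+ seen 1))) port)))
```

The memory needed at any moment is then bounded by the largest single datum rather than by the whole result set.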
Guile's version of srfi-1 is a bit older, but even the latest version of the reference implementation of srfi-1 piles up `cons`es because the second argument to the `cons` is the recursive call. So the latest reference implementation is not tail recursive. Neither is the implementation in Guile (no matter if that's the one from boot-9 or srfi-1).
I think that's because `map` is actually `map-in-order`. I don't see how one could write a tail-recursive `map` that processes its arguments from left to right. I can only think of one that reverses the result list.
I'm not sure of what the goal is here because I haven't been following the discussion closely. Is a `map` really needed here? Or can we fold instead?
FWIW, the implementation of `map` in `(rnrs base (6))` does appear to be tail recursive (the definition is a bit hard to read). It uses `reverse` at the end to restore order. So maybe that's another thing to try -- if processing order doesn't matter.
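The two shapes being compared can be sketched as follows. These are simplified single-list illustrations written for this comment, not the actual srfi-1 or rnrs sources:

```scheme
;; Non-tail-recursive shape: the recursive call is an argument to
;; `cons`, so one stack frame stays pending per list element.
(define (map-recursive f lst)
  (if (null? lst)
      '()
      (let ((head (f (car lst))))   ; apply F before recursing
        (cons head (map-recursive f (cdr lst))))))

;; Tail-recursive shape: accumulate results in reverse, then restore
;; the order once at the end.
(define (map-iterative f lst)
  (let loop ((rest lst) (acc '()))
    (if (null? rest)
        (reverse acc)
        (loop (cdr rest) (cons (f (car rest)) acc)))))
```

The first form is what produces a long run of near-identical frames in a backtrace; the second trades those frames for one extra pass over the result list.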
The root cause of the assert is this: the cache contains the atom (Set (GeneNode "HEXIM1") (GeneNode "HNRNPL"))
but this atom just happens to be the same as the return of an earlier query, and has been deleted. Thus, the hash table is hanging on to a bad reference, and throws the exception when accessed. I am pondering a good fix right now.
@rekado for this code, the order of the `map` doesn't matter. But also (for this code) the amount of CPU-time in `map` is minuscule, so changing/fixing this is not important. I was just ... surprised ... given how SICP and other texts explain tail recursion in chapter-one, that something as basic as srfi-1 map would not be tail-recursive. Interesting; thanks for the notes.
@Habush this is fixed in opencog/atomspace#2525
@linas does this also fix the thread-safety issue of the caches?
does this also fix the thread-safety issue of the caches?
No ... but I have no clear evidence that the caches are not thread-safe. I assumed that was the problem, at first, but don't really know if that is a problem. The error message `#<Invalid handle>` is completely explained by it being a deleted Atom, whereas, if the guile hash functions weren't thread-safe, you'd see corruption or crashes of some kind. So, for now, I don't think thread-safety is an issue...
p.s. regarding threading: it provides very little speedup; the huge impact on perf is from caching assorted search results, viz. the other open pull reqs.
oh. Hmm. Just ran it with `par-map`, and hit `#<Invalid handle>` w/
In annotation/main.scm:
72:42 4 (_ "NME1-NME2")
In opencog/base/atom-cache.scm:
63:34 3 (_ (GeneNode "NME1-NME2")
)
In annotation/util.scm:
190:8 2 (do-locate-node _)
In unknown file:
1 (cog-outgoing-set #<Invalid handle>)
In ice-9/boot-9.scm:
1655:16 0 (raise-exception _ #:continuable? _)
and again, in a different place:
In annotation/main.scm:
71:32 7 (_ "RABL5")
In annotation/util.scm:
82:14 6 (node-info _)
In opencog/base/atom-cache.scm:
63:34 5 (_ (GeneNode "RABL5")
)
In annotation/util.scm:
63:27 4 (do-get-node-info (GeneNode "RABL5")
)
In opencog/base/atom-cache.scm:
63:34 3 (_ (GeneNode "RABL5")
)
In annotation/util.scm:
175:8 2 (do-find-pathway-name _)
In unknown file:
1 (cog-outgoing-set #<Invalid handle>)
In ice-9/boot-9.scm:
1655:16 0 (raise-exception _ #:continuable? _)
so maybe there is a threading race, after all. Will examine tomorrow.
Using `par-map` in main.scm, I saw at least four exceptions of the form above, each for different GeneNodes, thus suggesting a race condition. I do not understand why the `catch` handler did not catch the exceptions. I added a mutex to `afunc` for the form below:
(define mtx (make-mutex))
(lambda (ITEM)
  (lock-mutex mtx)
  (let* ((val (hashx-ref atom-hash atom-assoc cache ITEM))
         (rv (if val val
                 (let ((fv (AFUNC ITEM)))
                   (hashx-set! atom-hash atom-assoc cache ITEM fv)
                   fv))))
    (unlock-mutex mtx)
    rv))
but still got an exception. However, this time the exception was not in `afunc` but was instead this:
ice-9/boot-9.scm:1655:16: In procedure raise-exception:
In procedure cog-outgoing-set: Wrong type argument in position 1 (expecting opencog atom): #<Invalid handle>
In ice-9/boot-9.scm:
2792:4 5 (save-module-excursion _)
4336:12 4 (_)
In annotate-all.scm:
695:8 3 (run-all)
In annotation/main.scm:
119:19 2 (annotate-genes _ _ _)
In ice-9/threads.scm:
289:22 1 (loop _)
In ice-9/boot-9.scm:
1655:16 0 (raise-exception _ #:continuable? _)
Line 118 of main.scm is
[result (par-map (lambda (x) (x)) fns)] )
I don't understand where the `cog-outgoing-set` is ... it must be in one of the `fns` ...
Ohh.. I see .. it's in `run-query` -- one thread is deleting the `SetLink` that another thread is looking at. So `run-query` itself is not thread-safe.
OK, I got `par-map` to work with the following two patches. It does reduce the total elapsed time to obtain an answer, but it splurges on CPU time to do it -- it provides a 1.5x speedup in exchange for 2.6x more CPU time. Detailed performance measurements in https://github.com/MOZI-AI/annotation-scheme/issues/141#issuecomment-596038145
The patches: one for the atomspace:
--- a/opencog/scm/opencog/base/atom-cache.scm
+++ b/opencog/scm/opencog/base/atom-cache.scm
@@ -56,13 +56,17 @@
"
; Define the local hash table we will use.
(define cache (make-hash-table))
+ (define mtx (make-mutex))
(lambda (ITEM)
- (define val (hashx-ref atom-hash atom-assoc cache ITEM))
- (if val val
- (let ((fv (AFUNC ITEM)))
- (hashx-set! atom-hash atom-assoc cache ITEM fv)
- fv)))
+ (define (do-memoize)
+ (define val (hashx-ref atom-hash atom-assoc cache ITEM))
+ (if val val
+ (let ((fv (AFUNC ITEM)))
+ (hashx-set! atom-hash atom-assoc cache ITEM fv)
+ fv)))
+
+ (with-mutex mtx (do-memoize)))
)
; ---------------------------------------------------------------------
and one locally:
--- a/annotation/util.scm
+++ b/annotation/util.scm
@@ -32,6 +32,7 @@
#:use-module (ice-9 regex)
#:use-module (srfi srfi-1)
#:use-module (ice-9 match)
+ #:use-module (ice-9 threads)
#:export (create-node
create-edge)
)
@@ -164,6 +165,7 @@
((single) single)
((first second . rest) second))))
+(define run-query-mtx (make-mutex))
(define-public (run-query QUERY)
"
Call (cog-execute! QUERY), return results, delete the SetLink.
@@ -171,12 +173,21 @@
"
; Run the query
(define set-link (cog-execute! QUERY))
- ; Get the query results
- (define results (cog-outgoing-set set-link))
- ; Delete the SetLink
- (cog-delete set-link)
- ; Return the results.
- results
+
+ (lock-mutex run-query-mtx)
+ (if (cog-atom? set-link)
+ ; Get the query results
+ (let ((results (cog-outgoing-set set-link)))
+ ; Delete the SetLink
+ (cog-delete set-link)
+ (unlock-mutex run-query-mtx)
+ ; Return the results.
+ results)
+ ; Try again
+ (begin
+ (unlock-mutex run-query-mtx)
+ (run-query QUERY))
+ )
)
(define (do-find-name GO-ATOM)
BTW, there is a bug w/ the makefile: the annotation makefile doesn't realize that the atomspace scheme files have changed, and so doesn't recompile, and ends up using the old version of the atomspace. You have to explicitly `rm -r build` and also `rm -r ~/.cache/guile` to get the new code to be used.
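For readers without the atomspace at hand, the locking pattern from the first patch can be tried in plain Guile. This is a simplified sketch: it uses an ordinary `equal?`-keyed hash table instead of `hashx-ref` with `atom-hash`/`atom-assoc`, and `make-locked-memoizer` is a hypothetical name. Unlike the earlier `lock-mutex`/`unlock-mutex` attempt, `with-mutex` releases the lock even if the wrapped function throws.

```scheme
(use-modules (ice-9 threads))

;; Hypothetical simplified memoizer: one cache and one mutex per
;; wrapped function.
(define (make-locked-memoizer func)
  (define cache (make-hash-table))
  (define mtx (make-mutex))
  (lambda (item)
    (with-mutex mtx
      ;; hash-get-handle distinguishes "not cached" from a cached #f.
      (let ((hit (hash-get-handle cache item)))
        (if hit
            (cdr hit)
            (let ((val (func item)))
              (hash-set! cache item val)
              val))))))

;; Demo: count how often the underlying function actually runs.
(define calls 0)
(define cached-square
  (make-locked-memoizer
    (lambda (n) (set! calls (+ calls 1)) (* n n))))

(define results (par-map cached-square '(1 2 3 1 2 3)))
```

One design consequence worth noting: the lock is held while `func` runs, so calls through the same memoizer are fully serialized, which is consistent with the modest speedup reported above.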
@linas After merging PR #156, the annotation fails with the following error:
In procedure append: Wrong type argument in position 2 (expecting empty list): #<unspecified>
The stacktrace points to this line in the code as the source of the error.
I am running the following annotation function:
(annotate-genes (list "TSPAN6" "NDUFAF7" "RBM5" "SLC7A2" "NDUFAB1" "DVL2" "SKAP2" "DHX33" "MSL3" "BZRAP1" "GTF2IRD1" "IL32" "RPS20" "SCMH1" "CLCN6" "RNF14" "ATP2C1" "IGF1" "GLRX2" "FAS" "ATP6V0A1" "FBXO42" "JADE2" "PREX2" "NOP16" "LMO3" "R3HDM1" "ERCC8" "HOMER3" "USE1" "OPN3" "SZRD1" "ATG5" "CAMK2B" "MPC1" "MRPS24" "ZNF275" "TAF2" "TAF11" "IPO5" "NDUFB4" "DIP2B" "MPPED2" "IARS2" "ERLEC1" "UFD1L" "PDCD2" "ACADVL" "ENO1" "FRYL" "SEC31B" "KIFAP3" "NT5C2" "GPC4" "ITGA8" "PPP2R5C" "RBFOX1" "ITM2A" "NRD1" "VDAC3" "CBFA2T2" "FKBP7" "SAR1A" "DUSP13" "PGR" "EPB41L3" "OXCT1" "SLC27A5" "WBP11" "NCOA1" "MAPRE3" "MGST2" "DIMT1" "RBM22" "TMED2" "HUWE1" "NLK" "UIMC1" "GNAS" "COQ9" "NSFL1C" "CFAP61" "TASP1" "MRPS33" "NDUFB2" "TXNL1" "MYL6" "HDAC6" "DHPS" "CREM" "PSMD8" "CIRBP" "HNRNPM" "SF3A1" "POLR2F" "HMGXB4" "CHKB" "ZMAT5" "RBM23" "VTI1B" "TIMM9" "GSTZ1" "RPS6KA5" "PSMB5" "RAB5IF" "PFDN4" "PSMA7" "NDUFAF5" "ATP1B4" "ALG13" "SUV39H1" "SCML2" "PGK1" "KLF5" "TSC22D1" "MGRN1" "SLC7A6" "CMC2" "CPPED1" "MYEF2" "CPQ" "LEPROTL1" "PPP2CB" "KLHDC4" "POP4" "AKT2" "RABAC1" "CARD8" "PON2" "SSBP1" "BUD31" "MEST" "CHCHD3" "COA1" "BLVRA" "PLGRKT" "BAG1" "EXOSC3" "RASSF4" "KAZALD1" "PITRM1" "EBF3" "LGI1" "MTMR4" "CDK5RAP3" "ENO3" "ICAM2" "EZH1" "MRPL27" "HDAC5" "DUSP3" "DCUN1D4" "NDUFC1" "ZBTB16" "COMMD9" "ATP5B" "ELK3" "ALDH2" "STX2" "GPR133" "MRPL51" "GAPDH" "TPI1" "TMEM14C" "GCNT2" "NCOA7" "FANCE" "E2F3" "ACOT13" "COX7A2" "ENPP5" "PCDHB2" "TMCO6" "TTC1" "POLR3G" "BNIP1" "TIMMDC1" "BCL6" "FAM162A" "PRKAR2A" "TUSC2" "SSR3" "MOB1A" "NCL" "MPV17" "HSPE1" "ZNF142" "ID2" "PNO1" "TMEM59" "LAMTOR2" "LEPR" "CTH" "TMEM9" "MRPS15" "SDHB" "FAAH" "DPH5" "C1orf54" "ANKRD13C" "VAMP8" "NDUFB3" "GDA" "MAPKAP1" "YPEL5" "C10orf76" "RCL1" "GRIA2" "PCMT1" "KIAA1217" "PAIP2" "ARL1" "SOCS2" "ECHDC2" "A1BG" "ZNF211" "GTDC1" "CCDC18" "HNRNPA2B1" "FAM126A" "CLTA" "CISD1" "CDKN2C" "RASSF8" "ATP7B" "ITIH5" "SMUG1" "NDUFAF4" "PLA2G12A" "PFKFB2" "ATP5E" "STAMBP" "SNRNP27" "NQO2" "EMC3" "TMTC4" "SNRPD2" "ID1" "KDM5C" 
"ST3GAL3" "EMC1" "UQCR11" "RNF6" "SHFM1" "STEAP4" "RBM48" "TUBGCP6" "EMC4" "RABL5" "MTX2" "TXNDC17" "DAD1" "ECSIT" "CDC16" "RPL36" "MRPL34" "LSM4" "CYP2E1" "ZNF337" "PRRC2B" "COX4I1" "NFATC1" "PDLIM4" "PSME3" "NDUFA2" "RAF1" "ENOSF1" "MRPL35" "SERPINF1" "GRSF1" "WBP2" "PRMT7" "POMP" "MYH8" "BEX2" "ACTR3B" "ARL8B" "EMC7" "LAMTOR5" "KCTD1" "DDB2" "PUM1" "NREP" "CTSL" "FAM189A2" "MDFIC" "CAPRIN1" "CD63" "TSPAN31" "MRPL44" "NDUFB5" "TANK" "VPS45" "PSMB7" "UBAP2" "GRHPR" "BPHL" "UQCC2" "NUMA1" "DCUN1D5" "UNC13C" "TUBGCP4" "SLC28A2" "CYP1B1" "COX17" "PARP16" "HERC5" "PPA2" "LGR5" "NEDD1" "SLAIN1" "GRTP1" "TMX1" "SERF2" "SRP14" "WDR61" "TPM1" "SEC11A" "PMM2" "MYLK3" "CLTC" "GAREM" "GREB1L" "RNF165" "TXNL4A" "NFIC" "PSMB6" "PSMA5" "CELSR2" "TIPRL" "UFC1" "SETDB1" "ADAMTSL4" "JTB" "HAX1" "ACP1" "NVL" "DEGS1" "PSEN2" "PDIA6" "SFXN5" "ZNF385B" "UBR3" "GULP1" "ATG3" "SRPRB" "TMEM108" "SLIT2" "DDIT4L" "PAM" "TSLP" "ATG12" "BOD1" "MYLK4" "DYNLT1" "PSPH" "ATXN7L1" "ZMYM3" "RADX" "ARHGAP36" "HMBOX1" "PXDNL" "GOLGA7" "POLR2K" "EIF3H" "NDUFB9" "TATDN1" "C9orf3" "FAM171A1" "FAM188A" "GSTO1" "EIF3M" "ARFGAP2" "DAK" "CPSF7" "CABLES2" "SAP18" "HNMT" "TIMM8B" "PTS" "NDUFC2" "BICD1" "ZNF385D" "GEMIN6" "ATP5A1" "CCDC50" "UTRN" "ZNF117" "RASSF3" "HNRNPU" "TRIP12" "LGI4" "MSI2" "UCHL1" "ATP5G3" "TCEB1" "PSMA8" "DPH3" "ACSS1" "TMEM55A" "GOLGA7B" "CNOT8" "NUP205" "C9orf85" "CARNMT1" "KCNMA1" "SCAF4" "MALSU1" "PDE6D" "MUM1L1" "AFAP1L1" "WDR19" "MRPL17" "CNNM4" "ZFAND2B" "AGPAT6" "SNF8" "CBR1" "TPPP3" "CCDC28B" "SSU72" "PCNT" "PCYT1A" "COX7A1" "BCL6B" "POLR3K" "NXF1" "OLFML2B" "MRPL55" "RFTN2" "VSNL1" "RPRD2" "BOLA3" "MRPS18C" "ARPC2" "SUCLG1" "PPM1L" "ADAMTS9" "ELP6" "SMIM12" "SFMBT1" "RAD54L2" "HPGD" "ARFIP1" "UQCRQ" "HCN1" "IQUB" "SUN1" "FAM86FP" "DCAF13" "C9orf89" "NDUFB6" "CFL2" "KLHDC2" "TRUB1" "ZFYVE1" "PSMC3" "DPCD" "ALKBH3" "CCT2" "ZNF202" "USP54" "C11orf74" "NDST2" "SEC11C" "CENPV" "HSP90B1" "AP1G1" "PPIB" "FAM96A" "SMAD3" "PDIA3" "FBXO22" "ATP5L" "GPX4" "ATP5H" "POLR2G" "COPS6" 
"RAB4A" "DNAJC7" "COA6" "MMADHC" "MPLKIP" "HEXIM2" "COL3A1" "CNNM3" "USP39" "RNF181" "LRRC28" "PPIC" "STXBP6" "MFF" "ATP5I" "PARM1" "NSMCE1" "EFNA1" "CRADD" "SIN3A" "CNBP" "ZNF32" "TMEM266" "TOR1AIP2" "BRD3" "SF3B5" "STX8" "UBB" "DNAJC18" "NUDCD2" "GPR27" "KBTBD2" "TRIAP1" "MTSS1" "ZNF160" "FEZ2" "FAM86JP" "RBKS" "EXOSC1" "PDE7B" "MRPL36" "BPTF" "ZNF540" "EXOSC10" "C20orf196" "FRMD3" "TP53RK" "SYNPO2" "MYEOV2" "MRPL52" "TMEM134" "BNC2" "KDM2A" "RHOD" "COMMD1" "GLRX" "DMRT2" "FAM86B3P" "MINOS1" "UQCRH" "PHC3" "USMG5" "HOXB2" "NRROS" "MSRB3" "MIR1-1HG-AS1" "SNAPC5" "C12orf76" "MRPL11" "IL20RB" "AKIRIN1" "TMEM167A" "NDUFA11" "FAM211A-AS1" "CADM2" "LSM1" "TMEM9B" "SCUBE2" "UCP3" "PAAF1" "MRPL48" "LINC00116" "RTTN" "KCMF1" "RMDN1" "IRX5" "C8orf4" "MAMSTR" "POLE" "FAM210A" "ATOX1" "KIAA0195" "MLF1" "GRAMD1C" "BOLA1" "TMEM11" "RIIAD1" "SEPW1" "MAGED1" "FCER1A" "PSMG4" "SSR4" "TRAPPC5" "PLAG1" "LSM10" "COA4" "NEUROG1" "MRPS11" "MRPS16" "FAM104B" "SATB1" "NDN" "RP11-181G12.2" "CNOT10" "ZNF662" "LSMD1" "CEP57L1" "SFXN4" "FAM162B" "DENND5A" "UBE2F" "PCDH9" "MROH7" "NDUFA12" "APOO" "PTRHD1" "RPS27L" "ADSSL1" "UBALD2" "ROR1" "NR2F2" "PSMD13" "ANKFY1" "ZNF529" "POLR1D" "TOR3A" "PPP1CC" "RGS9BP" "KRT10" "GPATCH8" "RPS19BP1" "CMC1" "MAGI2" "EXD3" "MORN2" "COL4A5" "COMMD6" "S100A16" "LINC00319" "RNFT1" "HN1" "C15orf61" "BLOC1S2" "LCOR" "XPNPEP3" "AC005154.6" "ACN9" "TECPR2" "TOMM7" "ADA" "ZNF252P" "NOP9" "CASP4" "METTL9" "ZNF398" "ZNF682" "MRPL21" "HTT" "COL4A6" "PHF2" "FAM118B" "FAM49A" "MRPL42" "UBL5" "CCDC69" "NCOA6" "MT-ND5" "ZNF521" "MT-ND3" "SELT" "OSTC" "TSEN15" "MT-ND4" "DCLRE1A" "CCDC167" "DMD" "SNORA63" "SNORD51" "TATDN3" "FAM229B" "INPP5B" "COL15A1" "ZBTB48" "ZNF783" "C12orf73" "FLJ00104" "SARNP" "DNAJC19" "HCG27" "SNORA12" "SNORD70" "U3" "SNORD91B" "U3" "SUPT4H1" "ANKRD39" "ATP6V0CP3" "NDUFS3" "ZNF337-AS1" "SLC23A3" "ARL16" "ZSWIM7" "COL28A1" "SLC35E2" "TENM3" "FAM19A5" "RPA3-AS1" "SNORD99" "PSMD4P1" "MIR503HG" "PRKAR2A-AS1" "CEP57L1P1" "LINC01167" "CETN4P" "LINC00378" 
"RAB6C-AS1" "TPI1P1" "LINC00354" "PCGEM1" "OST4" "MSX2P1" "EEF1DP3" "HSBP1" "TRAF3IP2-AS1" "MCTS1" "SEC1P" "DICER1-AS1" "CDKN2AIPNL" "KCND3-AS1" "FAM200B" "GOLGA2P5" "RPL13P5" "ADAMTS9-AS1" "LINC00882" "MRPL33" "STMP1" "NME1-NME2" "UBE2V1" "RASSF8-AS1" "SOCS2-AS1" "OIP5-AS1" "TMEM161B-AS1" "UBA6-AS1" "SMIM20" "ROPN1L-AS1" "SNORA70" "GS1-251I9.4" "IFNG-AS1" "AGAP2-AS1" "APOPT1" "CUX1" "LINC00592" "UBE2F-SCLY" "LINC00662" "KCNJ2-AS1" "GOLGA7B" "NCBP2-AS2" ) "agingSwitchSymbols" "[{\"function_name\": \"gene-pathway-annotation\", \"filters\": [{\"filter\": \"pathway\", \"value\": \"smpdb reactome\"},{\"filter\": \"include_prot\", \"value\": \"False\"}, {\"filter\": \"include_sm\", \"value\": \"True\"},{\"filter\": \"coding\", \"value\": \"False\"},{\"filter\": \"noncoding\", \"value\": \"True\"}, {\"filter\": \"biogrid\", \"value\": \"1\"}]}, {\"function_name\": \"gene-go-annotation\", \"filters\": [{\"filter\": \"namespace\", \"value\": \"biological_process cellular_component molecular_function\"}, {\"filter\": \"parents\", \"value\": \"0\"}, {\"filter\": \"protein\", \"value\": \"False\"}]}, {\"function_name\": \"include-rna\", \"filters\": [{\"filter\": \"coding\", \"value\": \"False\"},{\"filter\": \"noncoding\", \"value\": \"True\"},{\"filter\": \"protein\", \"value\": \"0\"}]},{\"function_name\": \"biogrid-interaction-annotation\", \"filters\": [{\"filter\": \"interaction\", \"value\": \"Genes\"}]}]")
Is this new failure with or without the `par-map`? If with, can you try without? Or rather, `par-map` should just be removed ... it costs more than it delivers ...
Is this the same list of genes as before, or different? Same annotation function as before, or different?
Actually the issue is with the condition at https://github.com/MOZI-AI/annotation-scheme/blob/cc9f0f952af3ee7fe07beb07c370346f62837fa8/annotation/gene-pathway.scm#L112
It should return an empty list when prot? is false.
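A minimal sketch of the intended behaviour, not the actual code at the linked line: the names pathway-proteins and find-pathway-proteins are hypothetical stand-ins, and only the prot? flag comes from the comment above.

```scheme
;; Hypothetical stand-in for whatever lookup the real code performs.
(define (find-pathway-proteins pathway)
  (list (string-append pathway "-protein")))

;; Return the proteins only when prot? is true; otherwise return an
;; empty list, so that callers can append the result unconditionally.
(define (pathway-proteins pathway prot?)
  (if prot?
      (find-pathway-proteins pathway)
      '()))
```

Returning '() rather than #f keeps the result safe to pass to append or append-map.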
The biogrid-interaction-annotation function is also not working. You can test it even with a single gene. I set the interaction argument to Genes.
Actually the issue is with the condition at
Ah, OK .. well ... umm... patch it then, don't wait for me to do it...
I’m just letting you know :). I will send a PR with the fixes
As an update, I was able to annotate 745 genes using a custom atomspace. And it took around 4 minutes.
However, now I am facing an issue trying to annotate the gene IGF1 using the sample dataset (tests/sample_dataset.scm).
After loading the sample dataset, I run the annotate-genes function as follows:
(annotate-genes (list "IGF1") "igf1" "[{\"function_name\": \"gene-pathway-annotation\", \"filters\": [{\"filter\": \"pathway\", \"value\": \"smpdb reactome\"},{\"filter\": \"include_prot\", \"value\": \"False\"}, {\"filter\": \"include_sm\", \"value\": \"True\"},{\"filter\": \"coding\", \"value\": \"False\"},{\"filter\": \"noncoding\", \"value\": \"True\"}, {\"filter\": \"biogrid\", \"value\": \"1\"},{\"filter\": \"namespace\", \"value\": \"biological_process cellular_component molecular_function\"}]}, {\"function_name\": \"gene-go-annotation\", \"filters\": [{\"filter\": \"namespace\", \"value\": \"biological_process cellular_component molecular_function\"}, {\"filter\": \"parents\", \"value\": \"0\"}, {\"filter\": \"protein\", \"value\": \"False\"}]}, {\"function_name\": \"include-rna\", \"filters\": [{\"filter\": \"coding\", \"value\": \"False\"},{\"filter\": \"noncoding\", \"value\": \"True\"},{\"filter\": \"protein\", \"value\": \"0\"}]},{\"function_name\": \"biogrid-interaction-annotation\", \"filters\": [{\"filter\": \"interaction\", \"value\": \"Genes\"}, {\"filter\": \"namespace\", \"value\": \"biological_process cellular_component molecular_function\"}]}]")
This fails with the following error:
In procedure cog-outgoing-set: Wrong type argument in position 1 (expecting opencog atom): #<Invalid handle>
I have attached the full stack trace. stacktrace.txt
P.S. I have rebuilt the Docker image from scratch so that the latest atomspace version is used for annotation.
Is this with par-map in main.scm, or with regular map?
@linas it is with par-map
Please read everything I wrote above about par-map. I wrote a lot about it. To be clear: I strongly advise against using par-map in main.scm.
OK, I got par-map to work with the following two patches. It does reduce the total elapsed time to obtain an answer, but it splurges on CPU time to do it -- it provides a 1.5x speedup in exchange for 2.6x more CPU time.
I thought that with the changes you made to the atomspace, the issue with par-map was fixed, no?
I made no changes at all to the atomspace; I strongly recommend against using par-map. It is expensive: it gives you a 1.5x elapsed-time speed-up for 2.6x more CPU usage -- that's rather inefficient. If you really want to do this anyway, to reduce elapsed time, then I'm mostly thinking that the locks belong in the annotation code...
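To illustrate what "locks in the annotation code" could look like, here is a minimal sketch, assuming the contended state is a shared cache. The cache, the lock, and annotate-gene are hypothetical names and do not come from annotation-scheme; only par-map, make-mutex, and with-mutex are real (ice-9 threads) facilities.

```scheme
(use-modules (ice-9 threads))

;; Hypothetical shared state that several worker threads touch.
(define cache (make-hash-table))
(define cache-lock (make-mutex))

;; Hypothetical per-gene worker. The whole lookup-or-compute step runs
;; under the mutex so that two threads never mutate the table at once.
(define (annotate-gene gene)
  (with-mutex cache-lock
    (or (hash-ref cache gene)
        (let ((result (string-append gene "-annotated")))
          (hash-set! cache gene result)
          result))))

(define genes (list "IGF1" "BRCA1" "TSPAN6"))

;; par-map returns results in input order, like map.
(define parallel-results (par-map annotate-gene genes))
(define serial-results (map annotate-gene genes))
```

Holding the lock for the whole computation serializes the workers, which is one plausible reason a locked par-map can burn more CPU time than it saves in elapsed time.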
Independent of this particular issue, there are two generic problems I'm seeing with guile. Maybe @rekado can impart words of wisdom?
par-map and par-for-each, when used with the atomspace, are always very inefficient, or worse. In the learn project, I was seeing a 1.5x speedup for 2 CPUs, a 1.2x speedup for 3 CPUs, a slight slowdown (maybe 0.9x) for 4, and a massive slowdown/deadlock for anything more. I don't understand why, because simple guile multi-thread experiments don't have this problem, and the multi-threaded atomspace doesn't have this problem. I'm guessing it has to do with the very high-frequency cross-overs from guile to C++ and back. Some kind of lock contention, there.
In this code base -- annotation-scheme -- I'm seeing 50% of CPU time in GC when running single-threaded. This seems like way too much. It might be due to all of the lists and append-maps, etc. (In the learn project, I was seeing 5% or 10% I think, not sure, and less than that in the benchmarks, but I was not really paying attention there.) I don't know why the GC is so high, and so don't know how to reduce it.
I only see the high GC use when annotating the 681 genes -- for the single IGF1 gene, I see this:
Elapsed: 75.208 secs. Rate: 55.85 gc/min %cpu-GC: 6.232% %cpu-use: 101.3%
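A figure like %cpu-GC can be approximated from plain Guile, as sketched below. This is not the tool that produced the line above; gc-percent and churn are hypothetical names, and the sketch assumes that the gc-time-taken entry of (gc-stats) is reported in the same internal time units as get-internal-run-time.

```scheme
;; Cumulative time spent in the collector, as reported by Guile.
(define (gc-time-taken)
  (assq-ref (gc-stats) 'gc-time-taken))

;; Run thunk and return the GC time as a percentage of the CPU time
;; consumed while it ran.
(define (gc-percent thunk)
  (let ((gc-before (gc-time-taken))
        (cpu-before (get-internal-run-time)))
    (thunk)
    (let ((gc-spent (- (gc-time-taken) gc-before))
          (cpu-spent (- (get-internal-run-time) cpu-before)))
      ;; Guard against a zero denominator for very short runs.
      (* 100.0 (/ gc-spent (max 1 cpu-spent))))))

;; Hypothetical workload: allocate many short-lived lists.
(define (churn)
  (do ((i 0 (+ i 1)))
      ((= i 200))
    (iota 10000)))

(define gc-share (gc-percent churn))
(display gc-share)
(newline)
```

Wrapping the annotate-genes call in a thunk would give a comparable number for the single-gene and many-gene runs.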
I'm guessing that, for the 681 genes, the code creates extremely long lists, and using append-map causes those lists to be thrashed. I'm thinking that map and append-map need to be replaced by fold... if so, this suggests another 2x speedup might be possible!
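The mechanical rewrite from append-map to fold could look like the sketch below. The per-gene function annotate-one is a hypothetical stand-in, and whether this actually reduces GC pressure in annotation-scheme is the guess stated above, not something the sketch demonstrates.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical per-gene annotation returning a small list of results.
(define (annotate-one gene)
  (list (string-append gene "-a") (string-append gene "-b")))

;; Current style: build one list per gene, then concatenate them all.
(define (annotate-with-append-map genes)
  (append-map annotate-one genes))

;; fold style: push each result onto a single accumulator and reverse
;; once at the end, so no intermediate list is re-copied.
(define (annotate-with-fold genes)
  (reverse
   (fold (lambda (gene acc)
           (fold cons acc (annotate-one gene)))
         '()
         genes)))
```

Both functions return the results in the same order, so one can be swapped for the other without changing callers.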
OK, so @Habush pull req #161 adds all the locks that you need. No changes to the atomspace needed.
Closing because I think #161 fixes this. Open new report if there are remaining issues.
I am trying to annotate BRCA1 genes, and when I run the annotate-genes function, it fails with the following error:
In procedure cog-outgoing-set: Wrong type argument in position 1 (expecting opencog atom): #<Invalid handle>
The error happens in the run-query function. It looks like the issue is due to the PubMed ID caching, and maybe it is referring to a deleted SetLink (I could be wrong though).
To reproduce the issue, load the datasets and run the annotate-genes function with the following arguments:
(annotate-genes (list "TSPAN6") "agingSymbols" "[{\"function_name\": \"gene-pathway-annotation\", \"filters\": [{\"filter\": \"pathway\", \"value\": \"smpdb reactome\"},{\"filter\": \"include_prot\", \"value\": \"True\"}, {\"filter\": \"include_sm\", \"value\": \"False\"},{\"filter\": \"coding\", \"value\": \"True\"},{\"filter\": \"noncoding\", \"value\": \"True\"}, {\"filter\": \"biogrid\", \"value\": \"0\"}]}, {\"function_name\": \"gene-go-annotation\", \"filters\": [{\"filter\": \"namespace\", \"value\": \"biological_process cellular_component molecular_function\"}, {\"filter\": \"parents\", \"value\": \"0\"}, {\"filter\": \"protein\", \"value\": \"True\"}]}, {\"function_name\": \"include-rna\", \"filters\": [{\"filter\": \"coding\", \"value\": \"True\"},{\"filter\": \"noncoding\", \"value\": \"True\"},{\"filter\": \"protein\", \"value\": \"1\"}]},{\"function_name\": \"biogrid-interaction-annotation\", \"filters\": [{\"filter\": \"interaction\", \"value\": \"Proteins\"}]}]")
@linas can you check this please?