Open ghost opened 2 years ago
According to this page: https://www.gnu.org/software/guile/manual/html_node/Compilation.html, the *.go files contain CPU-architecture-dependent code. There is a specific --target=target flag on the guild compiler to specify the target architecture. That means, when building guile, you will need to set it up correctly for cross-compilation. I'm sure you did this when compiling the C files, but I am guessing the makefiles did not set the target correctly for the .scm->.go files. (But that is a guess.)
If/when I fix #2945 this might also present cross-compilation challenges. Not sure what to do about this ...
I'm also thinking that the __aeabi_idiv0 bug seen earlier is a side-effect of the .go files being architecture-dependent. That is, the .go files contain a kind of RTL (it's either GNU Lightning or a derivative of that) and that RTL ("register transfer language" or "bytecode") is executed on arm7 by calling tiny little arm7 instruction stubs such as __aeabi_idiv0 ... so this again suggests the *.go files need to be recompiled for arm7.
The above is just an educated guess, though. I could be wrong.
I just launch a-jsb.com for running javascript in sandbox with atomspace.
Wow. Well, that is unexpected! It looks like the execSCM call worked, but I guess that this is an x86 version, and not arm7? I'm still very eager to get the arm7 issues figured out and fixed.
I compiled datomspace-tester.apk with more logs. Here are the files:
I am stuck at the following error:
At scm_load_startup_files ()
scm_c_primitive_load_path ("ice-9/boot-9");
...
fprintf(fh_vm, "scm_call_n #26\n");
fflush(fh_vm);
ret = vm_engines[vp->engine](thread, vp, &registers, resume);
...
vp->resumable_prompt_cookie = prev_cookie;
fprintf(fh_vm, "scm_call_n #28\n");
fflush(fh_vm);
It repeats "scm_call_n #26", then "scm_call_n #28", then "scm_call_n #26" again several times. After that, it stopped.
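The flush after every write is what makes this kind of trace usable across a crash: the last line in the file is the last step that completed. A minimal sketch of the same pattern as a reusable helper follows; trace_step is a hypothetical name for illustration and is not part of Guile.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of Guile): write one "<where> #<step>" line
   and flush immediately, so the line is on disk even if the process crashes
   right afterwards. Returns 0 on success, -1 on failure. */
int trace_step(FILE *fh, const char *where, int step)
{
    assert(where != NULL);
    if (fh == NULL)
        return -1;
    if (fprintf(fh, "%s #%d\n", where, step) < 0)
        return -1;
    return fflush(fh) == 0 ? 0 : -1;
}
```

Calling trace_step(fh_vm, "scm_call_n", 26) writes the same line as the fprintf/fflush pair above.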
Yeah, that's going to be a hard way to debug. Poking through that stuff is like .. debugging assembly code. And anyway, I doubt that is where the bug is. Based on several of your tombstone files, the garbage collector was accessing bad memory, and so the question is "why is it doing that?" So, some background:
Guile calls GC_malloc() to get more memory. When the GC runs, it searches for pointers in all of the stacks and in any malloced RAM it knows about. It is not supposed to search outside of these boundaries. Yet, clearly, this is happening: in the first tombstone, it accessed memory about 300 bytes away from valid RAM, and in the second tombstone, only about 8K away. These offsets are tiny: both are less than 16 bits away from a valid address. I mean, out of a giant 4GB address space, it didn't access some "random" address, it accessed something really close by.
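To make the "boundaries" idea concrete, here is a rough sketch of a conservative scan. It is not the actual collector code; region_t, in_region and count_candidates are hypothetical names, and a real collector tracks many regions, not one. Every word in a root area is treated as a possible pointer, and only values that fall inside a known region are followed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one block of memory the collector knows about:
   the half-open address range [lo, hi). */
typedef struct {
    uintptr_t lo;
    uintptr_t hi;
} region_t;

/* Returns 1 if addr lies inside the region, 0 otherwise. */
int in_region(const region_t *r, uintptr_t addr)
{
    return addr >= r->lo && addr < r->hi;
}

/* Conservative scan: treat each of the n words as a possible pointer and
   count the ones whose value falls inside the heap region. Anything outside
   the region, even by a few hundred bytes, is not counted. */
size_t count_candidates(const uintptr_t *words, size_t n, const region_t *heap)
{
    size_t hits = 0;
    assert(heap->lo <= heap->hi);
    for (size_t i = 0; i < n; i++) {
        if (in_region(heap, words[i]))
            hits++;
    }
    return hits;
}
```

In this sketch, an address 300 bytes past hi is rejected by in_region. In the tombstones, an address like that was dereferenced anyway, which is why the boundaries or the offsets look wrong.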
This less-than-16-bit mistake suggests to me that guile is using a 16-bit short for some offset. I am guessing that, due to architecture confusion, this offset is being added instead of subtracted. How could this happen? Here are my guesses:
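The arithmetic behind that guess can be sketched as follows. The function names and the base address are made up for illustration and nothing here is taken from the guile sources or the tombstones. It only shows that flipping the sign of a 16-bit offset lands at most 64K away from the intended address.

```c
#include <assert.h>
#include <stdint.h>

/* Intended behaviour: apply a signed 16-bit offset to a 32-bit address.
   The offset is sign-extended; the addition wraps modulo 2^32. */
uint32_t apply_offset(uint32_t base, int16_t off)
{
    return base + (uint32_t)(int32_t)off;
}

/* Hypothetical bug: the same offset applied with the wrong sign
   (subtracted where it should be added, or vice versa). */
uint32_t apply_offset_flipped(uint32_t base, int16_t off)
{
    return base - (uint32_t)(int32_t)off;
}

/* Distance in bytes between the intended and the buggy address. */
uint32_t miss_distance(uint32_t base, int16_t off)
{
    uint32_t d  = apply_offset_flipped(base, off) - apply_offset(base, off);
    uint32_t nd = (uint32_t)0 - d;          /* the same gap, measured the other way */
    uint32_t miss = d < nd ? d : nd;
    assert(miss <= 65536u);                 /* twice the largest 16-bit magnitude */
    return miss;
}
```

For example, miss_distance(0x40000000u, -150) is 300 and miss_distance(0x40000000u, -4096) is 8192, the same order of magnitude as the two tombstones.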
$ guild disassemble ./srfi/srfi-1.go
44 (mov 1 7) at srfi/srfi-1.scm:830:11
45 (handle-interrupts)
46 (call 7 2)
48 (receive 4 7 9)
50 (immediate-tag=? 4 3839 4) ;; false? at srfi/srfi-1.scm:828:4
52 (jne 6) ;; -> L3
53 (scm-ref/immediate 5 8 1) at srfi/srfi-1.scm:837:17
54 (mov 4 5) at srfi/srfi-1.scm:837:11
55 (mov 5 8)
The mov and jne and call are translated into arm7 pseudo-assembly: they are calls to functions such as __aeabi_movi and __aeabi_jne and whatever: these are very short subroutines in the arm7 libc.so that are just wrappers for one or two arm7 assembly instructions. It is possible that maybe this translation is incorrect.
I asked the guile gurus about arm7 on IRC. They said it works fine on Android. They said "just install guix, you'll see" (guix is a guile linux distro.) So, here's how we can check this:
A. Install a terminal emulator on the phone
B. Run the guile shell on the phone, from the terminal emulator. It's in /sdcard1/something/bin/guile
C. At the guile prompt, run some scheme commands:
(+ 2 2)
(display "hello world\n")
(gc)
(gc-stats)
The (gc) call forces GC to run, and (gc-stats) prints some statistics. All sizes are in bytes, times are in nanoseconds, something like that. All of this should work. If this does NOT work ... then ... let me know.
I could not do this myself, because running /sdcard1/something/bin/guile complained that it was unable to find libsomething.so and so the whole C shared library environment needs to be set up.
If the above does work, then try
(use-modules (opencog))
(Concept "foo")
(gc)
(gc-stats)
If that works, then ???
I wrote the above before reading through your files. I'll read your files shortly.
again several times. After that, it stopped.
Did it hang, or did it crash? If it hangs, did you look at the CPU usage? Is the CPU usage 100% or 0%? If it's 100%, then it is probably trying to compile ice-9/boot-9 which could take minutes or hours .. or days?
If it's hung, but there is no CPU usage, then .. ugh. We'd have to use gdb. But first, please check everything I mentioned earlier.
SchemeEval runs smoothly on x64 and i386 but it crashes on armv7-a.
Steps to reproduce bug
Download datomspace-tester.apk
Install datomspace-tester.apk (Do not run!)
Go to Settings -> Apps -> dAtomSpace Tester. Set Storage permission.
Run dAtomSpace Tester
The app runs for about 30 seconds, then it crashes. There is a tombstone file.
View results in datomspace-test.txt file in Download folder
Source code
dAtomSpace Tester
dAtomSpace
SchemeEval.java
com_cogroid_atomspace_SchemeEval.h
com_cogroid_atomspace_SchemeEval.cc
Tester.java