coduin / epiphany-bsp

BSP implementation for the Parallella; the world's smallest supercomputer
https://jwbuurlage.github.io/epiphany-bsp/
GNU General Public License v3.0

Hello World example freezes #31

Closed 222464 closed 8 years ago

222464 commented 8 years ago

Hello,

I tried the hello world example, but it freezes and does nothing as soon as the bsp code is added. I added a print before the bsp code, but it never prints. If I remove the bsp code, the print works.

Any ideas?

jwbuurlage commented 8 years ago

Hi,

That is very strange. Are you changing the host code or the kernel code? What Parallella board version do you have? Do you have the most recent ESDK version installed?

222464 commented 8 years ago

Hello,

I have the latest version as of a week ago. ESDK version is 2015.1 (I also tried the other one that came on the board).

jwbuurlage commented 8 years ago

Do you have the 64-core version of the Parallella board by chance? Note that we do not support it at the moment because we do not have access to one. I will make this more explicit in the documentation.

Otherwise, when you run ./bin/hello and the host program hangs, after which bsp function does it hang? Let us try to narrow this down.

Unfortunately I cannot replicate the behaviour for now. We recently tested the library on two fresh Parallella boards with the latest ESDK version and experienced no difficulties.

222464 commented 8 years ago

I am using the 16-core version. It doesn't hang after any particular function. It somehow hangs before main is executed, but only if bsp functions are present in the code. If I comment out the bsp functions, the rest of main runs (the code before and after the bsp calls). It's really strange.

jwbuurlage commented 8 years ago

Okay, that is very strange indeed. Some things we can try:

jwbuurlage commented 8 years ago

Have you had any luck so far? I am curious to know what causes this. Thanks!

222464 commented 8 years ago

I decided to revisit my Parallella, and am still running into this issue. I narrowed it down to the bsp_begin(16); call. It never returns; it just freezes the program.

222464 commented 8 years ago

I should add that the included demos freeze as well!

Tombana commented 8 years ago

Hi,

Do you mean the bsp_begin(16); call on the host program or on the epiphany program?

You can also try to compile in debug mode:

If you then try to run the examples they should produce more debug output.

222464 commented 8 years ago

Here is what I got when running hello:

(BSP) INFO: Making a workgroup of size 4 x 4
(BSP) INFO: Loading: /home/parallella/libraries/bsp/epiphany-bsp/examples/bin/hello/e_hello.srec

Then it freezes.

I am referring to the bsp_begin in the host program.

Tombana commented 8 years ago

Hi 222464,

Unfortunately we can't find the issue at the moment. Do you have any similar issues with other libraries for the Parallella?

We would be very grateful if you could help us debug the issue by debugging the bsp_begin function. If you check out the develop branch (git checkout develop) and then look in src/host_bsp.c at lines 122-150, you will find the "(BSP) INFO: Loading: ....srec" debug line that your program printed. Could you try to add some more printf calls there to help us localize the problem? For example, does it reach ebsp_malloc_init()? After adding the printf calls, you can recompile the library by running make -B in the root directory, and then run make -B in the examples directory to rebuild the examples.
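A minimal sketch of the printf-bracketing idea described above, not the library's actual code. The TRACE macro and the two step functions are hypothetical placeholders standing in for whatever calls sit around the "Loading" line in bsp_begin. The sketch writes each checkpoint to stderr and flushes it immediately, on the assumption that a call which never returns could otherwise leave buffered output unprinted.

```c
#include <stdio.h>

/* Hypothetical helper: print a checkpoint and flush right away, so the
 * message is visible even if the very next call never returns. */
#define TRACE(msg)                                                        \
    do {                                                                  \
        fprintf(stderr, "(TRACE) %s:%d: %s\n", __FILE__, __LINE__, (msg)); \
        fflush(stderr);                                                   \
    } while (0)

/* Hypothetical stand-ins for the steps inside bsp_begin.
 * They are NOT the real library or ESDK functions. */
static int step_load_program(void) { return 1; }
static int step_init_memory(void)  { return 1; }

/* Runs the steps in order and returns how many completed.
 * The last TRACE line printed tells you which step hung or failed. */
int traced_begin(void)
{
    int done = 0;

    TRACE("before loading the program");
    if (!step_load_program())
        return done;
    done++;

    TRACE("program loaded, before memory init");
    if (!step_init_memory())
        return done;
    done++;

    TRACE("memory init finished");
    return done;
}
```

Calling traced_begin() from a small main and watching stderr shows the last checkpoint reached. In the real library, the same pattern would go around the existing calls in src/host_bsp.c.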

222464 commented 8 years ago

I have narrowed the problem down to the e_load_group call in bsp_begin (host).

Tombana commented 8 years ago

Thanks for your help. The e_load_group function is part of the ESDK, so that suggests that the problem may lie somewhere else. Have you tried running the samples that come with the ESDK?

As for further debugging: I believe that the ESDK gives more debug output if you call the following functions before e_load_group:

e_set_host_verbosity(H_D4)
e_set_loader_verbosity(H_D4)

You could try to add those two calls to bsp_begin, somewhere before e_load_group, and see what happens.

222464 commented 8 years ago

I think my Parallella is busted. I can't run any of the examples.

I did a fresh install, and this is what I get when running the hello-world example in ~/epiphany-examples/apps:

0: Message from eCore 0x8ca ( 3, 2): ""
1: Message from eCore 0x84b ( 1, 3): ""
2: Message from eCore 0x84b ( 1, 3): ""
3: Message from eCore 0x888 ( 2, 0): ""

and then it crashes the whole Parallella.

So, it is probably not the fault of the BSP library :)

jwbuurlage commented 8 years ago

Hi, I am sorry that the problem seems to be with your hardware. Because it seems that the problem is not with the EBSP library, I will close this issue for now.