Closed YinghuiShao closed 3 years ago
Hi, thanks for raising this issue and bringing it to our attention.
For us to help you, we need some more details about your system setup, and we need to have a minimal reproducible example. I was not able to reproduce this issue in my environment yet. Can you let us know what is your host system, compiler/OS version, and how you built QEMU and Qflex, so we can make progress on narrowing down the source of the problem? Also, was this error generated when trying to run the image we provide with our tutorial?
Regarding your second question, I am not totally clear about the order of tools you used. The correct way of building the system is to first install the dependencies, then build the PTH library, and then build QEMU & QFlex. Otherwise, QEMU would not actually build. Can you let us know the process you went through to actually build the tools successfully and then run them?
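To make the build order above explicit, here is a minimal sketch. It assumes that `build_pth.sh` and `build_qemu.sh` (the script names mentioned in this thread) both live in the QEMU directory; the function name `build_in_order` and the example path are hypothetical.

```shell
# Sketch only: run the PTH build first and build QEMU only if PTH succeeded.
# Assumption: both scripts sit in the directory passed as the first argument.
build_in_order() {
    qemu_dir="$1"
    (
        cd "$qemu_dir" || exit 1
        if ! ./build_pth.sh; then
            echo "PTH build failed; skipping QEMU build" >&2
            exit 1
        fi
        ./build_qemu.sh -timing
    )
}

# Example call (placeholder path, adjust to your checkout):
# build_in_order "$HOME/qflex/qemu"
```

Running the two steps in a subshell keeps the caller's working directory unchanged, and the early exit makes a failed PTH build visible instead of letting the QEMU step fail later for a less obvious reason.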
I want to run Matrix Multiplication with QFlex to test the deterministic feature.
My host system is "x86_64 GNU/Linux" and my Linux distribution is "Ubuntu 18.04.2 LTS".
This error was generated when trying to run the image you provide with the tutorial. I strictly followed the instructions at https://qflex.epfl.ch/download/ under "Example: Running Matrix Multiplication with QFlex – Timing (KnottyKraken)".
Downloading QFlex and setting up the environment
• Clone the main QFlex repository along with all the submodules in the $QFLEX directory.
$ git clone https://github.com/parsa-epfl/qflex --recurse-submodules
After this step, most of the QFlex code was cloned, except for "qemu/dtc" and "qemu/roms", which failed with a "connection timed out" error.
• Go to qemu directory, install required dependencies and build QEMU with the -timing option (README).
• Go to flexus directory, install required dependencies and build the KnottyKraken simulator (README).
o Build KnottyKraken using cmake -DSIMULATOR=KnottyKraken . && make -j.
• Go to images directory, checkout the matmul branch and setup the image (README).
I ran "$home/qflex/qemu/scripts/snap-manager.py --qemu-img-cmd-path $home/qflex/qemu update $home/qflex/images/ubuntu16/ubuntu.qcow2" (img_cmd) and an error occurred. So I first built QEMU in emulation mode, then ran img_cmd, and finally rebuilt QEMU in timing mode.
• Go to scripts/captain directory ($CAPTAIN), and setup the config parameters (README).
o Use simulation_type=timing along with the correct flexus_path and flexus_timing_path.
o Use icount=on in config/system.ini.
o Provide the correct user_postload path. An example file is given at config/user_postload.
o Update the paths in config/user_postload corresponding to “To be updated” according to your setup.
• Run the captain script to start simulation.
o Create a $QFLEX/run directory to contain all the files produced during the run and cd $QFLEX/run.
o Create an output directory (e.g. $QFLEX/run/output) to store the logs from the simulation.
o Use echo 1 > $QFLEX/run/preload_system_width to provide the system width to KnottyKraken. (1 is the number of cores in matmul).
o Run captain using $CAPTAIN/captain $CAPTAIN/config/system.ini -o $QFLEX/run/output.
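For reference, the run-directory steps quoted from the tutorial can be sketched as a small helper. The paths and the `captain` invocation are taken from the tutorial text above; the function name `prepare_run_dir` is hypothetical, and the captain call is left commented out because it needs a fully built QFlex tree.

```shell
# Sketch only: create the run and output directories and write the system width.
# $1 is the root of the QFlex checkout ($QFLEX in the tutorial).
# $2 is the system width; the tutorial uses 1 for matmul.
prepare_run_dir() {
    qflex="$1"
    cores="${2:-1}"
    mkdir -p "$qflex/run/output" || return 1
    echo "$cores" > "$qflex/run/preload_system_width"
}

# Example (requires QFLEX and CAPTAIN to be set as in the tutorial):
# prepare_run_dir "$QFLEX" 1
# cd "$QFLEX/run" && "$CAPTAIN/captain" "$CAPTAIN/config/system.ini" -o "$QFLEX/run/output"
```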
Thanks for the reply - I see that you mainly followed the identical commands from the online tutorial. There are a few things to mention here:
libpth.so. If you were not able to build PTH successfully, then naturally QEMU will not build. Also, it seems like you had errors in cloning some of the submodules due to network connections timing out. I suggest first ensuring that everything is properly checked out and installed before going on to the next step.
I tried to replicate your segmentation fault error on a clean Ubuntu 18.04 environment, but I'm unable to reproduce the problem. Also, do I understand correctly that you first had a segmentation fault, but then after re-building PTH, there is now no error and you are able to run the simulator?
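One way to check that the submodules were fully checked out is to look for directories that are absent or empty. This is a rough sketch under that assumption; the function name `missing_submodules` is hypothetical, and the two paths in the example are the ones reported as failing earlier in this thread.

```shell
# Sketch only: report submodule directories that are absent or empty.
# $1 is the repository root; remaining arguments are submodule paths
# relative to that root. Returns non-zero if any of them look missing.
missing_submodules() {
    root="$1"
    shift
    missing=0
    for sub in "$@"; do
        if [ ! -d "$root/$sub" ] || [ -z "$(ls -A "$root/$sub" 2>/dev/null)" ]; then
            echo "missing or empty: $sub"
            missing=1
        fi
    done
    return "$missing"
}

# Example (placeholder path): re-fetch submodules if any look incomplete.
# missing_submodules "$HOME/qflex" qemu/dtc qemu/roms ||
#     git -C "$HOME/qflex" submodule update --init --recursive
```

An empty directory is only a heuristic for a failed clone, so a clean result here does not prove that every submodule is at the expected commit.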
I suggest we narrow this problem down as much as possible, to find a minimal reproducible example. Perhaps we can do the following:
/path/to/qemu/pth and running make test.
We can then investigate more afterwards. Thanks!
Hi @YinghuiShao, I see that you have edited the original issue with the following information. Please add new info as further comments below because then I can follow the discussion more clearly.
================== THREAD CONTEXT SWITCH ===========================================
23283:pth_sched.c:0320: Finished switch back to pth_sched stack 0x555556755350, size 65536, FROM stack 0x0, size 0
23283:pth_sched.c:0325: pth_scheduler: cameback from thread 0x5555568fdda0 ("unknown")
23283:pth_sched.c:0334: pth_scheduler: thread "unknown" ran 0.160900
This debug information indicates that the PTH scheduler is switching into its scheduler thread, from an unknown/unallocated stack (you can see the stack ptr is 0x0). We still don't have enough information to reproduce this problem. It is possible that this happens when the threading system is initialized for the first time, and thus it switches to the scheduler stack from the currently running hardware thread, which will not have allocated its stack from PTH. Can you tell me exactly your sequence of commands to gather this output behaviour, so I can attempt to reproduce the problem?
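To find every such switch in a longer debug log, a simple filter like the following may help. It assumes the log lines have the same format as the excerpt above; the function name `null_stack_switches` and the log file name are hypothetical.

```shell
# Sketch only: print PTH debug lines that report a switch FROM a null stack
# pointer. Reads the files given as arguments, or standard input if none.
null_stack_switches() {
    grep -F 'FROM stack 0x0,' "$@"
}

# Example (placeholder file name):
# null_stack_switches pth_debug.log
```

The trailing comma in the pattern keeps the filter from matching longer addresses that merely start with 0x0.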
If you have solved this in your own way, as you indicated in #27, then we would really appreciate you opening a pull request in the PTH repository with an explanation of the problem, how to reproduce it, and the solution. We can then consider accepting it to our repo.
Cheers.
The PTH stack overflow problem has been handled by increasing the stack size (pth_attr.c, line 92, in the pth_attr_init function).
Hi @YinghuiShao, I am re-opening this issue as I recently experienced a PTH stack overflow when running a simulation on a different platform than the one I previously used. I have a fix in the pipeline which I am testing now.
I'm closing this since it was fixed here: https://github.com/parsa-epfl/qemu/pull/58
When I ran QFlex in timing mode, a segmentation fault occurred. I tried debugging with GDB, but it did not work. How can I solve this problem? The run log is below.
I compiled PTH in debug mode, and the error is below.
Additionally, build_qemu.sh calls build_pth.sh to generate libpth.so. I ran ./build_qemu.sh -timing, but an error occurred. So I first ran ./build_pth.sh, then ran ./build_qemu.sh -timing, and there was no error. Could the incorrect order have caused my first error, the segmentation fault?