Closed taijusti closed 9 years ago
after a day and a half of randomly moving stuff around and making minor structural changes, HLS still does not interpret the code properly. i'm gonna do a major restructuring on the loops of host(). fingers cross that with a different (simpler?) implementation of host, HLS will actually interpret it right
not too sure what is going on... the solution to SMO should be unique since its a convex optimization problem...
i tried to change the algorithm a little so that itll only stop when there are no kkt_violators. even then, the hw solution has about 10% error rate whereas the sw solution has a 5% error rate.
going to go back and try moving other things around without changing the algorithm itself
ok.. i think i have a solution that works based on what im seeing in the analysis tab. i forgot to check in my test that im running on zynq, so ill go in tomorrow to see if my solution works. fingers crossed
sigh... ok, ive transformed the code to this:
send(COMMAND_INIT_DATA, out);
for (i = 0; i < ELEMENTS; i++) {
send(data[i], out);
send(y[i], out);
}
callDevice(in, out);
send(COMMAND_GET_KKT, out);
callDevice(in, out);
recv(kkt_violators, in);
and you never see send(COMMAND_GET_KKT, out); execute in hw. my random guess is that the blocking receive is happening before the write, causing a deadlock
do you mean that recv(kkt_violators, in); is happening before the send(COMMAND_GET..) ? Did you try co-sim for the host ?
On Sun, May 31, 2015 at 4:04 PM, Justin notifications@github.com wrote:
sigh... ok, ive transformed the code to this:
send(COMMAND_INIT_DATA, out); for (i = 0; i < ELEMENTS; i++) { send(data[i], out); send(y[i], out); } callDevice(in, out);
send(COMMAND_GET_KKT, out); callDevice(in, out); recv(kkt_violators, in);
and you never see send(COMMAND_GET_KKT, out); execute in hw. my random guess is that the blocking receive is happening before the write, causing a deadlock
— Reply to this email directly or view it on GitHub https://github.com/taijusti/sleep_apnea/issues/21#issuecomment-107240529 .
do you mean that recv(kkt_violators, in); is happening before the send(COMMAND_GET..) ?
that's my current theory right now
Did you try co-sim for the host ?
no, i'll consider doing it.
lifting the kkt command above the init as an experiment... absolutely nothing runs now!
send(COMMAND_GET_KKT, out); callDevice(in, out); recv(kkt_violators, in);
send(COMMAND_INIT_DATA, out); for (i = 0; i < ELEMENTS; i++) { send(data[i], out); send(y[i], out); } callDevice(in, out);
confirmed that it is lifting the recv() above the send(). i did this by sending an arbitrary value to the host to get it past the recv. i then saw the send() come out of host. going to ask on xilinx forums how to get around this since i have no idea myself. i will work on cleanup while im waiting. after that, ill take a look at ethernet if there is nothing else.
known solution found/confirmed. applying fix to master branch.
wow great news, what is the fix then ?
On Mon, Jun 1, 2015 at 2:02 PM, Justin notifications@github.com wrote:
known solution found/confirmed. applying fix to master branch.
— Reply to this email directly or view it on GitHub https://github.com/taijusti/sleep_apnea/issues/21#issuecomment-107656056 .
I just saw the response on the forum. its annoying that you spent that much time on it, there should be proper documentation on such stuff.
it turns out its sort of documented under the protocol directive. it just wasn't obvious to me that's what i needed to do. they still need to add more documentation.
i should mention even though i know what the solution is, its still super painful -.-. have to define the order of operations cycle by cycle. also, vivado seems to move ap_waits() around... not too sure whats going on.
What do you mean by defining the order of operation cycle by cycle? I thought you will just add a protocol directive to blocks that perform write then read and insert a wait between them, right ? so you would only do minimal changes. Vivado should not move ap_wait if it is in a protocol block right ? that is what the user guide says.
unfortunately it does seem to move the waits around. maybe im doing something wrong. ive already tried putting only one wait between all writes and reads and it does nothing.
I am not sure if it makes any difference but are using ap_utils.h or systemc.h. In the user guide they say the following. • In C by using an ap_wait() statement (include ap_utils.h) • In C++ and SystemC designs by using the wait() statement (include systemc.h).
i was using ap_utils. i just tried systemc.h and it gives many compile time libraries within the h files. The first of which is:
@E [HLS-70] Compilation errors found: In file included from host/host.cpp:12: In file included from /opt/Xilinx/Vivado_HLS/2014.4/lnx64/tools/systemc/include/systemc.h:244: In file included from /opt/Xilinx/Vivado_HLS/2014.4/lnx64/tools/systemc/include/systemc:72: In file included from /opt/Xilinx/Vivado_HLS/2014.4/lnx64/tools/systemc/include/sysc/kernel/sc_module.h:37: In file included from /opt/Xilinx/Vivado_HLS/2014.4/lnx64/tools/systemc/include/sysc/kernel/sc_wait.h:32: In file included from /opt/Xilinx/Vivado_HLS/2014.4/lnx64/tools/systemc/include/sysc/kernel/sc_simcontext.h:34: /opt/Xilinx/Vivado_HLS/2014.4/lnx64/tools/systemc/include/sysc/utils/sc_hash.h:40:19: error: declaration of anonymous class must be a definition template<class K, class C> //template class
ap_fifo is inded injecting waits, its just that vivado doesn't seem to respect them and moves them around. i think its because it only works for primitive accesses, not functions.
ok... out of desperation, i've asked the resident hls expert in our group. she has a solution i think will work that i'm going to try to implement now. it's completely different from this protocol/ap_wait thing.
ok... i have a solution which looks good on the analysis tab (which i could not say about the other solutions). i will try it first thing tomorrow. fingers crossed guys...
ok, this time the solution works for sure! just going through the normal process of making sure i implemented it right everywhere. hopefully there are no other bugs.
great news:), I am curious to know what was the "insider" fix you got ?
On Tue, Jun 2, 2015 at 11:43 AM, Justin notifications@github.com wrote:
ok, this time the solution works for sure! just going through the normal process of making sure i implemented it right everywhere. hopefully there are no other bugs.
— Reply to this email directly or view it on GitHub https://github.com/taijusti/sleep_apnea/issues/21#issuecomment-107995308 .
she told me that vivado hls isn't as thoroughly tested as you might expect. directives do not necessarily work with each other, or at all for that matter. one of the things she told me to do is manually inline everything, which is what resolved the problem. by manually inline, i mean take the contents of all send/recv functions and put them into host. so the code is sort of a mess right now.
i wanna merge to master right now. may i do this? the host/device are definitely communicating right now. i can see it on the arm core. weird thing is it is either taking a long time or not converging (its been running for 7 minutes now). it still passes host_tb.cpp so we're no worse off than before if i do merge. its just that the code is kinda messy. it would be helpful if you could skim through if to see if you think its acceptable to merge to master.
ok.. so i think there's something wrong with the device after all. either that or im reading from it wrong or something. after initialize it, it should be e[i] = y[i] ? -1 : 1. but right now im seeing e[i] = 1. HLS is once again synthesizing something wrong cause host_tb still passes -.-. gah!!!!
this bug sounds easier to debug now though.
oh wait.. i think i know the problem. it's that bug that ibrahim found in common.cpp. i forgot to re-synth device since it uses that.
I think the device is correct, it passes co-sim.
On Tue, Jun 2, 2015 at 9:36 PM, Justin notifications@github.com wrote:
oh wait.. i think i know the problem. it's that bug that ibrahim found in common.cpp. i forgot to re-synth device since it uses that.
— Reply to this email directly or view it on GitHub https://github.com/taijusti/sleep_apnea/issues/21#issuecomment-108153640 .
only with your change correct changing the uint32_t to int32_t in send() for floats in common.cpp?
yes
On Tue, Jun 2, 2015 at 9:51 PM, Justin notifications@github.com wrote:
only your change correct changing the uint32_t to int32_t in send() for floats in common.cpp?
— Reply to this email directly or view it on GitHub https://github.com/taijusti/sleep_apnea/issues/21#issuecomment-108158515 .
i decided to re-implement my stuff. something i did on justin_branch makes it not pass c-sim with random vectors (note that vanilla master does). the fix you mentioned isnt on master so i just reapplied it locally and am compiling right now. i will verify tomorrow morning.
okay, should I add my fix to the master branch ? I guess I should right.
On Tue, Jun 2, 2015 at 10:04 PM, Justin notifications@github.com wrote:
i decided to re-implement my stuff. something i did on justin_branch makes it not pass c-sim with random vectors (note that vanilla master does). the fix you mentioned isnt on master so i just reapplied it locally and am compiling right now. i will verify tomorrow morning.
— Reply to this email directly or view it on GitHub https://github.com/taijusti/sleep_apnea/issues/21#issuecomment-108159875 .
ive added it to my local thing to try and get the hw working. if you'd like i can commit what i have. it's no worse than what is on github right now (i.e. host_tb passes)
going to port over host.cpp and all related files to ARM and snoop the FIFOs to see if they match up with what HW is doing.
this is confirmed to work on hardware, along with the device. full solution works for single device case!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
it's currently hanging on an if statement -.- gonna try to restructure the host code a little to reduce the control flow. maybe hls wont get confused then