totalspectrum / flexprop

Simple GUI for Propeller development (both P1 and P2)

Compiled SPIN2 code, with infinite loop, runs once (correctly) then stops #77

Closed GitDave99 closed 1 year ago

GitDave99 commented 1 year ago

The compiled listing shows a jump from the end of the compiled SPIN2 code to the beginning as "jmp LR_0001".

Naturally the code runs correctly in the PropTool environment.

By placing writes to pin 5 for tracing the code timing and progress through various sections, I determined that the code is not hanging up in any part of the code but the end, where it should loop to the top.

Which file is appropriate for illustrating the issue? SPIN2? Listing? I'm new at FlexProp, but I've been coding in SPIN1 and PASM1 since 2009, SPIN2 and PASM2 since 2021, other machine code for decades. This issue has me stopped!

Thanks for your help, Dave.

Attachments: FLEXPROP ISSUE 1.txt, CODE FLOW BLOCK DIAGRAM, FLEXPROP ISSUE 2

totalspectrum commented 1 year ago

The code you posted is only the main COG, so it's difficult to analyze for sure, but my guess would be that there's some kind of race condition where when compiled the code in one of the COGs is finishing too early and hence the main COG is hanging in a WAITATN() instead of jumping back to the top. The other possibility would be a stack overflow; again, I can't see how the other COGs are being started, but it's possible that the flexprop compiled code needs a bigger stack than the proptool compiled code. So those are the two things I would suggest investigating.

totalspectrum commented 1 year ago

If you can create a .zip file using FlexProp's Commands > Create Zip Archive menu item, that's often a convenient way of posting all of the source files for a project. Of course, that does depend on being able to distribute those files, which sometimes is a problem for commercial projects.

GitDave99 commented 1 year ago

Okay, about the zip feature.

I took out the cog instantiation and the cogatn/wait instructions. At that point it did loop. Then I included the ADC cog without the atn/wait and it did loop. Then I included the atn/wait, still looped.

Then I included the DBUG cog load, without the atn/wait.

Failed to loop. Ok, says I, let's go back to the beginning, no cogs loaded, no atn/waits.

FAILED TO LOOP! That's when my wife grabbed me to go out to eat.

I didn't see any place in the docs for the need to make stack allocations. That could be...

What do you recommend?

Thanks!

Dave

GitDave99 commented 1 year ago

It's my belief that the compiler should "know" how much stack it requires.

Looking again at the docs, "temporary stack allocation" is mentioned with caveats but no example of proper use.

The original PropTool code needs no stack.

If the compiled bare SPIN2 code, without the cogatn/wait instructions and no other cogs loaded, fails to loop (but does run ONCE properly) what can be wrong?

I'll zip the files and send.

GitDave99 commented 1 year ago

Hi, Eric

Using "no optimization" I tried the process of addition again.

This time I found that the DBUG routine lacked a cogatn instruction to tell cog 0 it was done, thus stalling at that point.

Funny thing was, earlier, with the waitatn removed, it still stalled, but after a bit of code that cleared arrays to zero, not at the removed waitatn.

Seems the problem is resolved.

Thanks for your input! Dave

GitDave99 commented 1 year ago

Fails to loop using default optimization, does loop with no optimization.

Compilers...

GitDave99 commented 1 year ago

Hi Eric,

There were two issues causing the lack of looping:

1) If optimization of any kind was selected, the compiled code (even with all cogatn/waits removed) would not loop.
2) Selecting NO optimization would loop with no cogwait/atn.

I found that there was a missing cogatn in the DBUG assy code, my bad!

Now, that WORKED in the PropTool environment, so evidently the interpreted code includes a timeout for that instruction. The FlexSpin compiler evidently does not include that timeout, so code that worked in PropTool stalled there in FlexSpin.

So, why does the use of optimization of any kind cause my code to fail looping?

Attached find the entire code files.

Do you want to include the time-out for waitatn? Or make it an option? Is there a way to determine why my code fails under optimization options? Higher optimization appears to eliminate the space for the arrays.... ???

Best regards, Dave

totalspectrum commented 1 year ago

WAITATN isn't documented to have any timeout. My guess is that there's a timing issue and the COG is sending the ATN signal too early or late in the flexprop version. The default Spin2 interpreter used by PropTool is quite a bit slower than the FlexProp compiled code, so the timing of the Spin code will be very different in the two environments.

I don't see the complete source code (did you forget to attach it?), so I can't be sure, but timing issues are a common cause of incompatibilities between the compilers. You could try adding some delays in the Spin code to see if this changes the behavior.

In general WAITATN is a bit of a crude method for synchronization, particularly if there are multiple COGs involved. Have you considered using some shared memory as a semaphore or something similar?
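To illustrate why a single attention signal can be crude when several COGs are involved, here is a minimal host-side sketch in Python. It is only a model, not Spin2 or PASM2: `AtnLatch` and `DoneFlags` are hypothetical names, and the model assumes the attention flag behaves as one latched bit that any sender can set and that is cleared when observed.

```python
class AtnLatch:
    """Model of one attention flag: set by any sender, cleared when observed."""

    def __init__(self):
        self.flag = False

    def signal(self):
        # Stands in for a COGATN aimed at the waiting cog.
        self.flag = True

    def poll(self):
        # Stands in for POLLATN: report the flag and clear it.
        seen, self.flag = self.flag, False
        return seen


class DoneFlags:
    """Model of shared memory: one 'done' flag per worker (hypothetical layout)."""

    def __init__(self, workers):
        self.done = {w: False for w in workers}

    def signal(self, worker):
        self.done[worker] = True

    def all_done(self):
        return all(self.done.values())

    def missing(self):
        return [w for w, d in self.done.items() if not d]


# Two workers signal before the main loop looks: the latch sees one event.
latch = AtnLatch()
latch.signal()  # "adc" worker finishes
latch.signal()  # "dbug" worker finishes
wakeups = 0
while latch.poll():
    wakeups += 1
print(wakeups)  # 1

# With per-worker flags, a worker that never reports is identifiable.
flags = DoneFlags(["adc", "dbug"])
flags.signal("adc")
print(flags.all_done(), flags.missing())  # False ['dbug']
```

In this model the latch cannot say which worker signalled or how many did, whereas the per-worker flags show directly that "dbug" never reported, which is the kind of omission discussed earlier in this thread.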

GitDave99 commented 1 year ago

Hi Eric, I thought you would do a compile on it to see if you got the same issue with using optimization.

My first take on it used pins as semaphores. That worked on PROPTOOL environment, failed in FlexProp because I used optimization.

The PASM2 docs mention the timeout option for the waitatn in PASM2. I'm guessing maybe SPIN2 includes that in the interpreter, else my SPIN2 code would have hung on WAITATN() when the atn instruction was missing from the DBUG assy code.

Right now I'm writing some SPIN2 code to finalize the display of targets in a radar-like display screen. Actually more like a fish finder.

If you need the compiled code I can send it a bit later.

Stay well Dave

totalspectrum commented 1 year ago

I can't do a compile because I don't have the complete source code. Would you be able to upload it? Although without your hardware I may not be able to run it.

The PASM WAITATN instruction has a timeout option, but the Spin2 WAITATN() function does not (and the Spin2 interpreter source code actually shows that there's no timeout there either). Strangely, the interpreter does a loop with POLLATN rather than using the WAITATN instruction. I've changed flexspin to use this method just in case there's some subtle difference.
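The difference between a wait with no timeout and one with a timeout can be sketched as a polling loop. This is a Python model under stated assumptions, not the interpreter's or flexspin's actual code: `wait_atn` and `max_polls` are hypothetical, and counting polls is only a stand-in for a real timeout.

```python
def wait_atn(poll, max_polls=None):
    """Call poll() until it returns True.

    With max_polls=None this models a wait with no timeout: it returns
    only once the signal is seen. With a limit it gives up and returns False.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if poll():
            return True
        polls += 1
    return False


# A sender that signals on the third poll: the unbounded wait returns.
events = iter([False, False, True])
print(wait_atn(lambda: next(events)))  # True

# A sender that never signals: only the bounded variant returns at all.
never = lambda: False
print(wait_atn(never, max_polls=1000))  # False
```

Calling `wait_atn(never)` with no limit would spin forever, which matches the stall seen when the signalling instruction is missing from the other side.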

GitDave99 commented 1 year ago

Hi Eric,

You DO have the entire source code, there are only 3 files, top and two cog loads of assy code. Yeah, the top object does have a lot of commented out stuff that would be confusing but there are indeed only 3 files at present.

The code does not rely on any hardware other than a P2 to run. The ADCs will just read around 128 with open pins. The DBUG output should run without any connection to a serial device, no handshaking used. Just the TX pin and checking for a buffer flush. The main loop will just chew on the static data from the unconnected ADC pins.

It's nice to have access to pin 5 with scope or logic analyzer to see the procession of the code through the various parts for timing and stalls, if any.

Which compiled file do you need? LST or other?

Wuerfel21 commented 1 year ago

Attaching files to issues through the email gateway does not work; use the actual web interface to upload.

GitDave99 commented 1 year ago

TOP OBJECT RADAR EDGE Flex .zip

That should do it. Let me know if you need more.

totalspectrum commented 1 year ago

Thank you, @GitDave99 . I'm able to build everything now, and reproduce the bug. It's a real puzzler. I've put debug code into the generated assembly, and the code is reaching the branch at the end of the loop. That branch seems to still be intact (the code in memory is the correct sequence of bytes). And yet the branch seems to be going to the wrong place. None of this really makes sense. I strongly suspected the other COGs, but disabling them doesn't help. Changing the code size does change the behavior (which is why, I think, turning off optimization makes the problem go away) which seems to imply memory corruption, but I used a DEBUG to dump the jump instruction just before taking it and it's still intact. None of this makes sense, but I'll keep looking.

GitDave99 commented 1 year ago

Great! That's progress, after all.

Best to test, like I did, with no atn/wait and no other cogs loaded. I did it that way for absolute isolation to the looping issue. Best for peace of mind, too, LOLOLOL!

GitDave99 commented 1 year ago

The non-looping might be sensitive to array size.

The structure as a series of loops in a loop might be straining some inbuilt constraints or assumptions in the compiler... maaaaybe!

totalspectrum commented 1 year ago

I think I found the bug -- the loop size calculation code (for determining whether loops could be moved into COG memory) wasn't taking into account instructions that need AUGS or AUGD prefixes. This meant that large loops in a program with large arrays could sometimes overflow the COG memory needed. This is fixed in 6.2.1.
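The kind of miscount described here can be sketched in Python. This is a simplified model, not flexspin's actual code: `Instr`, `naive_size`, `actual_size`, and the `CACHE_LONGS` limit are hypothetical. The one hardware fact it relies on is that a P2 immediate field holds 9 bits, so a larger immediate needs an extra AUGS (source) or AUGD (destination) prefix instruction, each occupying one more long.

```python
from dataclasses import dataclass
from typing import Optional

IMM_BITS = 9  # immediate operand fields hold 9 bits (0..511)


@dataclass
class Instr:
    name: str
    src_imm: Optional[int] = None  # immediate source operand, if any
    dst_imm: Optional[int] = None  # immediate destination operand, if any


def needs_prefix(value):
    """True if an immediate is too large for the 9-bit field."""
    return value is not None and not (0 <= value < (1 << IMM_BITS))


def naive_size(loop):
    """Buggy estimate: one long per instruction, prefixes ignored."""
    return len(loop)


def actual_size(loop):
    """One long per instruction plus one per AUGS/AUGD prefix required."""
    return sum(1 + needs_prefix(i.src_imm) + needs_prefix(i.dst_imm)
               for i in loop)


CACHE_LONGS = 6  # made-up limit for loops copied into COG memory

loop = [
    Instr("mov", src_imm=0x12345),                  # large address: AUGS
    Instr("add", src_imm=4),                        # fits in 9 bits
    Instr("wrlong", dst_imm=1000, src_imm=0x20000), # AUGD and AUGS
    Instr("djnz"),
]

print(naive_size(loop), naive_size(loop) <= CACHE_LONGS)    # 4 True
print(actual_size(loop), actual_size(loop) <= CACHE_LONGS)  # 7 False
```

In the model, the naive count says the loop fits while the real footprint does not, and the gap grows with the number of large addresses, which is consistent with the problem showing up in programs with large arrays.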

Thanks for the bug report!

GitDave99 commented 1 year ago

Marvelous! Glad I could help. So, when may I download the improved version?

Oh, what about the optimization removing or eliminating the arrays? Just the highest optimizations.

totalspectrum commented 1 year ago

The new version is up on github and on my Patreon page.

The highest optimization level doesn't really remove the arrays, it just removes the pre-initialized data in the arrays (replacing them with an initialization loop) so that the binary gets smaller.
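A rough way to see why the reported size drops is the following Python sketch. It is only an arithmetic model: `image_bytes` is a hypothetical helper, and the figure of 3 longs for the initialization loop is a made-up placeholder, not a number taken from flexspin.

```python
LONG_BYTES = 4


def image_bytes(code_longs, array_longs, preinitialized, init_loop_longs=3):
    """Approximate size of the downloadable image.

    preinitialized=True : the array contents are stored in the binary.
    preinitialized=False: the contents are produced at startup by a small
                          loop, so only the loop is stored in the binary.
    """
    if preinitialized:
        return (code_longs + array_longs) * LONG_BYTES
    return (code_longs + init_loop_longs) * LONG_BYTES


code_longs, array_longs = 500, 4096
print(image_bytes(code_longs, array_longs, preinitialized=True))   # 18384
print(image_bytes(code_longs, array_longs, preinitialized=False))  # 2012
```

In both cases the array still occupies memory while the program runs; only the stored image shrinks, which is why the array size seems to vanish from the reported binary size.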

GitDave99 commented 1 year ago

Thanks, I'll download soon.

Would the optimizer output be easier to interpret if it were consistent, breaking out the code space and array/var space for each option? Seeing the size of the arrays disappear was surprising.

GitDave99 commented 1 year ago

The 6.2.1 compiler fouls up the ADC operation, causing large noise from 0 to ~50 (out of 255). I compiled it this morning on 6.1.8, no noise. Deleted 6.1.8, installed 6.2.1, compiled same code, big disappointment.

Too bad, any ideas?

Best regards, Dave
