serial data corruption at 460800

bmentink commented 7 years ago

I am sending the following sequence from Forth:

$9a emit 36 emit $80 emit

in a loop, and I am seeing:

0000000 c000 0000 0000 0000 00c0 0000 0000 c000
0000010 0000 0000 0000 00c0 0000 0000 c000 0000
0000020 0000 0000 00c0 0000 0000 c000 0000 0000
0000030 0000 00c0 0000 0000 c000 0000 0000 0000
0000040 00c0 0000 0000 c000 0000 0000 0000 00c0

directly out of port /dev/ttyUSB1, either by cat'ing to a file and dumping it, or some terminal program .. However, sometimes it is ok ... and I get the right data.

Is there some problem with the timing at 460800 baud? It only seems a problem after sending an 8-bit value ..
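For anyone reading the dump above: it looks like default od/hexdump output, which prints 16-bit words in host byte order, so on a little-endian PC the word c000 would be the byte pair 00 c0. That is an assumption, since the thread does not say which tool made the dump. A minimal Python sketch, under that assumption, turns the first two dump lines back into a byte stream and compares it with the bytes the Forth loop sends (the helper name od_words_to_bytes is made up for illustration):

```python
# Sketch only: assumes the dump is od/hexdump default output
# (16-bit words, little-endian host). Not confirmed in the thread.
DUMP = """\
0000000 c000 0000 0000 0000 00c0 0000 0000 c000
0000010 0000 0000 0000 00c0 0000 0000 c000 0000
"""


def od_words_to_bytes(dump: str) -> bytes:
    """Convert od-style lines (offset, then 16-bit hex words) to raw bytes."""
    out = bytearray()
    for line in dump.splitlines():
        fields = line.split()
        for word in fields[1:]:  # fields[0] is the file offset
            out += int(word, 16).to_bytes(2, "little")
    return bytes(out)


received = od_words_to_bytes(DUMP)
sent = bytes([0x9A, 36, 0x80])  # $9a emit 36 emit $80 emit

print("received:", received.hex(" "))
print("distinct byte values:", sorted(hex(b) for b in set(received)))
print("offsets of 0xc0:", [i for i, b in enumerate(received) if b == 0xC0])
print("any sent byte seen:", any(b in received for b in sent))
```

Under that byte-order assumption the stream contains only 0x00 and 0xc0, with 0xc0 recurring every 7 bytes, and none of the three sent values shows up at all.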

bmentink commented 7 years ago

My bad, the sequence should have been sent as $9a 36 80

RGD2 commented 7 years ago

I'm seeing something similar - sometimes a line like 400 . Won't return 400 ok.

bmentink commented 7 years ago

Interesting, the 1st sequence pushed the incorrect data, the 2nd corrected sequence with decimal 80 instead of hex, produced the correct data .... is there a parity/8-bit issue?

EDIT: Nope it's just random .... happened again with both lots of data ..

bmentink commented 7 years ago

More updates on this issue:

I have noticed that when I program the board including some forth which loads and runs on boot, it outputs the correct data while power is on, but then if I unplug it and plug back in a few times, that eventually it will start outputting the wrong data. Once in that state, it never sends correct data any more.

If I re-flash the board, that does not help. However, if I re-build the complete image including the Forth, I can then get it going again .... and the cycle repeats ....

This is a big worry for me, as I am totally relying on correct serial data for my project .... this is a killer ... EDIT: I will code some serial out directly in verilog to see if the issue is on the Forth side or not.

bmentink commented 7 years ago

I was looking through how the uart.v module is used and noticed it is never reset, is this the issue?

instead of

buart _uart0 (
     .clk(clk),
     .resetq(1'b1),
     .rx(uart_RXD),
     .tx(TXD),
     .rd(uart0_rd),
     .wr(uart0_wr),
     .valid(uart0_valid),
     .busy(uart0_busy),
     .tx_data(dout_[7:0]),
     .rx_data(uart0_data));

Should it not be:

buart _uart0 (
     .clk(clk),
     .resetq(resetq),
     .rx(uart_RXD),
     .tx(TXD),
     .rd(uart0_rd),
     .wr(uart0_wr),
     .valid(uart0_valid),
     .busy(uart0_busy),
     .tx_data(dout_[7:0]),
     .rx_data(uart0_data));

RGD2 commented 7 years ago

It shouldn't matter... Read up on resets in SRAM-based FPGAs: there's a good argument that instead of having designed-in resets, one ought to just do a full FPGA reconfigure.

(Which can be done in the j1a/j4a with '4 $800 io!', which triggers the warmboot module to go do that - good if you want to fully reset from cold like state, including going back to the initial ram state, compared to CTRL-C, which does a "warm" j1a reboot and leaves the ram alone. )

I'm currently looking at this issue because '5000 dup .x .' is sometimes giving me many numbers other than 5000... And I suspect it's in the rs232 link, since sometimes uploading code with #include also fails for no apparent reason.

For building deployed apps, I've taken to using verilator (make sim_connect) to '#include' everything before the '#flash ../build/nuc.hex', which sidesteps the issue for that.

It never seems to refuse textual input, so I wonder if it isn't a timing issue with the serial line... My current problem might also be unrelated.

Do what you can to collect data, particularly the specifically broken stuff.

bmentink commented 7 years ago

Well it "will" matter because it means that there is a chunk of code in uart.v that will never execute (code responsible for setting variables in a known state).

What's the point of having code to respond to a reset in the module if it doesn't get called .. .... I don't get ya ..

bmentink commented 7 years ago

Update: Adding the resetq to the top level of the uart module instance did not help in this case.

However, there is an issue with the uart.v code. I dumped it out and replaced it with a hacked version of an opencore verilog uart module (I had to hack the interface to suit ..) ..

That is working great, I have had my serial test code running for 30 mins and unplugged/plugged in at least 20 times at the moment without failure (with the old uart.v, it would fail after cycling the power only 5 times). I will keep testing ...

The opencore module has better timing, as it over-samples by 16 .. which at 460,800 is probably important ..

EDIT: Still working 6 hrs later, looks like the issue is fixed. If anyone wants I can post the code.

RGD2 commented 7 years ago

Reasons to reset are things like glitches or soft errors caused by ionising radiation. But if your design needs a reset, the whole FPGA probably also needs a reset, because it might well be mis-configured.

It's possible to define initial values in verilog besides using a reset, and doing so results in registers being properly initialised at FPGA configure time. IMHO ice40 arch doesn't support registers starting with other than a '0', so yosys does it by inserting not gates either side of the register which is to initialise with a '1'.

uart.v, it turns out, does use initialisation values as well as supporting the reset input, so either way it will work.

All of the above is a bit beside the point here.

If the system clock isn't 12 MHz and the baud rate isn't 115200, then both CLKFREQ and BAUD constants at the top of uart.v need to be changed, or else the sampling will be way off if the clock is changed in j1a.v

A useful patch would be to make these two into parameters, so they can be set when the uart is instantiated in j1a.v, and so that changes to the clock rate there also change the parameter the uart's clock divider needs.

Look at stack2.v and how the DEPTH parameter is used when it's instantiated in j1.v for how this would look.

bmentink commented 7 years ago

Remy, you are missing the point. Sure, the constants at the top of uart.v need to be changed for different baud rates, I'm not entirely without a brain cell. (12 MHz & 115200 results in exactly the same prescale value as 48 MHz & 460800 (x4)...duh!)
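The x4 claim is easy to check with a few lines of Python. This is an illustration only: the function name is made up, and how uart.v actually derives its divider from CLKFREQ and BAUD is not shown in this thread.

```python
from fractions import Fraction


def clocks_per_bit(clk_hz: int, baud: int) -> Fraction:
    """Exact number of system clock cycles in one serial bit period."""
    return Fraction(clk_hz, baud)


for clk_hz, baud in [(12_000_000, 115_200), (48_000_000, 460_800)]:
    ratio = clocks_per_bit(clk_hz, baud)
    bit_ns = 1e9 / baud  # length of one bit in nanoseconds
    print(f"{clk_hz // 1_000_000} MHz / {baud} baud: "
          f"{float(ratio):.4f} clocks per bit, bit period {bit_ns:.1f} ns")

same_ratio = clocks_per_bit(12_000_000, 115_200) == clocks_per_bit(48_000_000, 460_800)
print("same ratio:", same_ratio)
```

The ratio is identical in both cases (625/6, about 104.17 clocks per bit), so anything that only counts clocks behaves the same. What does change is the absolute bit period, which is four times shorter at 460800.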

The point is that there is a timing error in uart.v and/or an initializing error. The opencore uart.v works correctly, the one in swapforth does not ..... end of story.

Either the uart.v in swapforth needs to be debugged and fixed, or the "better" uart.v from opencore included in its place, as I have successfully done.

As I have stated, I have modified the opencore uart for swapforth so it is a direct replacement without having to change the top level... you seem to be not even interested in trying it to see if your errors go away ... which is surprising ....

jamesbowman commented 7 years ago

Thanks @bmentink for your efforts -- please can you supply the better UART as a patch?

Please restrict the subject matter to the project itself. Criticizing the uart is great. Criticizing people, not great. Thanks.

bmentink commented 7 years ago

@jamesbowman

I can supply it as a patch, but I won't have git "push" permissions will I ?? Or, I could just attach to this post, as it is not big ..

RGD2 commented 7 years ago

@bmentink My apologies bmentink. I would indeed like to try the opencore uart.

github is a little odd in that submitting patches is somewhat complicated, at least to set up the very first time for newcomers -- but this is at least partially git's 'fault'. You don't need git push permissions to James's repo.

In apology, I'll detail the steps here:

1. 'fork' the project to your own github account. https://github.com/bmentink/swapforth will exist (welcome to the club!)
2. Add your github account's fork as an additional remote to the repo you are working with on your own PC:
   - git remote add gh https://github.com/bmentink/swapforth from within your swapforth directory should do it. (replace 'gh' with any name you like - it's just how the 'remote' is named in your local repo.)
     - (git remote -v will list the remotes you have - you'll usually have origin as the place you git clone'd from at first, if that repo wasn't started using git init).
   - you can pull from it with git pull gh master and so on, as opposed to git pull origin master, which is what git pull is shorthand for.
   - you can even rename James's version as upstream and make yours origin so later you can use the shorter command.
3. Do the usual things (on your PC):
   - Be up to date with upstream master
   - Make yourself a new branch, and commit your changes to it (git checkout -b openuart, then git add and commit until you're happy with it).
4. Push the new branch back up to your github account fork with git push gh openuart (it should ask for your github account credentials - there's a whole 'nother thing about setting up key files so it doesn't need to ask.)
5. Sign into your github account, submit the request from there to pull the openuart branch back into James's master.
   - He can then review it, etc etc.
   - Others can pull your new branch straight to their repos to try it out as well

I hope that helps.

Yes, it's a bit long winded. Later you repeat just steps 3 to 5 for each new suggested patch.

I'd be grateful just to see the new openuart.v attached here in a zipfile.

bmentink commented 7 years ago

Hi Remy & James,

I also apologize if I caused any offence ... was a bit frustrated :)

Attached is the zip file containing a single file uart2.v ... Please try it out, and let me know how it works for you ..

Sorry, haven't got the time at present to do all the git stuff ... will get onto it later.

Cheers, Bernie

RGD2 commented 7 years ago

Ok Bernie - I can't see the attachment though, neither through my email copy of this thread, nor at https://github.com/jamesbowman/swapforth/issues/45

bmentink commented 7 years ago

Oh, I attached it in an email reply .... nevermind, you can download it from here:

https://www.edaplayground.com/x/4aed

bmentink commented 7 years ago

How did you get on? Fix your issue @RGD2 ?

RGD2 commented 7 years ago

Hmm... Very odd. It works fine with the j1a @ 48MHz on the icestick

But with the j4a @ 48MHz on the ice40hx8k breakout board it drops characters extremely often, i.e. within 5 characters. I'm lucky if the output of words doesn't drop out, although the connection reestablishes after CTRL-C (which does a soft-reset). This is tested just on a spare breakout board, not on the 'production' embedded one.

I've added it as a branch hanging off the end of my current work - https://github.com/RGD2/swapforth/tree/uart2test

RGD2 commented 7 years ago

( At some point, I will go through and rebase/squash/separate/clean a lot of the commits on my j4a-pmod branch... it's gotten to be a bit of a fork, and not everything ought to go back into master. )

I'll make a branch rebased off master to pull in with uart2.v later, if someone else doesn't beat me to it. The changes for uart2.v are very trivial - just one character in icestorm/Makefile, apart from the change to the PLL settings.

bmentink commented 7 years ago

I have been testing with j1a8k on a ice40hx8k breakout board ... all good, no character drops.

With the j4a, doesn't each task run at an effective 12 MHz? Won't that be an issue?

RGD2 commented 7 years ago

Yes, and no. The core actually runs to the same clock internally, there's no division, and it doesn't actually have four j1's in it. Only four stacks.

My issue could be due to the IO subsystem timing though - it could be the io bit-masking subsystem I added biting me. Hmm.

If that's it then I can avoid it by pipelining IO reads - possible with the j4a, but not the J1.

Thanks - I'll chase that up when I have a moment.

bmentink commented 7 years ago

Noticed an interesting/annoying aspect of serial with the hx8k breakout board.

I have my Drum trigger project running well now and it sends midi commands over serial at the 460k rate just fine.

However, if I turn my laptop off and restart it, as opposed to just suspending it, I lose communication with the board. I either get no response from the board or rubbish characters ..

The only way I can restore operation, is to do a complete build again including the Forth code. If I just try to Flash the bin file, that does not restore the board ..

I am wondering if it is an issue with the Reset line? I am clearing DTR in the software that I use to receive the serial midi commands .. do I need to do anything with RTS?

Looking at the Schematic for the board, I can't even see where those lines are even used for that aux serial port ..

Any ideas?

RGD2 commented 7 years ago

No, because you don't need either DTR or RTS, or any other auxiliary control line: just RX and TX. Perhaps our issues are to do with the FTDI chip??

The 'production' j4a I've been using still suffers from the issue, but it seems only to affect the board's reception of some numbers. I.e., if I need to put 5500 on the stack, I'll append dup . to see if it got there ok, and if not (as quite often happens) I'll hit the up arrow, and change the line to drop 5500 dup . and keep repeating that until 5500 comes back.

It's not transmission to the PC that seems to be the problem in my case: because once the wrong (or right) value is echoed, I never get a different answer just repeating dup . And the issue doesn't seem to bother text - otherwise it would have spat the dummy at words it should know. Although... Forth is case insensitive, so if the bit that encodes case in ascii is flipped, it wouldn't necessarily complain. But I'm still not sure how it gets some of the "wrong" values I've been seeing.
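The case-bit idea, and whether single flipped bits could explain the wrong numbers, can be checked with a short Python sketch. It is illustrative only: the helper names are made up, and the 1000 -> 3300 pair is taken from the transcript further down this thread.

```python
DIGITS = "0123456789"


def flip_bit(ch: str, bit: int) -> str:
    """Return the character obtained by inverting one bit of ch's ASCII code."""
    return chr(ord(ch) ^ (1 << bit))


def bits_changed(a: str, b: str) -> int:
    """Number of bit positions in which the two character codes differ."""
    return bin(ord(a) ^ ord(b)).count("1")


# Bit 5 (0x20) is the ASCII case bit, so flipping it turns 'd' into 'D'.
print(flip_bit("d", 5), flip_bit("D", 5))

# Which other digits can each digit turn into with a single flipped bit?
for digit in DIGITS:
    reachable = [c for c in (flip_bit(digit, b) for b in range(8)) if c in DIGITS]
    print(digit, "->", reachable)

# Per-character bit distance for a typed number and what came back.
typed, seen = "1000", "3300"
distances = [bits_changed(t, s) for t, s in zip(typed, seen)]
print(distances)
```

With this check, '1' -> '3' is a one-bit change but '0' -> '3' needs two bits, so one flipped bit per character would not turn 1000 into 3300.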

This workaround is good enough to get me by: all my recorded data streams through a different board which uses an FX2 USB chip. And I've collected more than 2.5 TB through that over the last three months without much trouble. The only data corruption there seems to be due to noise in the SPI lines into the other FPGA, and I was able to change the wiring to eliminate it.

But FTDI have gotten in trouble recently for doing nasty things if their driver thinks it's talking to a forged chip, maybe it's that?

RGD2 commented 7 years ago

I've just been reading back over this issue this morning (and cleaning out the copied comments the email responses left).

Between then and now, I'd been reading up on the ice40. The behaviour you get -- about it not working until after a recompile - IS INDEED consistent with some part of the ice40's configuration ram going un-set.

The ice40 documentation reveals that, at least for the BRAM, and most likely for all configuration cells, which are SRAM: If not specifically set during configuration, then the previous data stays put.

I wonder if you left it powered off for at least a couple minutes before turning it on again, when you found that unplugging it didn't clear the problem? If you cut power for a few seconds only, the un-configured sram state causing the issue could well have kept its value, if the bitfile being loaded from the eeprom on boot never specifically set it.

This is also why reconfiguration with the same bitfile didn't help - copying it from the PC to the on-board eeprom chip changed nothing.

But recompiling and reconfiguring did help: Every time you recompile, the whole design ends up placed in essentially a different random way, so even if the same cells were left alone, a different set was taking over. At least until the design stuffed itself up somehow.

If so, this does appear to be an issue not necessarily with swapforth, but possibly with the icestorm / arachne-pnr / yosys toolchain. First thing to do then would be to fully update all of those to their latest github revisions, and see if this bug still bites.

We have had similar bugs 'go away' with updates to the toolchain, it is still young. However, it's still the best game in town.

bmentink commented 7 years ago

Hi James,

No, because you don't need either DTR or RTS, or any other auxiliary control line: just RX and TX. Perhaps our issues are to do with the FTDI chip??

Ok, then why does shell.py set & clear these lines then? That is confusing.

Regarding FTDI driver, I will upgrade/downgrade it to see if that is the issue ..

Seems strange it only happens on cold boot of my Laptop .. and .. that I have to re-program the FPGA to clear the issue .. how can that be a driver issue? ..

Cheers, Bernie

bmentink commented 7 years ago

But recompiling and reconfiguring did help: Every time you recompile, the whole design ends up placed in essentially a different random way, so even if the same cells were left alone, a different set was taking over. At least until the design stuffed itself up somehow.

Now that makes sense ... good thought .. I will update the tools again ..

Thanks

RGD2 commented 7 years ago

DTR is (ab?)used to control the j1a/j4a reset signal.

In that context, it's a single-bit 'gpio' type thing that just happens to be available for use that way on that serial interface - it's not required to send data through the serial port, and isn't really being used as one would a modem.

When you do a 'CTRL-C' into shell.py it sends a reset signal, so if you accidentally lock up your swapforth machine, you can recover without wiping out the memory - it just does a 'soft' reset of the core. This is also why the tricks involving the init word work: You can leave it set up so that after a cold boot, it runs an app, but after a reset, it runs quit which is actually the CLI instead.

This lets you connect to a 'deployed' app on a j1a, interrupting it so you can still add/change the code.

I have had multiple shell.py's connected to the same j4a at once - and it even works fine, so long as each burst of IO happens at different times. (it was accidental, I left screen running an instance, and then found it later...). So, at least for the j4a, (probably because of the breakout board) the reset isn't always sent at connection time? (or else I left it set up to reset 'thread0' only, which is the other possibility).

But you can always force a soft reset with CTRL-C, and there's a way to force a 'harder reset' which involves the actual FPGA resetting and reconfiguring itself like a cold boot, as well. (using the 'warmboot') functionality. This can even be used to swap between FPGA images - one can have more than one in the bitfile.

I use this when developing with bigger programs from #include'd files, so I can return to a clean swapforth with no space used between compilations, prior to #flashing the ram contents so they can be baked into the FPGA image.

bmentink commented 7 years ago

DTR is (ab?)used to control the j1a/j4a reset signal.

In that context, it's a single-bit 'gpio' type thing that just happens to be available for use that way on that serial interface - it's not required to send data through the serial port, and isn't really being used as one would a modem.

It is required in the sense that if reset is enabled by DTR, not much serial action is going to go on is it?, because the j1a is in reset ...

Which is why I made sure DTR was cleared by my program on the Laptop that talks out the serial port, as it seemed to come up enabled by default (high).

Cheers
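A minimal sketch of clearing DTR before the port is opened, assuming the host program is written in Python with pyserial (an assumption: the thread does not say what the MIDI receiver is written in). pyserial allows creating a Serial object with no port, setting its dtr attribute, and only then calling open(). The helper name open_with_dtr_cleared is made up, and the port class is passed in so the snippet itself does not need pyserial installed. Which DTR level actually releases the reset should be checked on the board.

```python
def open_with_dtr_cleared(serial_cls, device: str, baud: int):
    """Open a serial port with DTR requested low from the start.

    serial_cls is expected to behave like pyserial's serial.Serial.
    """
    port = serial_cls()   # no device given yet, so nothing is opened
    port.port = device
    port.baudrate = baud
    port.dtr = False      # requested DTR state, applied when the port opens
    port.open()
    return port


# Typical use with pyserial installed (not run here):
#   import serial
#   midi = open_with_dtr_cleared(serial.Serial, "/dev/ttyUSB1", 460800)
```

Setting DTR before open() rather than after avoids a window in which the line sits at its default level once the port is open.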

RGD2 commented 7 years ago

Here's an actual little 'conversation' I had recently with the deployed j4a, in the middle of an experimental run.

I wanted '1000' on the stack, because I was about to use it to set a certain variable... (This is an old version, on the old toolchain, not changed since October 21, and it's still actually running right now. It's been rock solid other than its propensity to mis-hear me regarding numbers...).

>1000 dup .                                                                                                                        
 3300  ok

<sigh>... A whole bunch of 'up-arrow, enter' follows:

>. 1000 dup .                                                                                                                      
 3300 3300  ok                                                                                                                     
>. 1000 dup .                                                                                                                      
 3300 10890  ok                                                                                                                    
>. 1000 dup .                                                                                                                      
 10890 3300  ok                                                                                                                    
>. 1000 dup .                                                                                                                      
 3300 3300  ok                                                                                                                     
>. 1000 dup .                                                                                                                      
 3300 3300  ok                                                                                                                     
>. 1000 dup .                                                                                                                      
 3300 10913  ok                                                                                                                    
>. 1000 dup .                                                                                                                      
 10913 10890  ok                                                                                                                   
>. 1000 dup .                                                                                                                      
 10890 3300  ok                                                                                                                    
>. 1000 dup .                                                                                                                      
 3300 5002  ok                                                                                                                     
>. 1000 dup .                                                                                                                      
 5002 1000  ok

... finally!

RGD2 commented 7 years ago

I was just now able to reproduce the above on a different dev board -- using an application-specific image.

And my issue isn't the serial port: It does seem to be a j4a bug.

Which means I haven't been seeing your bug at all. (I don't think).

Sorry!

But on the other hand, if the updated version fixes it for you, you might close this bug out.

bmentink commented 7 years ago

I have updated both my OS (to get an updated FTDI lib) and the synth tools. So far I haven't had the issue, but I will keep testing a while before I close this issue ...

RGD2 commented 7 years ago

Hitting this one up again -- this time because I have an application which needs a secondary serial line to talk to some old lab equipment at 38,400 baud. Doing the obvious (setting uart.v to run at that rate, and instantiating it twice in the top module) resulted in good transmission, but very badly broken reception. I'm using a MAX3232PMB1 module to convert the 3.3V logic levels to true RS-232, connected via null-modem cable to an RS-232 port.

Reception at the PC end works fine, but reception at the ice40 end never works.

This seems related to the fact that the FT2232H chip on the breakout board is run from the same 12 MHz clock, but the baud generator in that chip and the one derived from the ice40 PLL (at 48 MHz, divided down) do not always end up with the same phase.

This seems to get much worse when playing with alternate PLL settings (e.g. using Verilog PLL blocks generated by icepll with different ice40-derived clock rates), and seems to explain the differences in shell.py stability between the j1a and j4a.

I'm going to switch over to bmentink's opencore rs232 uart, and see how that goes. Last time I tried, I couldn't get shell.py to work, but that may have been due to the pll differences between the j4a and the j1a.

RGD2 commented 7 years ago

Ok. I am fairly confident I understand this issue now.

James's original uart suffers from a circular dependency issue: it samples at 2x baud when idle, when it should be sampling as fast as possible, at clk rate, until it is sure it has found the middle of the start bit.

This means it never really phases itself properly to an asynchronous serial signal, and only works part of the time with the onboard (and incidentally synchronous) FT2232H on the iCEstick and hx8k breakout boards (and any other board feeding the same oscillator to both chips). It also doesn't filter the input for metastability or glitches, so sometimes it happens to be phased 'far enough' from the edges to work, and sometimes it starts up too close and gets garbled data.
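To make the phasing argument concrete, here is a rough Python model (not the Verilog from uart.v or from the fix, just a sketch). It assumes 104 clks per bit (roughly 48 MHz / 460800), an ideal noise-free line, and hypothetical helper names `uart_wave`, `first_low_seen` and `receive`. It shows that polling the idle line every half bit can notice the start edge up to half a bit late, while hunting at clk rate and then re-checking half a bit later lands in the middle of the start bit whatever the phase.

```python
def uart_wave(byte, clks_per_bit, idle_before):
    """RX line, one sample per clk: idle high, start bit, 8 data bits (LSB first), stop bit."""
    bits = [0] + [(byte >> i) & 1 for i in range(8)] + [1]
    wave = [1] * idle_before
    for b in bits:
        wave += [b] * clks_per_bit
    return wave + [1] * clks_per_bit  # trailing idle time


def first_low_seen(wave, poll_period, poll_phase):
    """Index at which a receiver that polls every poll_period clks first sees the line low."""
    for i in range(poll_phase, len(wave), poll_period):
        if wave[i] == 0:
            return i
    return None


def receive(wave, clks_per_bit):
    """Hunt for the start edge at clk rate, confirm it half a bit later, then sample bit centres."""
    i = 0
    while i < len(wave) and wave[i] == 1:   # idle: look at the line on every clk
        i += 1
    i += clks_per_bit // 2                  # move to the middle of the start bit
    if i >= len(wave) or wave[i] != 0:      # line went high again: treat it as a glitch
        return None
    value = 0
    for bit in range(8):
        i += clks_per_bit                   # step to the centre of the next data bit
        if i >= len(wave):
            return None
        value |= wave[i] << bit
    return value


if __name__ == "__main__":
    cpb = 104                               # assumed clks per bit
    edge = 200                              # clk index where the start bit begins
    wave = uart_wave(0x9A, cpb, edge)
    late = [first_low_seen(wave, cpb // 2, p) - edge for p in range(cpb // 2)]
    print("2x-baud idle polling: edge seen", min(late), "to", max(late), "clks late")
    print("clk-rate hunting decodes:", hex(receive(wave, cpb)))
```

With a shared oscillator the polling phase is fixed at start-up, which fits the observation that a board either works or stays broken until it is rebuilt or power-cycled.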

This also seems to explain why j1a/j4a both seem to never work when clocked by the ice40 PLL running at anything other than 12 or 48 MHz, and the latter only with PLL settings which icepll says are actually invalid. What's probably happening is that the buart breaks communications in those cases, because the PLL output doesn't happen to start up with the coincidental yet suitable phase relationship to the FT2232's baud generator.

Finally, I'd assert that it's a bug to retain only the first byte received (until collected by the io subsystem), rather than replacing the uncollected data with the freshest byte. If something isn't collecting the data, then when it does start taking it, it should be expecting not the first glitch long ago from start up, but the most recent, validly received byte.
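The difference between the two policies can be sketched with a tiny Python model of a one-byte holding register. The class name `RxHolding` and its methods are made up for illustration and do not correspond to signals in uart.v.

```python
class RxHolding:
    """One-byte receive holding register with a selectable overrun policy (illustrative only)."""

    def __init__(self, keep_latest):
        self.keep_latest = keep_latest  # True: newest byte wins; False: first byte is retained
        self.valid = False
        self.data = None

    def on_byte(self, byte):
        """Called by the receiver each time a complete byte has been shifted in."""
        if self.keep_latest or not self.valid:
            self.data = byte
            self.valid = True

    def read(self):
        """Called by the io subsystem; returns the held byte (or None) and clears valid."""
        byte = self.data if self.valid else None
        self.valid = False
        return byte


if __name__ == "__main__":
    for keep_latest in (False, True):
        reg = RxHolding(keep_latest)
        reg.on_byte(0x00)   # stand-in for a start-up glitch decoded as a byte
        reg.on_byte(0x41)   # real data arriving before anything reads the register
        print("keep_latest =", keep_latest, "-> first read returns", reg.read())
```

With `keep_latest=False` the first read hands back the stale start-up byte; with `keep_latest=True` it hands back the most recently received one.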

I did try for a while to get bmentink's opencores part to work, but was more or less stymied by the fairly horrible style its state machines are written in (blocking assignments in synchronous code, rather than non-blocking). It was behaving differently depending on how it was connected, so I eventually grew frustrated with it and wrote my own.

It may also help with #39 and possibly fix #15 as well.

It's been tested on both an ice40hx8k breakout board with the j4a as well as the j1a on the iCEstick. It appears to be reliable at 38400, 460800 and even 921600, although it could probably do with more testing.

I'm going to rebase it and prepare a pull request soon....

bmentink commented 7 years ago

Sounds great, good work, if you want some more testing please let me know what branch you have your current code checked into, and I will try and thrash it ..

RGD2 commented 7 years ago

@bmentink The branch is called fix#45, on my swapforth repo. It's pull request #48.

'''
git remote add rgd2 https://github.com/RGD2/swapforth
git fetch rgd2 fix#45
git checkout fix#45
'''

... Ought to get it out for you.

bmentink commented 7 years ago

Hi,

That works a treat, tried both 460800 and 921600 on the 8k board. No issues so far at 48 MHz, will try higher clocks ...

Cheers, B.

jamesbowman commented 6 years ago

@RGD2 would be good to give this new UART a try - but I can no longer see a pull request? Did I miss it?

igor-m commented 6 years ago

The original uart.v with mecrisp-ice (UP5k, external USB serial dongle) does not work with any PLL-based clk at any of the usual baud rates, nor at any baud rate with a 30 MHz external oscillator. RX corrupts the incoming data. It worked only with a 20 MHz external oscillator, and only at 115k2/230k4.

With uart2.v it works at 30MHz ext osc and PLL and 115k2. Not tested with original j1a.

Frankly, the assign below, used in the original uart.v, may be what causes the issues:

assign ser_clk = (counter == limit);

I would replace that with something which is registered, or change the counter design. I can see the RX behavior change with baud rate, even down at 9k6, where the counter is wider.

It most probably infers a ripple counter, where the lsb->msb propagation delay depends on the counter width. The ser_clk pulse will be delayed by the counter's propagation delay, so that it arrives at the wrong moment. It may also happen that the resulting pulse becomes shorter than allowed for the always @* logic where ser_clk is used.

It seems to work better at higher baud rates, where the counter is shorter (thus the counter's propagation delay is smaller and does not affect the position of ser_clk's logic-1 pulse much).
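For a feel of how the counter width moves with clock and baud rate, here is a small Python sketch of the arithmetic. It assumes a plain integer divide-by-N counter (an assumption, not a description of how uart.v actually generates ser_clk), and the function name `divider` is made up. It only computes the divisor, the counter width and the resulting baud error; it does not model propagation delay.

```python
def divider(clk_hz, baud):
    """Nearest integer divisor, counter width in bits, and relative baud-rate error."""
    div = max(1, round(clk_hz / baud))
    actual_baud = clk_hz / div
    return div, div.bit_length(), (actual_baud - baud) / baud


if __name__ == "__main__":
    # Clock/baud combinations mentioned in this thread.
    cases = [
        (48_000_000, 460800),
        (30_000_000, 115200),
        (30_000_000, 9600),
        (20_000_000, 115200),
        (20_000_000, 230400),
    ]
    for clk_hz, baud in cases:
        div, width, err = divider(clk_hz, baud)
        print(f"{clk_hz / 1e6:5.1f} MHz {baud:7d} baud: div={div:5d} width={width:2d} bits error={err:+.2%}")
```

Under that assumption the counter is a few bits wider at 9k6 than at 115k2 for the same clock.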
