Closed MCUdude closed 4 years ago
I released a few bit-bang uart versions. Here's the last one: https://github.com/nerdralph/nerdralph/tree/master/avr/libs/bbuart In this version the baud rate is determined at compile time, which I think is fine for the vast majority of use cases. You could probably do something with a macro for Serial.begin() to set the baud rate instead of defining the baud rate before including BBUart.h
At 9.6MHz, it can do 12,400-192,000bps at <3% timing error. 8N1 can handle a timing error of up to 1/2 bit-time over 10 bits (start + 8 data + stop), or 5%. That would permit up to 320kbps at 9.6MHz. I think 57.6 and 115.2kbps are the main use cases, so I made sure it worked well for those speeds. To go down to 9600bps, you'd need to reduce the clock speed to 4.8MHz, which I think shouldn't be an issue for most people.
Here is a PHP script to generate the data. I think this is right, or at least convincing. It's based on the makefile I use for building Optiboot, which has what should be, if I ported it correctly, the same calculation.
I've attached the output in case you can't use PHP.
Note that I would recommend reducing your maximum recommended error to 2%; in my experience, 3% is a bit iffy, particularly with the CH340.
#!/usr/bin/php
<?php
// Collect CSV rows alongside the human-readable report.
$CSVOutput = array("AVR_FREQ,BAUD_RATE,BAUD_ACTUAL,ERROR %\n");
foreach(array(20,16,12,9.6,8,4.8,1.2,1,0.6,0.128) as $AVR_FREQ)
{
    $AVR_FREQ *= 1000000; // MHz to Hz
    foreach(array(460800, 150000, 115200, 57600, 38400, 19200, 9600, 2400, 1200, 300) as $BAUD_RATE)
    {
        // Divisor as ported from the makefile; note it is not truncated to an integer here.
        $Divisor = 8 * ((($AVR_FREQ + $BAUD_RATE * 4) / ($BAUD_RATE * 8)) - 1);
        if($Divisor == 0) continue; // avoid division by zero
        $BAUD_ACTUAL = $AVR_FREQ / $Divisor;
        $Error = (100 * ($BAUD_RATE - $BAUD_ACTUAL)) / $BAUD_RATE;
        if(abs($Error) <= 3)
        {
            echo "Frequency : $AVR_FREQ\n";
            echo "Desired : $BAUD_RATE\n";
            echo "Achieved (Theoretically): $BAUD_ACTUAL\n";
            echo "Error % : ";
            echo abs(round($Error,2)) . "\n";
            echo "\n\n";
            $CSVOutput[] = "{$AVR_FREQ},{$BAUD_RATE},{$BAUD_ACTUAL},{$Error}\n";
        }
    }
}
// implode() takes the separator first; the legacy (array, separator) order was removed in PHP 8.
echo implode("", $CSVOutput);
?>
@sleemanj Have you tried my soft uart? Optiboot uses what looks to be a version of the AVR305 soft uart, which has 1 cycle of jitter between 1 and 0 bits, while mine has no jitter. To calculate the true error margin you'd need to include the jitter. Mine has a bit-time resolution of 3 cycles, while Optiboot's is 6 cycles. My 5% error calculation is based on a 30-cycle minimum bit time at +-1.5 cycles. Optiboot is +-3.5 cycles (6-cycle delay resolution + 1 cycle of jitter), so the 5% error threshold would be at 70 cycles per bit, or 114.3kbps at 8MHz. If the average jitter isn't factored into the UART_B_VALUE calculation, Optiboot may have +-4 cycle accuracy, meaning the 5% threshold is at 80 cycles per bit, or 100kbps at 8MHz.
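The thresholds quoted above follow from one ratio: if the bit time can be off by a fixed number of cycles, the shortest usable bit is that uncertainty divided by the tolerated fraction. A minimal C sketch of the arithmetic, assuming the 5% budget and the cycle uncertainties from the comment above; the function names are hypothetical and not part of BBUart or Optiboot:

```c
#include <assert.h>

/* Shortest bit time (in CPU cycles) for which a timing uncertainty of
   `uncertainty_cycles` stays within `tol_percent` of one bit. */
static double min_cycles_per_bit(double uncertainty_cycles, double tol_percent)
{
    assert(tol_percent > 0.0);
    return uncertainty_cycles * 100.0 / tol_percent;
}

/* Highest baud rate that still meets the tolerance at a given clock. */
static double max_baud(double f_cpu_hz, double uncertainty_cycles, double tol_percent)
{
    return f_cpu_hz / min_cycles_per_bit(uncertainty_cycles, tol_percent);
}
```

Calling `max_baud(9.6e6, 1.5, 5.0)` gives 320000 and `max_baud(8e6, 3.5, 5.0)` gives roughly 114286, which matches the 320kbps and 114.3kbps figures in the thread.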
Sorry, I should have been clear: those calculations are not specifically for any software serial. They are for hardware serial, so ideal conditions.
I do use a lightly modified version of one of your implementations in my ATTinycore...
https://github.com/sleemanj/ATTinyCore/blob/master/avr/cores/tiny/HalfDuplexSerial.cpp https://github.com/sleemanj/ATTinyCore/blob/master/avr/cores/tiny/HalfDuplexSerial.h https://github.com/sleemanj/ATTinyCore/blob/master/avr/cores/tiny/HalfDuplexSerial.S
OK, James, that's the jitter-free version you have in ATTinyCore. You could make a couple of small tweaks. Setting the port to output mode could be moved to the Core startup code. Or if you want to be really fancy, you could move the line "sbi UART_Port-1, UART_Tx" to the end of the TxByte function in section .init8. That means it should run before main, and with the right linker magic with gc-sections, will only get added to .init8 when the TxByte function is used. The other little tweak is to change the cli + ret to reti. I mentioned it here rather than opening an issue on ATTinyCore since Hans will probably want to make TxByte ISR safe like you did by adding the sei/cli. I don't know if you do, but I'd put a warning in the documentation about using interrupts and low baud rates. Transmitting serial data at 19,200bps can add up to 520us of latency to interrupt handling.
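The 520us figure is the time needed to shift out one 10-bit frame (start + 8 data + stop). A small C sketch of that arithmetic, assuming interrupts stay masked for one whole frame; the helper name is hypothetical:

```c
#include <assert.h>

/* Worst-case time, in microseconds, that interrupts stay masked while one
   8N1 frame (10 bits) is bit-banged at `baud` bits per second. */
static double frame_latency_us(unsigned long baud)
{
    assert(baud > 0);
    return 10.0 * 1e6 / (double)baud;
}
```

At 19200bps this comes to about 520.8us, while at 115200bps it drops to under 87us, which is one reason higher baud rates are friendlier to interrupt-driven code.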
Thanks for the script @sleemanj! I ran it in an online PHP sandbox. I have yet to try every single option on my ATtiny13 dev board (with an on-board crystal driver). I indeed want less than 2% error, since I use the CH330N and CH340C in most projects where I need a USB to serial adapter. My favorite at the moment is the CH330N, because of the small SOIC-8 package.
Here's the output from the script when the error is allowed to be 2% or less. But I remember I read somewhere that @nerdralph's code was able to use 460800 baud when running at 16 MHz? This script calculates this error to be 13.02%. Is this correct?
EDIT: Just tested 460800 baud @ 16 MHz. It works like a charm! I believe the calculations aren't 100% correct. 13.02% error would only print garbage.
EDIT2: I used my oscilloscope to measure the bit length (sending a 0x20 character every second), and manually calculated the error. 460800 baud @ 16 MHz results in an error of -1.98%, or a bit length of 2.128us instead of 2.1701us.
|AVR_FREQ|BAUD_RATE|BAUD_ACTUAL |ERROR % |
|--------|---------|---------------|-------------------|
|20000000|57600 |58271.285205568|-1.1654257041114 |
|20000000|38400 |38697.194453402|-0.77394388906803 |
|20000000|19200 |19274.012206874|-0.38548024413747 |
|20000000|9600 |9618.4674575184|-0.19236934915036 |
|20000000|2400 |2401.1525532255|-0.048023051064509 |
|20000000|1200 |1200.2880691366|-0.02400576138272 |
|20000000|300 |300.01800108006|-0.0060003600215926|
|16000000|57600 |58441.558441558|-1.461038961039 |
|16000000|38400 |38772.213247173|-0.96930533117931 |
|16000000|19200 |19292.604501608|-0.48231511254018 |
|16000000|9600 |9623.0954290297|-0.24057738572573 |
|16000000|2400 |2401.4408645187|-0.060036021612954 |
|16000000|1200 |1200.3601080324|-0.030009002700808 |
|16000000|300 |300.02250168763|-0.0075005625421909|
|12000000|57600 |58727.569331158|-1.9575856443719 |
|12000000|38400 |38897.893030794|-1.2965964343598 |
|12000000|19200 |19323.671497585|-0.64412238325283 |
|12000000|9600 |9630.8186195827|-0.32102728731941 |
|12000000|2400 |2401.9215372298|-0.080064051241 |
|12000000|1200 |1200.4801920768|-0.040016006402558 |
|12000000|300 |300.0300030003 |-0.010001000100014 |
|9600000 |38400 |39024.390243902|-1.6260162601626 |
|9600000 |19200 |19354.838709677|-0.80645161290323 |
|9600000 |9600 |9638.5542168675|-0.4016064257028 |
|9600000 |2400 |2402.4024024024|-0.1001001001001 |
|9600000 |1200 |1200.6003001501|-0.050025012506258 |
|9600000 |300 |300.03750468809|-0.012501562695339 |
|8000000 |38400 |39151.712887439|-1.9575856443719 |
|8000000 |19200 |19386.106623586|-0.96930533117931 |
|8000000 |9600 |9646.3022508039|-0.48231511254018 |
|8000000 |2400 |2402.8834601522|-0.12014417300761 |
|8000000 |1200 |1200.7204322594|-0.060036021612954 |
|8000000 |300 |300.04500675101|-0.015002250337545 |
|4800000 |19200 |19512.195121951|-1.6260162601626 |
|4800000 |9600 |9677.4193548387|-0.80645161290323 |
|4800000 |2400 |2404.8096192385|-0.2004008016032 |
|4800000 |1200 |1201.2012012012|-0.1001001001001 |
|4800000 |300 |300.07501875469|-0.0250062515629 |
|1200000 |2400 |2419.3548387097|-0.80645161290323 |
|1200000 |1200 |1204.8192771084|-0.4016064257028 |
|1200000 |300 |300.3003003003 |-0.1001001001001 |
|1000000 |2400 |2423.2633279483|-0.96930533117931 |
|1000000 |1200 |1205.7877813505|-0.48231511254018 |
|1000000 |300 |300.36043251902|-0.12014417300761 |
|600000 |2400 |2439.0243902439|-1.6260162601626 |
|600000 |1200 |1209.6774193548|-0.80645161290323 |
|600000 |300 |300.60120240481|-0.2004008016032 |
|128000 |300 |302.83911671924|-0.94637223974763 |
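The oscilloscope check in EDIT2 can be reproduced by converting the measured bit length back to a baud rate and applying the same sign convention as the script, (desired - actual) / desired. A short C sketch under that assumption; the function name is hypothetical:

```c
#include <assert.h>

/* Baud error in percent from a measured bit length in microseconds,
   using the script's sign convention: negative means faster than requested. */
static double baud_error_percent(double desired_baud, double measured_bit_us)
{
    assert(desired_baud > 0.0 && measured_bit_us > 0.0);
    double actual_baud = 1e6 / measured_bit_us;
    return 100.0 * (desired_baud - actual_baud) / desired_baud;
}
```

For 460800 baud and a measured 2.128us bit, this gives approximately -1.98%, in line with the figure reported above.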
Sounds like the calculation isn't entirely on the money then.
avr-libc includes a similar calculation (probably where I derived this from long ago) in setbaud.h. You should have a copy on your system somewhere, but I've pasted it below. It's just some calculations done in macros; see BAUD_TOL for the tolerance.
/* Copyright (c) 2007 Cliff Lawson
Copyright (c) 2007 Carlos Lamas
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the
distribution.
* Neither the name of the copyright holders nor the names of
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE. */
/* $Id$ */
/**
\file
*/
/**
\defgroup util_setbaud <util/setbaud.h>: Helper macros for baud rate calculations
\code
#define F_CPU 11059200
#define BAUD 38400
#include <util/setbaud.h>
\endcode
This header file requires that on entry values are already defined
for F_CPU and BAUD. In addition, the macro BAUD_TOL will define
the baud rate tolerance (in percent) that is acceptable during
the calculations. The value of BAUD_TOL will default to 2 %.
This header file defines macros suitable to setup the UART baud
rate prescaler registers of an AVR. All calculations are done
using the C preprocessor. Including this header file causes no
other side effects so it is possible to include this file more than
once (supposedly, with different values for the BAUD parameter),
possibly even within the same function.
Assuming that the requested BAUD is valid for the given F_CPU then
the macro UBRR_VALUE is set to the required prescaler value. Two
additional macros are provided for the low and high bytes of the
prescaler, respectively: UBRRL_VALUE is set to the lower byte of
the UBRR_VALUE and UBRRH_VALUE is set to the upper byte. An
additional macro USE_2X will be defined. Its value is set to 1 if
the desired BAUD rate within the given tolerance could only be
achieved by setting the U2X bit in the UART configuration. It will
be defined to 0 if U2X is not needed.
Example usage:
\code
#include <avr/io.h>
#define F_CPU 4000000
static void
uart_9600(void)
{
#define BAUD 9600
#include <util/setbaud.h>
UBRRH = UBRRH_VALUE;
UBRRL = UBRRL_VALUE;
#if USE_2X
UCSRA |= (1 << U2X);
#else
UCSRA &= ~(1 << U2X);
#endif
}
static void
uart_38400(void)
{
#undef BAUD // avoid compiler warning
#define BAUD 38400
#include <util/setbaud.h>
UBRRH = UBRRH_VALUE;
UBRRL = UBRRL_VALUE;
#if USE_2X
UCSRA |= (1 << U2X);
#else
UCSRA &= ~(1 << U2X);
#endif
}
\endcode
In this example, two functions are defined to setup the UART
to run at 9600 Bd, and 38400 Bd, respectively. Using a CPU
clock of 4 MHz, 9600 Bd can be achieved with an acceptable
tolerance without setting U2X (prescaler 25), while 38400 Bd
require U2X to be set (prescaler 12).
*/
#ifndef F_CPU
# error "setbaud.h requires F_CPU to be defined"
#endif
#ifndef BAUD
# error "setbaud.h requires BAUD to be defined"
#endif
#if !(F_CPU)
# error "F_CPU must be a constant value"
#endif
#if !(BAUD)
# error "BAUD must be a constant value"
#endif
#if defined(__DOXYGEN__)
/**
\def BAUD_TOL
\ingroup util_setbaud
Input and output macro for <util/setbaud.h>
Define the acceptable baud rate tolerance in percent. If not set
on entry, it will be set to its default value of 2.
*/
#define BAUD_TOL 2
/**
\def UBRR_VALUE
\ingroup util_setbaud
Output macro from <util/setbaud.h>
Contains the calculated baud rate prescaler value for the UBRR
register.
*/
#define UBRR_VALUE
/**
\def UBRRL_VALUE
\ingroup util_setbaud
Output macro from <util/setbaud.h>
Contains the lower byte of the calculated prescaler value
(UBRR_VALUE).
*/
#define UBRRL_VALUE
/**
\def UBRRH_VALUE
\ingroup util_setbaud
Output macro from <util/setbaud.h>
Contains the upper byte of the calculated prescaler value
(UBRR_VALUE).
*/
#define UBRRH_VALUE
/**
\def USE_2X
\ingroup util_setbaud
Output macro from <util/setbaud.h>
Contains the value 1 if the desired baud rate tolerance could only
be achieved by setting the U2X bit in the UART configuration.
Contains 0 otherwise.
*/
#define USE_2X 0
#else /* !__DOXYGEN__ */
#undef USE_2X
/* Baud rate tolerance is 2 % unless previously defined */
#ifndef BAUD_TOL
# define BAUD_TOL 2
#endif
#ifdef __ASSEMBLER__
#define UBRR_VALUE (((F_CPU) + 8 * (BAUD)) / (16 * (BAUD)) -1)
#else
#define UBRR_VALUE (((F_CPU) + 8UL * (BAUD)) / (16UL * (BAUD)) -1UL)
#endif
#if 100 * (F_CPU) > \
(16 * ((UBRR_VALUE) + 1)) * (100 * (BAUD) + (BAUD) * (BAUD_TOL))
# define USE_2X 1
#elif 100 * (F_CPU) < \
(16 * ((UBRR_VALUE) + 1)) * (100 * (BAUD) - (BAUD) * (BAUD_TOL))
# define USE_2X 1
#else
# define USE_2X 0
#endif
#if USE_2X
/* U2X required, recalculate */
#undef UBRR_VALUE
#ifdef __ASSEMBLER__
#define UBRR_VALUE (((F_CPU) + 4 * (BAUD)) / (8 * (BAUD)) -1)
#else
#define UBRR_VALUE (((F_CPU) + 4UL * (BAUD)) / (8UL * (BAUD)) -1UL)
#endif
#if 100 * (F_CPU) > \
(8 * ((UBRR_VALUE) + 1)) * (100 * (BAUD) + (BAUD) * (BAUD_TOL))
# warning "Baud rate achieved is higher than allowed"
#endif
#if 100 * (F_CPU) < \
(8 * ((UBRR_VALUE) + 1)) * (100 * (BAUD) - (BAUD) * (BAUD_TOL))
# warning "Baud rate achieved is lower than allowed"
#endif
#endif /* USE_U2X */
#ifdef UBRR_VALUE
/* Check for overflow */
# if UBRR_VALUE >= (1 << 12)
# warning "UBRR value overflow"
# endif
# define UBRRL_VALUE (UBRR_VALUE & 0xff)
# define UBRRH_VALUE (UBRR_VALUE >> 8)
#endif
#endif /* __DOXYGEN__ */
/* end of util/setbaud.h */
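The same arithmetic can be written as ordinary C functions, which makes the integer truncation explicit. The sketch below mirrors the UBRR_VALUE and USE_2X logic of the header quoted above; the function names are hypothetical and not part of avr-libc. Unlike the PHP script earlier in the thread, the prescaler here is truncated to an integer, and one is added back before computing the achieved rate.

```c
#include <assert.h>
#include <stdint.h>

/* Integer prescaler for a hardware USART, mirroring the UBRR_VALUE macros:
   div is 16 in normal mode and 8 when U2X is set. */
static uint32_t ubrr_value(uint32_t f_cpu, uint32_t baud, uint32_t div)
{
    assert(baud > 0 && (div == 8 || div == 16));
    return (f_cpu + (div / 2) * baud) / (div * baud) - 1;
}

/* Baud rate the hardware would actually produce for that prescaler. */
static double achieved_baud(uint32_t f_cpu, uint32_t ubrr, uint32_t div)
{
    return (double)f_cpu / ((double)div * ((double)ubrr + 1.0));
}

/* Same tolerance test as the #if chain: 1 if normal mode misses the
   tolerance (in percent) and U2X is needed, 0 otherwise. */
static int use_2x(uint32_t f_cpu, uint32_t baud, uint32_t tol_percent)
{
    uint64_t lhs  = 100ULL * f_cpu;
    uint64_t step = 16ULL * (ubrr_value(f_cpu, baud, 16) + 1ULL);
    uint64_t hi   = step * (100ULL * baud + (uint64_t)baud * tol_percent);
    uint64_t lo   = step * (100ULL * baud - (uint64_t)baud * tol_percent);
    return lhs > hi || lhs < lo;
}
```

With F_CPU at 4 MHz this reproduces the header's documented example: prescaler 25 for 9600 Bd without U2X, and prescaler 12 for 38400 Bd with U2X. For 16 MHz and 460800 Bd with U2X it yields a prescaler of 3 and an achieved rate of 500000 Bd, which applies to a hardware USART only and says nothing about a bit-banged UART.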
After some cursor readouts on my scope, I came up with a crude error formula that's at least in the ballpark.
(( 100 * ( BAUD_RATE - F_CPU / ( 8 * (( (F_CPU + BAUD_RATE * 7.33920732659) / ((BAUD_RATE * 8))) - 1 ) )) ) / BAUD_RATE)
I calculated the long decimal number based on a few readouts. This means this formula is by no means perfect, but I think it's good enough to determine if we're OK or way out. I don't know the code well enough to provide a "correct" one.
The calculated result looks acceptable. I can confirm that 115200 baud works fine with the internal 4.8 MHz oscillator once it has been tuned using @sleemanj's OSCCAL sketch.
@nerdralph is it wise to use the highest baud rate available in order to spend as little CPU time as possible on serial? I was thinking about using these default values:
EDIT: 19200 baud is causing the lto wrapper to crash for some reason.
Clock | Default baud rate |
---|---|
20 MHz | 115200 |
16 MHz | 115200 |
12 MHz | 115200 |
9.6 MHz | 115200 |
8 MHz | 115200 |
4.8 MHz | 115200 |
1.2 MHz | |
1 MHz | ~19200~ 9600 |
600 kHz | ~19200~ 9600 |
128 kHz | Not supported (inaccurate, slow, no OSCCAL) |
Calculated error (only <2.3% shown)
F_CPU | Baud rate | % Error |
---|---|---|
20000000 | 460800 | -1,54600371 |
20000000 | 250000 | -0,83287027 |
20000000 | 230400 | -0,76707237 |
20000000 | 115200 | -0,3820708 |
20000000 | 57600 | -0,19067115 |
20000000 | 38400 | -0,12703336 |
20000000 | 19200 | -0,06347636 |
20000000 | 9600 | -0,03172811 |
20000000 | 4800 | -0,01586154 |
20000000 | 2400 | -0,00793014 |
20000000 | 1200 | -0,00396491 |
20000000 | 600 | -0,00198242 |
20000000 | 300 | -0,0009912 |
16000000 | 460800 | -1,94000276 |
16000000 | 250000 | -1,04326009 |
16000000 | 230400 | -0,96068274 |
16000000 | 115200 | -0,47804512 |
16000000 | 57600 | -0,23845261 |
16000000 | 38400 | -0,15884215 |
16000000 | 19200 | -0,07935805 |
16000000 | 9600 | -0,03966329 |
16000000 | 4800 | -0,01982771 |
16000000 | 2400 | -0,00991287 |
16000000 | 1200 | -0,00495619 |
16000000 | 600 | -0,00247803 |
16000000 | 300 | -0,001239 |
12000000 | 250000 | -1,39586763 |
12000000 | 230400 | -1,28502533 |
12000000 | 115200 | -0,6384108 |
12000000 | 57600 | -0,31818972 |
12000000 | 38400 | -0,21190173 |
12000000 | 19200 | -0,10583873 |
12000000 | 9600 | -0,05289137 |
12000000 | 4800 | -0,0264387 |
12000000 | 2400 | -0,0132176 |
12000000 | 1200 | -0,00660836 |
12000000 | 600 | -0,00330407 |
12000000 | 300 | -0,00165201 |
9600000 | 250000 | -1,75094476 |
9600000 | 230400 | -1,61145858 |
9600000 | 115200 | -0,79928918 |
9600000 | 57600 | -0,39805379 |
9600000 | 38400 | -0,26501756 |
9600000 | 19200 | -0,13233342 |
9600000 | 9600 | -0,06612296 |
9600000 | 4800 | -0,03305055 |
9600000 | 2400 | -0,01652255 |
9600000 | 1200 | -0,00826059 |
9600000 | 600 | -0,00413012 |
9600000 | 300 | -0,00206502 |
8000000 | 250000 | -2,10851751 |
8000000 | 230400 | -1,94000276 |
8000000 | 115200 | -0,96068274 |
8000000 | 57600 | -0,47804512 |
8000000 | 38400 | -0,31818972 |
8000000 | 19200 | -0,15884215 |
8000000 | 9600 | -0,07935805 |
8000000 | 4800 | -0,03966329 |
8000000 | 2400 | -0,01982771 |
8000000 | 1200 | -0,00991287 |
8000000 | 600 | -0,00495619 |
8000000 | 300 | -0,00247803 |
4800000 | 115200 | -1,61145858 |
4800000 | 57600 | -0,79928918 |
4800000 | 38400 | -0,53144353 |
4800000 | 19200 | -0,26501756 |
4800000 | 9600 | -0,13233342 |
4800000 | 4800 | -0,06612296 |
4800000 | 2400 | -0,03305055 |
4800000 | 1200 | -0,01652255 |
4800000 | 600 | -0,00826059 |
4800000 | 300 | -0,00413012 |
1200000 | 38400 | -2,16021509 |
1200000 | 19200 | -1,06856589 |
1200000 | 9600 | -0,53144353 |
1200000 | 4800 | -0,26501756 |
1200000 | 2400 | -0,13233342 |
1200000 | 1200 | -0,06612296 |
1200000 | 600 | -0,03305055 |
1200000 | 300 | -0,01652255 |
600000 | 19200 | -2,16021509 |
600000 | 9600 | -1,06856589 |
600000 | 4800 | -0,53144353 |
600000 | 2400 | -0,26501756 |
600000 | 1200 | -0,13233342 |
600000 | 600 | -0,06612296 |
600000 | 300 | -0,03305055 |
128000 | 2400 | -1,25452971 |
128000 | 1200 | -0,62335477 |
128000 | 600 | -0,31070898 |
128000 | 300 | -0,15511351 |
The calculations for the baud rate error of the hardware USART are irrelevant. As I mentioned before, my uart code is accurate to within +-1.5 cycles. The number of cycles per bit is 7 + 3*TXDELAY, where TXDELAY is calculated by the header file macros based on F_CPU and BAUD_RATE. At 115.2kbps, the ideal time for each bit is 8.681us. With a 4.8MHz clock, that's 41.67 cycles. The macros will calculate the best TXDELAY of 12, for a delay per bit of 43 cycles. In this instance the uart will be slow by 43/41.67, or 3.19%. Adding a 1% variation for a reasonably-tuned OSCCAL, the total error is less than the 5% margin required for 8N1. I'd stick with 115.2kbps for 4.8MHz and up. For 1.2MHz and lower, I'd go with a default of 38,400bps.
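The worked example above can be checked with a few lines of C. The 7 + 3*TXDELAY cycle count is taken from the comment; the rounding used to pick TXDELAY is an assumption (round to nearest), since the actual header macros are not shown here, and the function names are hypothetical.

```c
#include <assert.h>

/* Assumed: pick the TXDELAY whose bit time (7 + 3*TXDELAY cycles) is
   nearest to the ideal F_CPU / BAUD_RATE. */
static long txdelay_for(double f_cpu_hz, double baud)
{
    assert(baud > 0.0 && f_cpu_hz / baud >= 7.0);
    double ideal_cycles = f_cpu_hz / baud;
    return (long)((ideal_cycles - 7.0) / 3.0 + 0.5);
}

/* Cycles actually spent per bit for a given TXDELAY. */
static long bit_cycles(long txdelay)
{
    return 7 + 3 * txdelay;
}

/* Bit-time error in percent; positive means each bit is too long (slow). */
static double bit_time_error_percent(double f_cpu_hz, double baud)
{
    double ideal_cycles = f_cpu_hz / baud;
    double actual_cycles = (double)bit_cycles(txdelay_for(f_cpu_hz, baud));
    return 100.0 * (actual_cycles / ideal_cycles - 1.0);
}
```

For 4.8MHz and 115200bps this yields a TXDELAY of 12, 43 cycles per bit, and a bit time roughly 3.2% too long. For 16MHz and 460800bps it yields 34 cycles, or 2.125us per bit, close to the 2.128us measured on the scope earlier in the thread.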
> The calculations for the baud rate error of the hardware USART are irrelevant.

My conclusion too after spending some time testing on actual hardware.
It turned out that some of the lower clock speeds support a higher baud rate than I had initially thought. Here's an "updated" default baud rate table:
Clock | Default baud rate |
---|---|
20 MHz | 115200 |
16 MHz | 115200 |
12 MHz | 115200 |
9.6 MHz | 115200 |
8 MHz | 115200 |
4.8 MHz | 115200 |
1.2 MHz | 38400 |
1 MHz | 38400 |
600 kHz | ~19200~ 9600 |
128 kHz | Not supported (inaccurate, slow, no OSCCAL) |
For some strange reason, I'm not allowed to use 19200 baud for ANY F_CPU. I'm just getting this error:
In file included from C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\Arduino.h:119:0,
from C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.cpp:24:
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h: In function 'void dummy()':
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:64:48: warning: integer overflow in expression [-Woverflow]
#define RXSTART_CYCLES DIVIDE_ROUNDED(3*F_CPU,2*BAUD_RATE)
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:58:51: note: in definition of macro 'DIVIDE_ROUNDED'
#define DIVIDE_ROUNDED(NUMERATOR, DIVISOR) ((((2*(NUMERATOR))/(DIVISOR))+1)/2)
^~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:64:24: note: in expansion of macro 'DIVIDE_ROUNDED'
#define RXSTART_CYCLES DIVIDE_ROUNDED(3*F_CPU,2*BAUD_RATE)
^~~~~~~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:66:37: note: in expansion of macro 'RXSTART_CYCLES'
#define RXSTARTCOUNT DIVIDE_ROUNDED(RXSTART_CYCLES - 13, 3)
^~~~~~~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:108:23: note: in expansion of macro 'RXSTARTCOUNT'
::[rxscount] "M" (RXSTARTCOUNT)
^~~~~~~~~~~~
In file included from C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\Arduino.h:119:0,
from C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\main.cpp:12:
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h: In function 'void dummy()':
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:64:48: warning: integer overflow in expression [-Woverflow]
#define RXSTART_CYCLES DIVIDE_ROUNDED(3*F_CPU,2*BAUD_RATE)
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:58:51: note: in definition of macro 'DIVIDE_ROUNDED'
#define DIVIDE_ROUNDED(NUMERATOR, DIVISOR) ((((2*(NUMERATOR))/(DIVISOR))+1)/2)
^~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:64:24: note: in expansion of macro 'DIVIDE_ROUNDED'
#define RXSTART_CYCLES DIVIDE_ROUNDED(3*F_CPU,2*BAUD_RATE)
^~~~~~~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:66:37: note: in expansion of macro 'RXSTART_CYCLES'
#define RXSTARTCOUNT DIVIDE_ROUNDED(RXSTART_CYCLES - 13, 3)
^~~~~~~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:108:23: note: in expansion of macro 'RXSTARTCOUNT'
::[rxscount] "M" (RXSTARTCOUNT)
^~~~~~~~~~~~
In file included from C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\Arduino.h:119:0,
from C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\Print.cpp:30:
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h: In function 'void dummy()':
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:64:48: warning: integer overflow in expression [-Woverflow]
#define RXSTART_CYCLES DIVIDE_ROUNDED(3*F_CPU,2*BAUD_RATE)
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:58:51: note: in definition of macro 'DIVIDE_ROUNDED'
#define DIVIDE_ROUNDED(NUMERATOR, DIVISOR) ((((2*(NUMERATOR))/(DIVISOR))+1)/2)
^~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:64:24: note: in expansion of macro 'DIVIDE_ROUNDED'
#define RXSTART_CYCLES DIVIDE_ROUNDED(3*F_CPU,2*BAUD_RATE)
^~~~~~~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:66:37: note: in expansion of macro 'RXSTART_CYCLES'
#define RXSTARTCOUNT DIVIDE_ROUNDED(RXSTART_CYCLES - 13, 3)
^~~~~~~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:108:23: note: in expansion of macro 'RXSTARTCOUNT'
::[rxscount] "M" (RXSTARTCOUNT)
^~~~~~~~~~~~
In file included from C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\Arduino.h:119:0,
from C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\Tone.cpp:25:
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h: In function 'void dummy()':
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:64:48: warning: integer overflow in expression [-Woverflow]
#define RXSTART_CYCLES DIVIDE_ROUNDED(3*F_CPU,2*BAUD_RATE)
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:58:51: note: in definition of macro 'DIVIDE_ROUNDED'
#define DIVIDE_ROUNDED(NUMERATOR, DIVISOR) ((((2*(NUMERATOR))/(DIVISOR))+1)/2)
^~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:64:24: note: in expansion of macro 'DIVIDE_ROUNDED'
#define RXSTART_CYCLES DIVIDE_ROUNDED(3*F_CPU,2*BAUD_RATE)
^~~~~~~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:66:37: note: in expansion of macro 'RXSTART_CYCLES'
#define RXSTARTCOUNT DIVIDE_ROUNDED(RXSTART_CYCLES - 13, 3)
^~~~~~~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:108:23: note: in expansion of macro 'RXSTARTCOUNT'
::[rxscount] "M" (RXSTARTCOUNT)
^~~~~~~~~~~~
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h: In function 'dummy':
C:\Users\h.bull.LAUDM\Documents\Arduino\hardware\MicroCore\avr\cores\microcore\HalfDuplexSerial.h:109:6: error: impossible constraint in 'asm'
);
^
lto-wrapper.exe: fatal error: C:\Users\h.bull.LAUDM\AppData\Local\Arduino15\packages\arduino\tools\avr-gcc\7.3.0-atmel3.6.1-arduino5/bin/avr-gcc returned 1 exit status
compilation terminated.
c:/users/h.bull.laudm/appdata/local/arduino15/packages/arduino/tools/avr-gcc/7.3.0-atmel3.6.1-arduino5/bin/../lib/gcc/avr/7.3.0/../../../../avr/bin/ld.exe: error: lto-wrapper failed
collect2.exe: error: ld returned 1 exit status
exit status 1
Error compiling for board ATtiny13.
Those macros were a pain to write, and as you probably can guess, even more annoying to debug. I'll take a quick look but won't spend much time on it. In the 5+ years since I wrote them, avr-gcc with LTO has significantly changed/improved, and so I'll take a shot at rewriting the baud rate calculation code so that it doesn't use macros any more.
> Those macros were a pain to write, and as you probably can guess, even more annoying to debug. I'll take a quick look but won't spend much time on it.

I can only imagine! It's not a big deal at all. I'll simply specify that 19200 baud isn't supported. However, if you do sort it out, I'll of course add the fix to this repo.
The current code is available in the HalfDuplexSerial branch if you need something to test on. The baud rate can be overridden by modifying the core_settings.h file.
I'll later also add an OSCCAL sketch so the internal oscillator can be used. The T13 I have on my desk is pretty terrible. The default OSCCAL value was 92 (decimal), but for the 4.8 MHz oscillator I had to adjust it all the way to 105 in order to be as accurate as it can be based on the OSCCAL resolution. For the 9.6 MHz oscillator, a value of 98 resulted in the most accurate clock.
I figured out the error with 19200bps. gcc implicitly determines the type of BAUD_RATE based on the value. 9600 and 19200 are int16, while 38400 and above are int32. The RXSTART macro multiplies the baud rate by 2 (to start reading half-way through the bit). 2 * 19200 as a signed 16-bit integer results in an integer overflow, as indicated in the warning. Setting BAUD_RATE to 19200L tells GCC the type is long (32-bit), which will not overflow.
BTW, I've done a LOT of experimenting with OSCCAL and accuracy/precision. The parts I've bought from Newark (t85, t84a, t88) in the past 7 years have all been within 1% at 3.3 & 5v. The t13's I've bought off Aliexpress are the only parts that have been way out. They are still within the +-10% datasheet spec, so my guess is these are b-grade parts specifically for the Chinese market. For low-end chips like the t13, packaging and testing can cost more than the die. Skipping OSCCAL calibration on the parts would save time/money. You can actually duplicate the factory OSCCAL modification by using the undocumented signature page write programming command. I found an old thread in avrfreaks where someone figured out the HVSP command, and I was able to verify that it works with a HVSP programmer that I built. http://nerdralph.blogspot.com/2018/05/piggyfuse-hvsp-avr-fuse-programmer.html It's also supposedly possible to program it with the standard ICSP. I only know how to erase the signature page through ICSP, and didn't figure out how to program it. Maybe one day...
I've also been thinking about ways of automatically detecting the target speed at flash programming time. Last summer I got speed detection working with DebugWire. https://github.com/nerdralph/nerdralph/blob/master/autobaud.py I think it may also be possible to detect the target speed during ICSP by timing the delay between SCK transitions and MISO.
With a modified programmer, the target speed could be reported back to the host using avrdude extended parameters. The IDE could then use an option like Tools > Get Board Info to read the signature, fuses, calibration bytes, and RC oscillator speed.
> I figured out the error with 19200bps. gcc implicitly determines the type of BAUD_RATE based on the value. 9600 and 19200 are int16, while 38400 and above are int32.

Great! Adding L at the end did the trick. BTW is there a way of making the preprocessor add the L afterward, so that the user doesn't have to do it? Again, not a big deal, but still a nice touch.

> BTW, I've done a LOT of experimenting with OSCCAL and accuracy/precision. The parts I've bought from Newark (t85, t84a, t88) in the past 7 years have all been within 1% at 3.3 & 5v. The t13's I've bought off Aliexpress are the only parts that have been way out. They are still within the +-10% datasheet spec, so my guess is these are b-grade parts specifically for the Chinese market. For low-end chips like the t13, packaging and testing can cost more than the die. Skipping OSCCAL calibration on the parts would save time/money. You can actually duplicate the factory OSCCAL modification by using the undocumented signature page write programming command. I found an old thread in avrfreaks where someone figured out the HVSP command, and I was able to verify that it works with a HVSP programmer that I built. http://nerdralph.blogspot.com/2018/05/piggyfuse-hvsp-avr-fuse-programmer.html It's also supposedly possible to program it with the standard ICSP. I only know how to erase the signature page through ICSP, and didn't figure out how to program it. Maybe one day...

Interesting read! But what do we achieve by modifying the factory OSCCAL value? Will the "custom" OSCCAL values be loaded on boot without having to manually do it using EEPROM storage?
> I've also been thinking about ways of automatically detecting the target speed at flash programming time. Last summer I got speed detection working with DebugWire. https://github.com/nerdralph/nerdralph/blob/master/autobaud.py I think it may also be possible to detect the target speed during ICSP by timing the delay between SCK transitions and MISO.
>
> With a modified programmer, the target speed could be reported back to the host using avrdude extended parameters. The IDE could then use an option like Tools > Get Board Info to read the signature, fuses, calibration bytes, and RC oscillator speed.

So, in theory, it could be possible for the programmer to tweak the OSCCAL value to get the clock as accurate as possible without using the method described in AVR053?
the total error is less than the 5% margin required for 8N
Correct me if I'm wrong, but that's total 5% error, if you have one end 3% slow and the other end 3% fast you're going to have a bad day.
I figured out the error with 19200bps. gcc implicitly determines the type of BAUD_RATE based on the value. 9600 and 19200 are int16, while 38400 and above are int32.
Great! Adding L at the end did the trick. BTW, is there a way of making the preprocessor add the L afterward, so that the user doesn't have to do it? Again, not a big deal, but still a nice touch.
OK, after a bunch of fighting with the preprocessor, I came up with a simple solution. Multiplying by a long (2L) guarantees the BAUD_RATE always gets promoted to a long. I pushed the change to BBUart.h in my github repo.
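To make the promotion issue concrete, here is a host-side sketch in C. On avr-gcc an `int` is 16 bits, so an unsuffixed `19200` is an `int` and any product above 32767 wraps, while `38400` does not fit in an `int` and is already a `long`. The helper `wrap_to_int16` and the scale factors are illustrative assumptions, not the actual expression in BBUart.h; the code only simulates 16-bit arithmetic on a desktop compiler.

```c
#include <stdint.h>

/* Hypothetical helper: mimic how a 16-bit int result typically wraps on
   avr-gcc (signed overflow is formally undefined behaviour). */
int16_t wrap_to_int16(long value)
{
    return (int16_t)(uint16_t)value;
}

/* What "BAUD_RATE * factor" yields when both operands are 16-bit ints. */
long scaled_as_int16(long baud, long factor)
{
    return wrap_to_int16(baud * factor);
}

/* What "BAUD_RATE * 2L * half_factor" yields: the 2L forces the whole
   expression to be evaluated as long. */
long scaled_with_2L(long baud, long half_factor)
{
    return baud * 2L * half_factor;
}
```

With these helpers, `scaled_as_int16(19200, 4)` gives a wrapped value instead of 76800, while `scaled_with_2L(19200, 2)` gives the intended 76800. That is the same reason 9600 can appear to work (small products still fit) while 19200 fails.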
Interesting read! But what do we achieve by modifying the factory OSCCAL value? Will the "custom" OSCCAL values be loaded on boot without having to manually do it using EEPROM storage?
Yes, that's the point. Part of the power-up sequence is loading the factory-set OSCCAL value from the signature page in flash into the OSCCAL register. By rewriting the signature page with a new OSCCAL value, the new value is what gets loaded at reset.
With a modified programmer, the target speed could be reported back to the host using avrdude extended parameters. The IDE could then use an option like Tools > "Get Board Info" to read the signature, fuses, calibration bytes, and RC oscillator speed.
So, in theory, it could be possible for the programmer to tweak the OSCCAL value to get the clock as accurate as possible without using the method described in AVR053?
Yes. I've already tested the concept using Makefiles and DebugWire with a Pl2303HX for a programmer. One makefile rule detects the target type and clock rate, creating a make.defs with the target device and F_CPU. Some ideas I've thought about that could integrate with the Arduino IDE are a custom USBasp firmware, or a custom programmer that is STK500 compatible which the host communicates with using a standard USB-TTL adapter. Realistically I'll probably stick to something that works from the command line and Makefiles, since trying to support serious AVR development with the Arduino IDE feels like trying to tune up a Yugo for the racetrack.
I really like DebugWire, particularly on small parts like the t13. You get a half-duplex uart for free - no flash used on the target. I also like having PB0-PB4 completely free. When developing with a USBasp connected, I'm limited in what I can use PB0-PB2 for, else I risk interfering with ICSP.
the total error is less than the 5% margin required for 8N
Correct me if I'm wrong, but that's total 5% error, if you have one end 3% slow and the other end 3% fast you're going to have a bad day.
If your USB-TTL adapters are only 3% accurate, you've been having a LOT of bad days. Even the cheapest oscillators are spec'd to 50ppm over their full temperature range. Supposing you get some real junk parts that leave out the caps on the oscillator, you're still within 100ppm. I bought a bunch of PL2303HX adapters for <50c, and they were all within 10-20ppm at 18-23C.
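The error budget being argued about here can be sketched in a few lines of C. This is a first-order approximation under stated assumptions: the receiver resynchronises on each start bit, the clock mismatch between the two ends accumulates linearly over the 10 bits of an 8N1 frame, and the frame fails once the drift reaches half a bit. The function names are made up for illustration.

```c
/* Drift, in bit-times, accumulated by the end of one frame.
   tx_err_pct / rx_err_pct: signed clock error of each end, in percent.
   frame_bits: bits per frame, 10 for 8N1 (start + 8 data + stop). */
double drift_in_bits(double tx_err_pct, double rx_err_pct, int frame_bits)
{
    double mismatch = tx_err_pct - rx_err_pct;
    if (mismatch < 0) {
        mismatch = -mismatch;
    }
    return frame_bits * mismatch / 100.0;
}

/* Under this model a frame decodes only while drift stays below 0.5 bit. */
int frame_ok(double tx_err_pct, double rx_err_pct)
{
    return drift_in_bits(tx_err_pct, rx_err_pct, 10) < 0.5;
}
```

In this model `frame_ok(3.0, -3.0)` is false (0.6 bit of drift), which is the "bad day" case, while `frame_ok(3.0, 0.005)` is true, since a 50 ppm adapter contributes almost nothing to the mismatch.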
Since the TTL adapters are a reliable timing source, that's why you'll see them used for RC oscillator tuning. The target can time the frame from the host and use that to determine the internal oscillator frequency.
OK, after a bunch of fighting with the preprocessor, I came up with a simple solution. Multiplying by a long (2L) guarantees the BAUD_RATE always gets promoted to a long. I pushed the change to BBUart.h in my github repo.
Brilliant, that did the trick!
Since the TTL adapters are a reliable timing source, that's why you'll see them used for RC oscillator tuning. The target can time the frame from the host and use that to determine the internal oscillator frequency.
Does this mean in theory it should be possible for the T13 to output a known character and the host computer could calculate its main clock frequency and perhaps how many OSCCAL steps are needed in the positive or negative direction? If so, it should be possible to create a small shell script for easy calibration.
In theory yes, but in practice it's easier to go the other way with the host sending a known frame and the target timing and adjusting to it. The reason is a TTL adapter can't give you precise timing information about an incoming frame - you get a character that is the closest match. To get reliable timing, you have to send dozens or hundreds of pulses from the MCU, and have the host calculate the average time between the frames. Here's an example of this technique used to get very precise timing measurements when done over a period of hours: http://n1.taur.dk/nft/nft.pdf The timing program mentioned in that paper worked well enough for me to figure out that the oscillator in my 1054Z was slow by ~3ppm. http://nerdralph.blogspot.com/2015/07/rigol-ds1054z-frequency-counter-accuracy.html
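The host-side averaging mentioned above could look roughly like the following C sketch. It assumes the host has already collected arrival timestamps (in seconds) for a run of frames that the target sends at a fixed nominal spacing; how those timestamps are captured is outside the sketch, and all names are hypothetical.

```c
/* Mean spacing between n timestamps (seconds); requires n >= 2.
   Using the first and last stamp spreads per-frame jitter over the run. */
double mean_interval(const double *stamps, int n)
{
    return (stamps[n - 1] - stamps[0]) / (n - 1);
}

/* Estimated target clock in Hz, assuming the target emits one frame every
   nominal_interval seconds when running at exactly nominal_hz. */
double estimated_clock_hz(double nominal_hz, double nominal_interval,
                          double measured_interval)
{
    return nominal_hz * nominal_interval / measured_interval;
}
```

For example, frames that should be 1 s apart but arrive 1.01 s apart suggest a clock roughly 1% slow. The longer the run, the less the per-frame timestamp jitter matters.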
Give me a day or two, and I'll write a basic calibration sketch to tune OSCCAL. It'll wait for a null (CTRL-@) from the host at 38,400bps, adjust OSCCAL based on the timing difference from ideal, and print the old and new OSCCAL values. After a few nulls from the host it will get to +-1 of the optimal OSCCAL value.
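The arithmetic behind such a tuner might be sketched as follows, in plain C that runs on a host. A null byte in 8N1 holds the line low for the start bit plus eight zero data bits, so the target can count CPU cycles during that low pulse and compare against the ideal count. The function names and the tolerance parameter are assumptions for illustration, as is the rule that a larger OSCCAL value means a faster oscillator; this is not the actual sketch described above.

```c
/* CPU cycles the Rx line stays low for a null byte: start bit plus eight
   zero data bits, i.e. 9 bit-times, rounded to the nearest cycle. */
long expected_low_cycles(long f_cpu_hz, long baud)
{
    return (f_cpu_hz * 9L + baud / 2) / baud;
}

/* Suggested OSCCAL change: +1, -1 or 0.
   Assumption: a larger OSCCAL value gives a faster oscillator. */
int osccal_step(long measured_cycles, long expected_cycles, long tolerance)
{
    if (measured_cycles + tolerance < expected_cycles) {
        return +1;   /* too few cycles counted: clock is slow */
    }
    if (measured_cycles > expected_cycles + tolerance) {
        return -1;   /* too many cycles counted: clock is fast */
    }
    return 0;        /* close enough, leave OSCCAL alone */
}
```

At a nominal 9.6 MHz and 38,400bps the ideal low pulse is 2250 cycles. Repeating the measurement and stepping by one each time matches the "after a few nulls" behaviour described.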
Awesome, looking forward to testing it! But how do I send a null character from the Arduino serial monitor? It would be great if the user didn't have to install a third-party serial monitor in order to calibrate.
@sleemanj I'd like to provide a few example sketches for the users to test out the serial functionality. Printing is all fine, but reading incoming data seems to be a bit more difficult than on a regular Arduino, and need some good examples.
Since we don't have an RX buffer and the Serial.read() function is non-blocking, how can we even receive data without using Serial.read_char_blocking()?
I'd like to provide a simple echo program that just prints back whatever the user typed in the serial monitor. Can this be done without blocking receive code?
For tuning, while it is not gonna fit in a t13 here is my cut down version of "tinytuner" which I burn into optiboot images...
https://github.com/sleemanj/optiboot/blob/master/optiboot/bootloaders/optiboot/veryTinyTuner.c
you just repeatedly send "x". As I say, totally unsuited to T13, but maybe gives some ideas or copy-paste.
I'd like to provide a simple echo program that just prints back whatever the user typed in the serial monitor. Can this be done without blocking receive code?
?
Perfect, exactly what I was looking for! Is it OK with you if I borrow some of your examples and tweak them a bit?
Sure thing, grab whatever you want :)
@nerdralph I'm experiencing some issues when the ATtiny13 is reading data from the PC at high speeds. For instance, I'm able to write to the PC with 115200 baud @ 4.8 MHz, but I'm not able to read. I'm using James' ReadASCIIString sketch.
Is this normal behaviour?
For tuning, while it is not gonna fit in a t13 here is my cut down version of "tinytuner" which I burn into optiboot images...
https://github.com/sleemanj/optiboot/blob/master/optiboot/bootloaders/optiboot/veryTinyTuner.c
you just repeatedly send "x". As I say, totally unsuited to T13, but maybe gives some ideas or copy-paste.
I got a working tuner for the t13 going last night. It takes 310 bytes of flash. I was using 'p' (sticking to standard ASCII so it works in the Arduino serial monitor), but will probably switch to 'x' to make it work the same as yours. I also need to make it ignore noise from connecting the serial.
As I mentioned before, the cheap PL2303HX adapters I'm using are very tolerant of timing errors. Even starting at ~4.5% slow (measured on a scope), I get no errors sending from the t13.
@nerdralph I'm experiencing some issues when the ATtiny13 is reading data from the PC at high speeds. For instance, I'm able to write to the PC with 115200 baud @ 4.8 MHz, but I'm not able to read. I'm using James' ReadASCIIString sketch.
Is this normal behaviour?
I'm not too surprised. A couple days ago I was looking at James' mods to RxByte, and noticed they skew the timing of when the received bit is read. I emailed him some suggestions for reducing the skew.
A couple years ago I worked on converting RxByte to be interrupt driven, with a single-character buffer. In addition to making the Rx timing more consistent, it would also allow for implementing Serial.available(). What do you think about doing it that way?
I'm not too surprised. A couple days ago I was looking at James' mods to RxByte, and noticed they skew the timing of when the received bit is read. I emailed him some suggestions for reducing the skew.
So reducing the skew fixed this issue? If so I'd be interested to see if it works on my hardware.
A couple years ago I worked on converting RxByte to be interrupt-driven, with a single-character buffer. In addition to making the Rx timing more consistent, it would also allow for implementing Serial.available(). What do you think about doing it that way?
Interrupt driven RX with a single byte buffer sounds very interesting, especially if we could blend this into James' Arduino wrapper so that Serial.available() would work as expected. Would it be best to use the standard INT0 and not PCINT? After all, I've more or less specified that PB1 is the Rx pin, period. And how about RAM and flash usage? Will it be much worse you think?
I'm not too surprised. A couple days ago I was looking at James' mods to RxByte, and noticed they skew the timing of when the received bit is read. I emailed him some suggestions for reducing the skew.
So reducing the skew fixed this issue? If so I'd be interested to see if it works on my hardware.
I identified the problem from reviewing the code, not from testing.
A couple years ago I worked on converting RxByte to be interrupt-driven, with a single-character buffer. In addition to making the Rx timing more consistent, it would also allow for implementing Serial.available(). What do you think about doing it that way?
Interrupt driven RX with a single byte buffer sounds very interesting, especially if we could blend this into James' Arduino wrapper so that Serial.available() would work as expected. Would it be best to use the standard INT0 and not PCINT? After all, I've more or less specified that PB1 is the Rx pin, period. And how about RAM and flash usage? Will it be much worse you think?
I've actually been thinking about changing the default Rx/Tx pins in BBUart.S. I picked those when the t84 was my preferred AVR, where using PB0/PB1 doesn't interfere with USI and PWM which are on PORTA. On the t13 I've been using PB3 & PB4, or just when doing single-wire Rx/Tx. That leaves PB0/PB1 still free for PWM, and keeps ICSP programming from injecting garbage into your uart. Using PB0/PB1 for the uart has similar issues on the tx5 parts too.
As for RAM and flash, I'm confident I can do it with just a single byte of RAM for the buffer. For extra flash I figure about 20 extra bytes. Even less than that if I have time to fully optimize how the Serial class interfaces with the send & receive asm functions.
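A single-character buffer of this kind could be sketched as below, in host-runnable C. The names are hypothetical, and the separate "full" flag is used only for clarity, so this version costs two bytes rather than the single byte mentioned above. On the real target the read side would also need to guard against the ISR firing mid-read.

```c
#include <stdint.h>

/* One-character receive buffer shared between a (hypothetical) Rx ISR
   and the main loop. */
static volatile uint8_t rx_data;
static volatile uint8_t rx_full;

/* Would be called from the Rx interrupt once a byte has been assembled.
   A byte arriving before the previous one was read overwrites it. */
void rx_store(uint8_t byte)
{
    rx_data = byte;
    rx_full = 1;
}

/* Non-blocking check: the role Serial.available() would play. */
int rx_available(void)
{
    return rx_full;
}

/* Returns the buffered byte, or -1 if nothing is waiting. */
int rx_read(void)
{
    if (!rx_full) {
        return -1;
    }
    rx_full = 0;
    return rx_data;
}
```

The main loop can then poll `rx_available()` and only call `rx_read()` when a character is waiting, which is what a non-blocking echo sketch needs.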
I'm not too surprised. A couple days ago I was looking at James' mods to RxByte, and noticed they skew the timing of when the received bit is read. I emailed him some suggestions for reducing the skew. So reducing the skew fixed this issue? If so I'd be interested to see if it works on my hardware. I identified the problem from reviewing the code, not from testing.
Would you mind sharing your discovery so that I can test it? 🙂
I've actually been thinking about changing the default Rx/Tx pins in BBUart.S. I picked those when the t84 was my preferred AVR, where using PB0/PB1 doesn't interfere with USI and PWM which are on PORTA. On the t13 I've been using PB3 & PB4, or just when doing single-wire Rx/Tx. That leaves PB0/PB1 still free for PWM, and keeps ICSP programming from injecting garbage into your uart. Using PB0/PB1 for the uart has similar issues on the tx5 parts too.
Unfortunately, I can't do that with MicroCore. I've designed a development board that I'm planning to sell that uses PB0/PB1. The board is also designed to work with ATtiny25/45/85 using ATTinyCore. PB0/PB1 is used by the Optiboot bootloader on the T85 too. But it shouldn't be a problem to keep the "old" pin style? It would, however, be nice if INT0 were used on PB1 instead of PCINT. The INT0 pin is in use anyway, and this means we don't occupy any PCINT ISR either.
As for RAM and flash, I'm confident I can do it with just a single byte of RAM for the buffer. For extra flash I figure about 20 extra bytes. Even less than that if I have time to fully optimize how the Serial class interfaces with the send & receive asm functions.
Excellent, looking forward to testing it. However, to speed things up a bit I will probably do a release with the current, non-blocking receive function. Then I can play around with the new implementation without having to release it before it's 100% ready.
What's really left for the current implementation is to (maybe, if possible) sort out the "Rx bug" when using high baud rates. I also have a little documentation left.
Here are the details I emailed to James. I think there are a few more cycles of skew added by the non-blocking entry points to RxByte, so the changes I've suggested would just solve the blocking Rx skew.
The mods you made to RxByte will throw off the timing by a few cycles
because the extra instructions will increase the time between
detection of the start bit and sampling the first data bit. There
should be only 2 cycles between GotStartBit: and RxBit:
Instead of using R16, use R19 to save SREG since R18-R27 don't need to
be saved by the called function.
Here's how you can rearrange the code:
RxByte:
    in r19, SREG              ; save status register
    ldi r24, 0x80             ; bit shift counter
    sbic UART_Port-2, UART_Rx ; wait for start edge
    rjmp RxByte
GotStartBit:
    cli
    ldi delayArg, RXSTART
RxBit:
You'd need to save SREG and load r24 in your non-blocking entry points
as well. Another option (probably the best one) would be to still
save SREG and load r24 after detecting the start bit, but to change
the RXSTART calculation to compensate for the extra 2 cycles.
#define RXSTARTCOUNT DIVIDE_ROUNDED(RXSTART_CYCLES - 15, 3)
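To illustrate the compensation suggested above, here is a small C sketch of the loop-count calculation. It assumes the delay loop costs 3 cycles per iteration (as the divisor in the macro suggests) and that the subtracted constant is the fixed instruction overhead before the first sample. `DIVIDE_ROUNDED_SKETCH` is a stand-in written for this example, not the macro from the header.

```c
/* Hypothetical stand-in for DIVIDE_ROUNDED: integer division rounded to
   nearest, valid for non-negative operands. */
#define DIVIDE_ROUNDED_SKETCH(n, d) (((n) + (d) / 2) / (d))

/* Loop count for the start-bit delay.
   start_cycles:    wanted delay from start edge to first sample, in cycles.
   overhead_cycles: cycles already spent in instructions outside the loop.
   Each loop iteration is assumed to cost 3 cycles. */
int rx_start_count(int start_cycles, int overhead_cycles)
{
    return DIVIDE_ROUNDED_SKETCH(start_cycles - overhead_cycles, 3);
}
```

Because the loop has 3-cycle granularity, raising the overhead from 13 to 15 changes the count for some delays but not others: with a 126-cycle target the count drops from 38 to 37, while with 125 cycles it stays at 37.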
As for the Tx/Rx pins, I'm guessing you don't feel like cutting the traces and adding jumper wires on all the boards you've fabbed? :-)
I think ( untested ) the above commit should address the changes @nerdralph suggested.
I just tested the code on some actual hardware. I'm running at 1.2 MHz (using an external signal generator) and 38400 baud.
Here's the output (inserted string is 1234567890)
What is your name traveler?
Nice to meet you 1⸮⸮⸮⸮6789⸮
Wait, it does work when LTO is disabled! It actually turns out that both the "old" and "new" code work when LTO is disabled.
EDIT:
Here's some data. Both waveforms show the old implementation (r16 etc). The white one is with LTO disabled, the yellow one enabled. The string I'm sending is 123456. My computer "understands" the white waveform.
Sorry, It's getting late here.. I was probing on the TX output of the T13. When LTO is enabled the space between each character is a little bit less, but it does not affect the result at all. It is the read routine itself that is the problem. The screenshot below is therefore irrelevant in this case.
EDIT2: I tried to compare the "old" and "new" implementation with the scope when LTO was disabled. They look more or less identical, and again, both work when LTO is disabled.
For reference, I'm using a Siglent SDG2042X to generate the clock, and this is what my current setup looks like with a custom-developed AVR 8-pin board (that I might plan to sell in the future).
Yeah, I didn't exactly intend for my wrapper to be used at "high" speeds ;-) Personally, if I could get a t13 mostly reliably working at 9600 I was happy. Given that I hard-coded 9600 as the rate for a 1.2Mhz chip, I may have even found that to be best experimentally ;-)
As an experiment, you could with the recent committed version S file
in r19, SREG
)and in h file put line 78 back to use 13 instead of 15
this of course has the effect of not disabling interrupts in the assembly language read procedure, but since you are using read_str it already disables interrupts there anyway (for the entire read of a string).
That would perhaps get closer to @nerdralph 's original timing.
As an experiment, you could with the recent committed version S file
Didn't help much :/
After a bit more investigation there does not seem to be any issues with @nerdralph's code, but rather the read_str function. Looking at its source code it seems like there's a lot more going on here that may cause the T13 to be a little too slow. Not sure if it is possible to optimize this function any further. I'm not competent enough for that at least!
You may want to leave this open, as there are some possible improvements to the Rx to free up more time for the user code to process the incoming data. Just having the current Rx code in an ISR won't help much, because it doesn't increase the time available to the main loop. A timer-based Rx sampling interrupt would solve that problem. The interrupt would be short and runs once per bit, making the time that would be wasted busy-looping for the next bit available to the main loop. This worked for a Tx soft uart I wrote last year, and the same concept should work for Rx. https://github.com/nerdralph/nerdralph/blob/master/avr/ISRUART.c
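The per-bit sampling idea can be sketched as a small state machine in C, runnable on a host for testing. It is a simplification under stated assumptions: the function is called exactly once per bit time, already aligned to mid-bit, and the start bit is detected by the same sampler rather than by an edge interrupt that arms the timer. The names are hypothetical and this is not the code in ISRUART.c.

```c
#include <stdint.h>

typedef struct {
    uint8_t bits_left;  /* 0 = idle, waiting for a start bit */
    uint8_t shift;      /* data bits collected so far */
} soft_rx_t;

/* Call once per bit time with the sampled line level (0 or 1).
   Returns the received byte (0..255) when a frame completes, else -1. */
int soft_rx_tick(soft_rx_t *rx, int level)
{
    if (rx->bits_left == 0) {
        if (level == 0) {          /* start bit seen */
            rx->bits_left = 9;     /* 8 data bits + 1 stop bit to go */
            rx->shift = 0;
        }
        return -1;
    }
    rx->bits_left--;
    if (rx->bits_left > 0) {       /* data bit, LSB arrives first */
        rx->shift = (uint8_t)((rx->shift >> 1) | (level ? 0x80 : 0));
        return -1;
    }
    return level ? rx->shift : -1; /* stop bit must be high */
}
```

Each call does only a few operations and returns, so the time between bits is left to the main loop instead of being spent busy-waiting.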
I was thinking about opening a new issue, just to prevent the thread from becoming too long. This will, after all, be a "new and improved" Rx handler that in my opinion deserves its own topic/thread/issue. Is that OK with you?
That's fine by me.
I'm working on wrapping @nerdralph's brilliant bit-banged serial code around the well-known Serial.print(). Or, lazy as I am, I'm borrowing @sleemanj's code (Print.h/cpp, HalfDuplexSerial.h/cpp/S), since it's much better than whatever I could have done in terms of memory usage and efficiency.
Since this implementation is a bit different from the "official Arduino" one, I'll provide some additional information in the README explaining what works and what does not. Another thing I'd like to show is a table of supported baud rates for different F_CPUs. However, I don't know how to calculate the error. What I want to do is to create a table with all baud rates that are guaranteed to work (with ~3% error or less). I know that the internal oscillator of the T13 is usually very off, but let's pretend this isn't an issue.
Could any of you guys help me fill out this table?
EDIT: Table filled out by calculating error + testing on real hardware
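One simple way to estimate the error for such a table is sketched below in C. It assumes the bit time can be set to any whole number of CPU cycles; the real bit-banged delay loop probably has coarser granularity, so treat the result as a best case. The function names are made up for this example, and oscillator inaccuracy is ignored, as the post suggests.

```c
/* Whole CPU cycles per bit, rounded to nearest. */
long cycles_per_bit(long f_cpu_hz, long baud)
{
    return (f_cpu_hz + baud / 2) / baud;
}

/* Signed error of the achieved baud rate, in percent of the requested
   rate. Positive means the achieved rate is faster than requested. */
double baud_error_pct(long f_cpu_hz, long baud)
{
    double actual = (double)f_cpu_hz / (double)cycles_per_bit(f_cpu_hz, baud);
    return 100.0 * (actual - (double)baud) / (double)baud;
}
```

For 9.6 MHz and 115200 baud this gives 83 cycles per bit and an error of about 0.4%. A table entry would then be accepted when the absolute error is under the chosen limit.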