Open dchirgwin opened 4 years ago
Thank you for your report.
The second issue that you reported has already been fixed in the stack some time ago and will be part of the v3.6.0 release. Yes, it was a bug that could cause the currTmo to go negative.
For the 1st issue, the code was originally developed for a tick system where there is no guarantee that the TCP/IP process period is always a multiple of the system tick. For example, the system tick is 2 ms and the required callback period is 5 ms. That's why the rounding up and the calculation will render in this case the rateMs == 3 (instead of 2.5) and stackTaskRate == 6 ticks (i.e. 3 ms). Also, notice the extra call to SYS_TMR_CallbackPeriodicSetRate() to adjust the result to the correct integer of ticks.
For your example though, if the system timer resolution is 1 ms, i.e. SYS_TMR_TickCounterFrequencyGet() returns 1000, the calculation should work without any rounding taking effect: TCPIP_STACK_TICK_RATE = 5, then sysRes = 1000 rateMs = ((1000 5) + 999) / 1000 = 5 stackTaskRate = (5 1000) / 1000 = 5
so the stack TCPIP_STACK_Task() will be called every 5 ms, as intended. Is there an error in the SYS_TMR_TickCounterFrequencyGet(), it doesn't return 1000, or what is the problem?
Thanks for your reply, Adrian. You're correct that SYS_TMR_TickCounterFrequencyGet() doesn't return 1000 in my project. It returns 266, so maybe that's the root of the problem I was seeing.
This is derived from SYS_COUNT_TICK_CONV, as seen in sys_time_h2_adapter.h
#define SYS_COUNT_TICK_CONV 150000
static __inline__ uint32_t __attribute__((always_inline)) SYS_TMR_TickCounterFrequencyGet ( void )
{
return SYS_TIME_FrequencyGet()/SYS_COUNT_TICK_CONV;
}
My SYS_Time_Frequency is 40MHz, hence SYS_TMR_TickCounterFrequency is defined as 40,000,000 / 150,000 = 266
I understand the intention of the rounding, but my concern is that the SYS_TMR_TickCounterFrequency value is not actually related to how the task "ticks" are generated.
SYS_TMR_CallbackPeriodic() does not specify a number of ticks, rather it specifies a number of milliseconds. Therefore, I think that stackTaskRate
should be calculated using milliseconds, not using ticks.
Obviously, if SYS_COUNT_TICK_CONV was 1000 then 1 SYS_TMR tick = 1 millisecond and there would be no difference. But if SYS_COUNT_TICK_CONV is not 1000, this anomaly occurs.
I've just seen what may be the critical line of code:
You mention a call to SYS_TMR_CallbackPeriodicSetRate() to adjust the result. In my generated code, this call is commented out, so the adjustment is never made.
The call to the SYS_TMR_CallbackPeriodicSetRate() is commented out as the H3 timer no longer supports it. So it's the same in my code, no problem here.
There seems to be a disconnect between the setting of the timer and the result of the SYS_TMR_TickCounterFrequencyGet(). It should match the settings on your system. So practically that "hard coded" value SYS_COUNT_TICK_CONV == 150000 is wrong. I'll have to investigate to understand how that value is generated and if it should adjust itself depending on the settings. For now please adjust it manually so that you get the correct timing, until I get an answer.
The issue has been already corrected and it will be part of the next release. The sys_time_h2_adapter should not use the hard coded SYS_COUNT_TICK_CONV but the value returned by the call SYS_TIME_MSToCount(1). Please give it a try, it should fix your issue.
Of course the 2nd issue remains (the one caused by the late initialization of the system timer and wrong values for the module timeouts). I can send you a quick snapshot of the fix, if you want to implement it yourself until the release is available.
Thanks for the update. I'm pleased and impressed you found the "smoking gun" - it's horrendously complex stuff around there! Regarding the fix, I've got some workarounds in place so I'm happy to wait for the next release, but thanks for the offer.
Some of the complexity is caused by the need of signaling of RX/TX and TMO events, and running under an RTOS but also bare metal. Regarding the timing itself, I agree that it should be updated to use directly the H3 timing scheme, ms based, as the tick scheme is no longer present. I'll create an internal issue for this update. It won't be probably part of the next release, but of a future one. Thank you for reporting these issues.
I shall prefix all my other comments by saying that the timing is very complicated, and it's possible that I have misunderstood the intention of the code in places. However, I have used detailed logs from Tracealyzer to analyze this, and I'm fairly sure there is a problem. I'm running on PIC32MX under FreeRTOS.
First Issue
First issue is that
stackTaskRate
is not being calculated properly. The value is set in_TCPIPStackCreateTimer()
based on a complicated rounding calculation involvingSYS_TMR_TickCounterFrequencyGet()
.However, the timer uses
SYS_TMR_CallbackPeriodic()
which is based on a pure millisecond timer, and is not derived from the SYS_TMR_TickCounterFrequency.I am using the following configuration:
Instead of getting a call to
TCPIP_TCP_Task()
every 10ms, I get a call every 7ms (on average). This is becausestackTaskRate
has been set to 7 instead of 5, which means that_TCPIPStackSignalTmo()
is eating away atpSigEntry->currTmo
in larger chunks than it should.Solution
I think the code that sets
stackTaskRate
just needs to be simplified like this:Second (related) Issue
Originally I had a slightly different configuration:
This caused very weird errors because TCPIP_TCP_TASK_TICK_RATE is smaller than
stackTaskRate
, which caused periodic overflows in_TCPIPStackSignalTmo()
aspSigEntry->currTmo
got increasingly negative, and eventually wrapped around to be positive. The end result was that the periodic TCP processsing (e.g. the Nagle algorithm using TCPIP_TCP_AUTO_TRANSMIT_TIMEOUT_VAL) went idle for about 23 seconds every 106 seconds.When the signal handler is installed by
_TCPIPStackSignalHandlerRegister()
there is a check to make sure the period of the signal handlerasyncTmoMs
is not smaller thanstackTaskRate
.The problem is that, due to the order of initialization,
stackTaskRate
has not been initialized yet, so this test always passes!_TCPIPStackSignalHandlerRegister()
is called during initialization of the TCPIP stack, butTCPIP_STACK_CreateTimer()
is called from_TCPIPStackIsRunState()
which is called fromTCPIP_STACK_Task()
, i.e. AFTER initialization.Solution
One way to address this might be to separate the initialization of
stackTaskRate
from the actual starting of the timer.startTaskRate
can be calculated early during intialization, and the timer started fromTCPIP_STACK_Task()
as currently.