Closed timburke closed 10 years ago
Good catch! I will track this down as soon as I'm back at a computer. On Jun 18, 2014 9:24 AM, "timburke" notifications@github.com wrote:
I've reproduced the reporting issue somewhat reliably and narrowed the problem down to construct_report. It appears to be corrupting memory, as can be seen from the following system log:
Debug (2:12:38) Constructing report... Critical (2:12:45) Taskloop full, resetting the PIC. Critical (2:12:45) 0 Critical (2:12:45) 0 Critical (2:12:45) 0 Critical (2:12:45) Device reset.
The three numbers that are logged should be the start pointer, end pointer and queue length from the taskqueue. They are all somehow zero. This causes the task queue to think that it's full and reset the device. Many other things are probably also being overwritten.
Replacing construct report with a dummy version that just base64 encodes the header without using the aggregate functionality does seems to make the problem go away.
Note that the taskloop full log entry comes immediately after 'constructing report'. This indicates that the call to construct_report does not complete successfully since the code path after it calls another log routine in the case of both success and failure.
— Reply to this email directly or view it on GitHub https://github.com/WellDone/MoMo-Firmware/issues/101.
Updates to this. There are multiple issues that I am in the process of fixing:
Task loop contents (ringbuffer size: 32)
12702: _flush_log
13066: _bus_slave_callback
10956: _scheduler_callback x 30
Tim, can you share the branch you're working on with me so I can collaborate?
On Sat, Jun 21, 2014 at 8:46 PM, timburke notifications@github.com wrote:
Updates to this. There are multiple issues that I am in the process of fixing:
- FIXED RTCC stops when running on the external crystal for an unknown reason (fix is to switch to the internal RC oscillator
- FIXED When a single task takes many seconds, the RTCC interrupt (which triggers at half second intervals when momo config is waiting to be flushed to flash) will fill up the taskloop with the same callback. See taskloop contents below:
Task loop contents (ringbuffer size: 32) 12702: _flush_log 13066: _bus_slave_callback 10956: _scheduler_callback x 30
- construct_report has an algorithmic complexity problem and takes > 9 seconds to run when there are many events to process. This is due to the slow timestamp calculation.
- construct_report also corrupts memory. I have rewritten it to not need pointer math and to use a faster timestamp generation routine but these changes need to be tested before I open a PR.
— Reply to this email directly or view it on GitHub https://github.com/WellDone/MoMo-Firmware/issues/101#issuecomment-46760060 .
Austin McGee austinwm@gmail.com (510) 200-8486
Created a PR #98 to share the branch On 6/23/14, 12:45 AM, Austin McGee wrote:
Tim, can you share the branch you're working on with me so I can collaborate?
On Sat, Jun 21, 2014 at 8:46 PM, timburke notifications@github.com wrote:
Updates to this. There are multiple issues that I am in the process of fixing:
- FIXED RTCC stops when running on the external crystal for an unknown reason (fix is to switch to the internal RC oscillator
- FIXED When a single task takes many seconds, the RTCC interrupt (which triggers at half second intervals when momo config is waiting to be flushed to flash) will fill up the taskloop with the same callback. See taskloop contents below:
Task loop contents (ringbuffer size: 32) 12702: _flush_log 13066: _bus_slave_callback 10956: _scheduler_callback x 30
- construct_report has an algorithmic complexity problem and takes > 9 seconds to run when there are many events to process. This is due to the slow timestamp calculation.
- construct_report also corrupts memory. I have rewritten it to not need pointer math and to use a faster timestamp generation routine but these changes need to be tested before I open a PR.
— Reply to this email directly or view it on GitHub
https://github.com/WellDone/MoMo-Firmware/issues/101#issuecomment-46760060 .
Austin McGee austinwm@gmail.com (510) 200-8486
— Reply to this email directly or view it on GitHub https://github.com/WellDone/MoMo-Firmware/issues/101#issuecomment-46814584.
This problem has been fixed and integrated into dev.
I've reproduced the reporting issue somewhat reliably and narrowed the problem down to construct_report. It appears to be corrupting memory, as can be seen from the following system log:
The three numbers that are logged should be the start pointer, end pointer and queue length from the taskqueue. They are all somehow zero. This causes the task queue to think that it's full and reset the device. Many other things are probably also being overwritten.
Replacing construct report with a dummy version that just base64 encodes the header without using the aggregate functionality does seems to make the problem go away.
Note that the taskloop full log entry comes immediately after 'constructing report'. This indicates that the call to construct_report does not complete successfully since the code path after it calls another log routine in the case of both success and failure.