PubInv / NASA-COG

A control system for a highly reliable ceramic oxygen concentrator developed by NASA
GNU Affero General Public License v3.0

Create a System for Internally Logging Last 10 minutes of Critical Parameters on Fault, Firmware, Task 2 #332

Closed RobertLRead closed 1 year ago

RobertLRead commented 1 year ago

We need to develop a system of logging all the main parameters (those which are already reported) for the last 10 minutes at the rate of 1 record per second.

This is 600 logs of about 10 parameters; it should not be a memory burden. It should of course be implemented as a ring buffer which is statically allocated in memory.

When a critical fault or an emergency shutdown occurs, we want to dump this 10 minutes of records into the logs (the Network logs and the Serial log). This should allow fine-grained post-mortem analysis.

I believe the current code and "reporting" mechanism will support this nicely. However, we will definitely have to make sure that reporting is done on a separate schedule that we can control. We need to write to the in-memory log once per second, rain or shine.
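
A minimal sketch of the statically allocated ring buffer described above. The names `LogRecord`, `LogRing`, and `g_fault_log`, and the fields chosen, are assumptions for illustration only; the firmware's actual report struct and logging hooks may look different.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record layout; the real firmware's report struct may differ.
struct LogRecord {
  uint32_t timestamp_ms;  // time the sample was taken
  float heater_temp_c;
  float stack_temp_c;
  float stack_voltage;
  float stack_current;
};

// Fixed-capacity ring buffer. All storage lives inside the object, so a
// global instance is statically allocated and never touches the heap.
template <size_t N>
class LogRing {
 public:
  // Store one record, overwriting the oldest once the buffer is full.
  void push(const LogRecord& r) {
    records_[head_] = r;
    head_ = (head_ + 1) % N;
    if (count_ < N) ++count_;
  }

  size_t size() const { return count_; }

  // Index 0 is the oldest retained record, size() - 1 the newest.
  const LogRecord& at(size_t i) const {
    size_t start = (head_ + N - count_) % N;
    return records_[(start + i) % N];
  }

  // Call `emit` on every retained record, oldest to newest (the fault dump).
  template <typename Emit>
  void dump(Emit emit) const {
    for (size_t i = 0; i < count_; ++i) emit(at(i));
  }

 private:
  LogRecord records_[N] = {};
  size_t head_ = 0;   // next slot to write
  size_t count_ = 0;  // number of valid records
};

// 600 records = 10 minutes at one record per second.
LogRing<600> g_fault_log;
```

In this sketch a once-per-second task would call `g_fault_log.push(...)`, and the fault handler would call `g_fault_log.dump(...)` with a callback that writes each record to the serial and network logs.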

ForrestErickson commented 1 year ago

Rename to reflect status as a firmware Task 2. This task is dependent on PubInv/NASA-MCOG#36

Further Requirements

Critical parameters:

Critical parameters that V1.1 can detect, plus some that we cannot detect and might stub.

  1. Heater Temperature
  2. Gitter Temperature
  3. Stack Temperature
  4. Set PWM
  5. FG (Tachometer)
  6. Measured PS Voltage
  7. Measured PS Current
  8. Air Flow (If we had an air flow sensor)
  9. Ambient Temperature (If we had an ambient temperature sensor)
  10. O2 Pressure (If we had a pressure sensor)
  11. O2 Purity (If we had an O2 concentration sensor)

Faults:

Possible faults that V1.1 can detect, plus some that we cannot detect and might stub.

  1. Over current on stack (more than programmed)
  2. Over voltage on stack (more than programmed)
  3. Under current on stack (less than programmed)
  4. Under voltage on stack (less than programmed)
  5. Under Voltage on +24 In (SENSE_24V)
  6. Under Voltage on +12 In (SENSE_12V)
  7. Failure (under voltage) of AUX1 supply on Power Supply 1 (SENSE_AUX1)
  8. Failure (under voltage) of AUX2 supply on Power Supply 2 (SENSE_AUX2)
  9. Thermocouple0 (Heater) failures: open, short to GND, short to VCC, no communication.
  10. Thermocouple1 (Gitter) failures: open, short to GND, short to VCC, no communication.
  11. Thermocouple2 (Stack) failures: open, short to GND, short to VCC, no communication.
  12. Blower On Failure
  13. Blower Off Failure
  14. Over pressure, O2
  15. Under pressure O2
  16. O2 purity bad

LokiMetaSmith commented 1 year ago

Finished an initial cut of this on the branch log_recorder.

It appears that creating 600 records of MachineStatusReport, which has 15 floats and 1 long, was too memory intensive:

https://forum.arduino.cc/t/compile-cpp-elf-section-bss-is-not-within-region-ram/259981

LokiMetaSmith commented 1 year ago

So, the issue is a classic time series compression problem: we have a set of variables that change over time, each of which may vary wildly or not at all, and we can't know which beforehand.

https://www.timescale.com/blog/time-series-compression-algorithms-explained/

One problem is that we immediately convert all parameters to float, which is notoriously difficult to compress.

I found one high quality library that appears to have the appropriate functions for running a time series compression algorithm, without having to write a library ourselves. https://www.arduino.cc/reference/en/libraries/serie/

Another option would be to refactor all of the code to use integers instead of floats; however, I don't see this as practical.

I'll create a test function that keeps a moving window of real-time logs (say 10-30 records) and then compresses each window into a polynomial representation for storage. There will be artefacts with any lossy compression algorithm; however, I consider that acceptable in this use case.

LokiMetaSmith commented 1 year ago

The Arduino Due has 96 KB of SRAM and 512 KB of flash. The current buffer of 600 records, each with 16 entries of 4 bytes, uses an estimated 26 KB of RAM.
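
A quick compile-time check of that estimate, assuming the record is 15 floats plus one 32-bit integer as described earlier in this thread. `StatusRecord` is a hypothetical stand-in, not the firmware's actual struct. Under that assumption the raw buffer comes to 38,400 bytes (37.5 KiB), which is more than the 26 KB estimated above, so the 26 KB figure may correspond to a smaller record or a smaller record count.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for MachineStatusReport: 15 floats and one
// 32-bit integer, matching the field counts quoted in this thread.
struct StatusRecord {
  uint32_t timestamp;
  float values[15];
};

constexpr size_t kRecordCount = 600;  // 10 minutes at 1 record per second
constexpr size_t kBufferBytes = kRecordCount * sizeof(StatusRecord);

// Both checks are evaluated by the compiler; no RAM is used at run time.
static_assert(sizeof(StatusRecord) == 64, "16 fields of 4 bytes each");
static_assert(kBufferBytes == 38400, "600 records of 64 bytes each");
```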

LokiMetaSmith commented 1 year ago

Before refactoring:

RAM: [========= ] 93.9% (used 92328 bytes from 98304 bytes)
Flash: [== ] 16.7% (used 87780 bytes from 524288 bytes)

Using the Arduino F() macro:

RAM: [========= ] 93.9% (used 92328 bytes from 98304 bytes)
Flash: [== ] 16.8% (used 87836 bytes from 524288 bytes)

Literally no difference, and over an hour's worth of refactoring.

RobertLRead commented 1 year ago

??? What refactoring did you do?

LokiMetaSmith commented 1 year ago

So, the CLI pio has a weird and wonderful feature.

Type "pio home" and it starts a locally hosted web server and opens up all of the development tools usually found in the vscode platformio tool.

On the home tab, go to "Open Project" and open up the NASA-COG firmware directory where platformio.ini resides.

Open up the Inspect tab and select the environment you're building against. Click "Inspect" and wait for the results.

And clarity!

NOTE: these are the largest objects in the program:

  64 KB    packetBuffer
  20.1 KB  machineConfig
  5.2 KB   _svfprintf_r
  3.8 KB   _strtod_l
  3.5 KB   _dtoa_r

LokiMetaSmith commented 1 year ago

In network_udp.cpp, this buffer consumes 2/3 of all available RAM:

// buffers for receiving and sending data
#define buffMax 64*1024

LokiMetaSmith commented 1 year ago

Changing

#define buffMax 64*1024

to

#define buffMax 32*1024

in network_udp.cpp, and changing MAX_RECORDS = 300; to MAX_RECORDS = 600; in machine.h, should resolve the issue. @gmulligan, can you comment on how much the buffMax buffer can be reduced without affecting functionality?
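
A rough budget for the two largest objects before and after this change, as a sketch. It assumes 64-byte records (16 fields of 4 bytes); all constant names here are hypothetical and the rest of the firmware's RAM use is not modelled.

```cpp
#include <cstddef>

constexpr size_t kSramBytes = 96 * 1024;  // Arduino Due SRAM
constexpr size_t kRecordBytes = 64;       // assumed: 16 fields x 4 bytes

// Combined size of the UDP packet buffer and the record log.
constexpr size_t bigBuffers(size_t buff_max, size_t max_records) {
  return buff_max + max_records * kRecordBytes;
}

// Before: buffMax = 64 KiB with 300 records. After: 32 KiB with 600 records.
constexpr size_t kBeforeBytes = bigBuffers(64 * 1024, 300);
constexpr size_t kAfterBytes = bigBuffers(32 * 1024, 600);

static_assert(kBeforeBytes == 84736, "64 KiB + 300 * 64 bytes");
static_assert(kAfterBytes == 71168, "32 KiB + 600 * 64 bytes");
static_assert(kAfterBytes < kSramBytes, "must leave room for everything else");
```

On these assumptions the change doubles the log depth while freeing about 13 KB overall. A `constexpr` constant also sidesteps the precedence surprises that an unparenthesized macro such as `64*1024` can cause when it is used inside a larger expression.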

LokiMetaSmith commented 1 year ago

Re: "??? What refactoring did you do?" @RobertLRead review the file diff and change log on the git branch "log_recorder"

LokiMetaSmith commented 1 year ago

#

has further integrations of log recording