hytech-racing / MCU

https://hytech-racing.github.io/MCU/index.html
GNU General Public License v3.0

On car testing minimal #45

Closed RCMast3r closed 7 months ago

RCMast3r commented 8 months ago

issues we ran into and fixes we made:

  • Initially, the pin definitions for the watchdogs and other safety features were incorrect, so the car wouldn't latch.
  • Next, we had issues with SPI: the ADC would hang in the transfer step of the SPI lib. We resolved this by moving the pinMode setup into an init function and no longer setting pin modes on the actual SPI pins, so the SPI lib controls its own pins (except for the chip select pin).
  • While debugging, we introduced an error of our own. We thought we could fix what looked like an AMS timeout issue by calling millis() in the CAN receive function instead of using the systick millis. This caused underflow errors that looked like a really big time diff, which triggered a watchdog timeout. Switching back to systick millis fixed it. The actual reason the car was not latching fully was electrical: the AMS itself could not fully latch (second contactor) because a component on the TSB was not populated.
  • The next issue we ran into was inverter startup and malformed CAN msgs being sent to the inverter. The cause was one or both of the following: uninitialized memory in the CAN msg itself for the MC_setpoint_cmd, and the fact that we were checking the length of a reference to a msg instead of a full copy. We changed the code to make a full copy of the message when writing the CAN msg to the output queue. This issue was confounded by an error in phase wiring: one motor (right front) had its phases swapped.
  • We also switched to rate limiting the send in the inverter interface itself instead of in the drivetrain system with commands, and to setting all of the fields of the setpoints command verbosely in the inverter interface. Once all of these changes were made, the inverters would consistently initialize and get to a state in which they could be commanded.
  • We then fixed the scaling of the torque limit set in the motor command. This was an attempt to resolve an issue where the inverter would throw a comms error as soon as we entered ready to drive and a command was sent.
  • The issue that we are still unsure is fixed occurs in ready to drive. After disabling the software OK check (for debugging) and the pedals check (to test just the ability to command the inverters with pedals data), we were able to command the inverters using the pedals while our testing harness (serial monitor and USB-to-CAN adapter connected) was in use. However, when this harness is disconnected, it seems that we cannot even attempt to enable the inverters.
  • We removed the print lines in the switch into the RTD state, since we thought the debug prints might be timing out the comms on the inverter. The inverter needs constant CAN msgs once initialized and errors without them.
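
For anyone who hasn't hit the millis() underflow from the third bullet before, here is a minimal sketch of the mechanism. It is not the MCU code: `elapsed_ms`, `timed_out`, `walk_through` and the 20 ms timeout are hypothetical names and values chosen for illustration. The assumption is that the loop takes one time snapshot per iteration and compares it against the timestamp stored when a CAN message was received.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical timeout, for illustration only (not the value used on the car).
constexpr uint32_t kTimeoutMs = 20;

// Elapsed time with unsigned arithmetic. If then_ms is later than now_ms the
// result does not go negative; it wraps around to a value close to 2^32.
constexpr uint32_t elapsed_ms(uint32_t now_ms, uint32_t then_ms) {
    return static_cast<uint32_t>(now_ms - then_ms);
}

constexpr bool timed_out(uint32_t now_ms, uint32_t last_rx_ms) {
    return elapsed_ms(now_ms, last_rx_ms) > kTimeoutMs;
}

// Walks through the cases discussed in the thread.
inline void walk_through() {
    // Loop snapshot taken at 1000 ms, but the message was stamped with a fresh
    // clock read at 1002 ms: the stamp is "in the future" relative to the snapshot.
    assert(elapsed_ms(1000u, 1002u) == 4294967294u);  // really big time diff
    assert(timed_out(1000u, 1002u));                  // spurious timeout

    // Same message stamped with the loop snapshot instead: no underflow.
    assert(elapsed_ms(1000u, 1000u) == 0u);
    assert(!timed_out(1000u, 1000u));

    // A genuine counter rollover is fine as long as both values come from the
    // same time source and now_ms really is the later one.
    assert(elapsed_ms(5u, 4294967291u) == 10u);
    assert(!timed_out(5u, 4294967291u));
}
```

The takeaway matches the fix above: the subtraction itself is fine, the problem is mixing two time sources so that the stored stamp can be later than the value it is compared against.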

issues that are known right now:

  • buzzer on time seems to be inconsistent

potential next steps:

  • ensure that we can still get to RTD and command the inverters while the debug harness is attached.
  • if we can still enter RTD and command the inverters with the debug harness attached (USB-to-CAN adapter on the inverter CAN line, serial monitor on the MCU), next is to see whether the inverters error while the debug harness is not attached.
  • next is to see whether we can init the inverters with just the CAN line being sniffed. if we can init, then the CAN line is unstable, maybe due to load? O-scope will need to be brought out if so.
  • if sniffing the CAN line doesn't affect behavior, then it is most likely the serial prints causing issues. to debug this we can just remove all Serial prints and / or hal_prints.
  • if this still doesn't fix the issue, and we are fine when plugged in but not when unplugged, then it is probably an electrical issue?

adding in @walkermburns's comments from Teams for visibility: " [Monday 12:29 PM] Burns, Walker M This is pretty comprehensive of the issues we have found over the past few days of testing, but to add a few things:

The improper CAN message issue that we discovered is most likely due to uninitialized memory. For those new to C/C++ or used to the Arduino compiler, C does not have a lot of the safety features that are present in other languages when it comes to memory. This is part of the reason why it is so fast, but it means that it takes some shortcuts that programmers have to be aware of. Namely in this case, the values of variables are not set when they are first declared. C++ only allocates a block of memory that is the same size as the datatype or class. Because this memory is never overwritten, it could be filled with data that was written and erased earlier in the program, or could even be from an earlier run if the processor was not turned off. Our best guess as to why this issue was not experienced to this degree with our old code is that the Arduino compiler automatically did some 0-initialization of variables, but this is not normal, and is likely because the Arduino IDE is used by a lot of new programmers who may not be aware of the initialization issues. As we swap our code over to PlatformIO, this is something that we need to be especially cognizant of, especially with our CAN library, which seems to have some issues with uninitialized memory.

On the hardware side, we had some issues with latching in our first tests. The first issue that we discovered is that our precharge would never go OK. The precharge circuit we have connects the inverters through some large resistors so that we can slowly charge up the capacitors that exist in the inverters without a large inrush current that could cause arcing inside of our relays or contactors and weld them shut. After the capacitors are charged, and the voltage is an acceptable percent of our full accumulator voltage, the precharge will turn off and the contactor will close, giving access to our full pack current through the bus bars.

The first problem was just due to a bad MOSFET that was not closing the precharge circuit and not allowing the secondary contactor (or AIR, accumulator isolation relay) to close.

After replacing the relay, we discovered another issue where the precharge would blink on and off before eventually delatching the car. Noorani, Shayan S can speak more to this, but I believe this was due to an issue with the discharge board. The discharge board works together with the precharge board to enable and disable our HV lines. Not only do we have to slowly charge the inverters, but when the car shuts off, these capacitors may still be charged. The discharge board is a similar board with resistors and a relay that is mounted in the inverter enclosure. While the car is off, its relay is normally closed, providing a current path to the car's ground and keeping the TS/HV at ground potential. When the car's shutdown circuit is activated when starting the car, this relay opens, allowing the inverters to charge up. The issue that we had is that the board was not properly grounded and the relay was not opening, meaning that the current from the precharge circuit was being constantly discharged.

Also tied to this issue is that our precharge and discharge resistors have been increased in value after review from an FSAE judge. Although I don't fully understand the delay circuitry just yet, our hardware static delay remained unchanged, which caused issues with the longer precharge/discharge time. "
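
To make the uninitialized-memory point above concrete, here is a rough sketch of value-initializing a command and a CAN frame before filling them in. Everything in it is an assumption for illustration: `SetpointCmd`, `CanFrame`, `pack_setpoints` and the byte layout are hypothetical and are not the types or layout used by the MCU repo or its CAN library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a setpoint command; field names are illustrative.
struct SetpointCmd {
    bool    inverter_enable;
    int16_t speed_setpoint_rpm;
    int16_t torque_limit;
};

// Hypothetical raw CAN frame.
struct CanFrame {
    uint32_t id;
    uint8_t  len;
    uint8_t  buf[8];
};

// Packs the command into a frame. The byte layout is made up for this sketch.
inline CanFrame pack_setpoints(uint32_t id, const SetpointCmd& cmd) {
    CanFrame frame{};  // value-initialization: id, len and all 8 bytes start at 0
    frame.id = id;
    frame.len = 8;

    const uint16_t speed  = static_cast<uint16_t>(cmd.speed_setpoint_rpm);
    const uint16_t torque = static_cast<uint16_t>(cmd.torque_limit);

    frame.buf[0] = cmd.inverter_enable ? 1 : 0;
    frame.buf[2] = static_cast<uint8_t>(speed & 0xFF);   // low byte
    frame.buf[3] = static_cast<uint8_t>(speed >> 8);     // high byte
    frame.buf[4] = static_cast<uint8_t>(torque & 0xFF);
    frame.buf[5] = static_cast<uint8_t>(torque >> 8);
    // buf[1], buf[6] and buf[7] are unused here and stay 0 because of frame{}.
    return frame;
}
```

Writing `SetpointCmd cmd{};` instead of `SetpointCmd cmd;` zeroes every field of a local variable, so any field the caller forgets to set goes out as 0 rather than as whatever was left in that memory. Returning the frame by value also hands the output queue a full copy rather than a reference to something that may change later.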

adding in @CL16gtgh's comment:

" [Monday 6:12 PM] Yang, Cecilia For accuracy of the problem statement in the 3rd point in the 1st message: the AMS (Accumulator Management System) itself does not latch. It communicates with the ACU (Accumulator Control Unit), and the latter provides a signal indicating that we're reading information from all cells in the battery pack (AMS heartbeat received). This signal is required by FSAE rules as a digital shutdown signal that controls the latching of a relay on the Main ECU. When the heartbeat is not received within a 30 second interval, the Main ECU delatches this specific relay. The unpopulated component on the TSB (Tractive System Board) is an N-channel MOSFET that latches the precharge relay. Its absence would cause a failure to precharge the inverters, and the high side AIR (Accumulator Isolation Relay) would not latch because of that. These are in fact very different parts of the process of latching our vehicle. It might sound very complex at this point, but they're actually very interesting once you get the hang of it, and I would encourage you guys to learn about it and ask us questions anytime you have any. "