ttlappalainen / NMEA2000_mcp

Inherited object for using the NMEA2000 library on Arduino boards with an MCP2515 CAN bus controller.

Node with MCP2515 disappears when another device begins transmitting #5

Open speters opened 5 years ago

speters commented 5 years ago

I have a strange problem with a NMEA2000 device built with STM32F103 "blue pill" and MCP2515 board: The device disappears from the bus when another device sends for the first time.

The device does not re-appear (start sending) when other devices disconnect or stop sending.

Code is taken from the example "MessageSender".

A dump of the NMEA2000 data looks as follows (captured with candump can0 | candump2analyzer | analyzer -raw):

INFO 2019-07-18T09:24:53.873Z [analyzer] Assuming normal format with one line per packet
INFO 2019-07-18T09:24:53.873Z [analyzer] New PGN 60928 for device 15 (heap 10383 bytes)
2019-07-18-09:24:53.873 6  15 255  60928 ISO Address Claim:  Unique Number = 0x539; Manufacturer Code = ERROR; Device Instance Lower = 0; Device Instance Upper = 0; Device Function = 132; Device Class = Internetwork device; System Instance = 0; Industry Group = Marine
2019-07-18-09:24:53.873 6 015 255  60928 : 39 05 c0 ff 00 84 32 c0
INFO 2019-07-18T09:24:54.123Z [analyzer] New PGN 126993 for device 15 (heap 10398 bytes)
2019-07-18-09:24:54.124 7  15 255 126993 Heartbeat:  Data transmit offset = 600.00 s; Sequence Counter = 0
2019-07-18-09:24:54.124 7 015 255 126993 : 60 ea 00 ff ff ff ff ff

Now another device is switched on (here an ESP32 with the WindMonitor example, but a Garmin GPSmap behaves similarly):

2019-07-18-09:27:45.261 2  15 255 129026 COG & SOG, Rapid Update:  SID = 1; COG Reference = True; COG = 115.6 deg; SOG = 0.10 m/s
2019-07-18-09:27:45.261 2 015 255 129026 : 01 fc d0 4e 0a 00 ff ff
INFO 2019-07-18T09:27:45.415Z [analyzer] New PGN 60928 for device 23 (heap 21163 bytes)
2019-07-18-09:27:45.415 6  23 255  60928 ISO Address Claim:  Unique Number = 0x1; Manufacturer Code = ERROR; Device Instance Lower = 0; Device Instance Upper = 0; Device Function = 130; Device Class = External Environment; System Instance = 0; Industry Group = Marine
2019-07-18-09:27:45.415 6 023 255  60928 : 01 00 c0 ff 00 82 aa c0
INFO 2019-07-18T09:27:45.665Z [analyzer] New PGN 126993 for device 23 (heap 21178 bytes)
2019-07-18-09:27:45.666 7  23 255 126993 Heartbeat:  Data transmit offset = 600.00 s; Sequence Counter = 0
2019-07-18-09:27:45.666 7 023 255 126993 : 60 ea 00 ff ff ff ff ff
INFO 2019-07-18T09:27:46.415Z [analyzer] New PGN 130306 for device 23 (heap 21193 bytes)
2019-07-18-09:27:46.416 2  23 255 130306 Wind Data:  SID = 1; Wind Speed = 10.30 m/s; Wind Angle = 50.0 deg; Reference = Apparent
2019-07-18-09:27:46.416 2 023 255 130306 : 01 06 04 17 22 02 ff ff
2019-07-18-09:27:47.417 2  23 255 130306 Wind Data:  SID = 1; Wind Speed = 10.30 m/s; Wind Angle = 50.0 deg; Reference = Apparent
2019-07-18-09:27:47.417 2 023 255 130306 : 01 06 04 17 22 02 ff ff
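As a cross-check on the raw lines above, here is a minimal sketch that decodes the eight data bytes of the Wind Data frame by hand. It assumes the commonly documented PGN 130306 layout (SID, then speed in 0.01 m/s, then angle in 0.0001 rad, both little-endian, then a 3-bit reference); the names WindData and decodeWind are hypothetical and are not part of the NMEA2000 library.

```cpp
#include <cstdint>

// Hypothetical helper type, for illustration only.
struct WindData {
    std::uint8_t sid;        // sequence id
    double speedMs;          // wind speed in m/s
    double angleDeg;         // wind angle in degrees
    std::uint8_t reference;  // 2 is shown as "Apparent" by the analyzer
};

// Decode the 8 data bytes of a PGN 130306 frame (assumed field layout).
WindData decodeWind(const std::uint8_t (&d)[8]) {
    const double kPi = 3.14159265358979323846;
    WindData w;
    w.sid = d[0];
    w.speedMs = (d[1] | (d[2] << 8)) * 0.01;                   // 0.01 m/s per bit
    w.angleDeg = (d[3] | (d[4] << 8)) * 0.0001 * 180.0 / kPi;  // 0.0001 rad per bit
    w.reference = d[5] & 0x07;                                 // low 3 bits
    return w;
}
```

Feeding it the bytes 01 06 04 17 22 02 ff ff from the dump should give values that agree with the analyzer's decoded line (10.30 m/s, about 50 degrees, reference 2).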

Transmission from the MCP2515 MessageSender device has now stopped, and the device is no longer visible to others on the bus.

Debugging the STM32 shows that the device is still alive in this condition. I have set 3 breakpoints in tNMEA2000_mcp::InterruptHandler():

  1. if ( (frame=pRxBuffer->GetWriteFrame())!=0 ) {
  2. if ( (status=N2kCAN.checkClearTxStatus(&tempRxTxStatus,N2kCAN.getLastTxBuffer()))!=0 ) {
  3. while ( (status=N2kCAN.checkClearTxStatus(&tempRxTxStatus))!=0 && ...

As soon as another device is active on the bus, only breakpoint 1 is hit. The code at breakpoints 2 and 3 is no longer reached, and no sending takes place.

I also tried NMEA2000.EnableForward(false); to make sure the problem is not caused by blocking output of forwarded data, but that made no difference (USB was also disabled).

MCP2515 and STM32F103 is an awkward combo, but I had no other STM32 lying around that is capable of both CAN and USB.

It would be great if somebody had some hints or fixes to my problem...

ttlappalainen commented 5 years ago

But what happens in the main loop? According to your breakpoints, no one puts data into the Tx buffer, and that is why they are no longer hit. This happens if the main loop stops.

speters commented 5 years ago

Hi Timo, thx for your kind reply.

I have now set a breakpoint in main.cpp (Arduino code) on the for (;;) { loop to observe this:

When the other device is attached beforehand, setup() runs, but the following loop() call never gets executed. The interrupt routine is running, but no incoming data is read.

When the other device is attached while running, the loop() call seems to stop. The interrupt routine writes the last data to the bus, then no more data is read. The interrupt routine continues to be called, but no more data is read.

As the rx buffer of the MCP2515 fills up while stepping with the debugger, I suppose it does not make much sense to step through the code; the STM32 would not catch up with the incoming data.

So the next try was simple "debugging" with a GPIO pin toggle (I wish I knew more about the debugging facilities of the STM32 with ITM and such...) to see how time is spent in the interrupt routine (high when entering, low when leaving the routine):

This is with only the MCP2515-based device on the bus (blue channel is the GPIO pin, yellow channel shows bursts of SPI clock): [oscilloscope screenshot]

And this is how it looks when other devices are attached to the NMEA2000 bus: [oscilloscope screenshot]

It looks like the entire time is spent in the interrupt routine; the GPIO has virtually no time to go low. On a different horizontal scale it is visible that one cycle of tNMEA2000_mcp::InterruptHandler() takes about 52 µs, with no delay between cycles (the ISR seems to be retriggered instantly).

Now it is obvious why loop() seems not to run.

All of this was tested with up-to-date libraries from GitHub, with SPI clocks of both 10 MHz and 1 MHz.

So why does the MCP2515 go wild generating interrupts?

ttlappalainen commented 5 years ago

There may be a problem with buffers. MessageSender does not set any receive buffer size, so mcp_can uses its default buffer size of 2. The interrupt routine seems to have a small quirk: if the buffer is full, the interrupt flag is not cleared, so the routine starts to loop. To fix the interrupt routine, add an else branch:

if ( (frame=pRxBuffer->GetWriteFrame())!=0 ) {
...
} else { // Buffer full, skip frame
  tCANFrame FrameToSkip;
  byte ext,rtr;
  N2kCAN.readMsgBufID(status,&(FrameToSkip.id),&ext,&rtr,&(FrameToSkip.len),FrameToSkip.buf);
}

Then it should work. Anyway, any real device needs some receive buffer. You can define it in setup with:

  NMEA2000.SetN2kCANReceiveFrameBufSize(50);

The value depends on your available memory and how much traffic you need to handle. On NMEA2000 you may receive a frame every 0.4 ms, so 25 frames per 10 ms. This means that if your code may be away from updating for 20 ms, you need a frame buffer of at least 50.
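The sizing rule above can be written down as a small calculation. This is only a sketch of the arithmetic in the comment; the function name requiredRxFrames is hypothetical, and the 0.4 ms frame spacing is the figure quoted above, not a measured value.

```cpp
// Number of frames that can arrive while the main loop is busy, assuming
// frames arrive back to back at a fixed spacing. Times are in microseconds
// so the arithmetic stays in integers; the result is rounded up.
constexpr long requiredRxFrames(long busyTimeUs, long frameSpacingUs) {
    return (busyTimeUs + frameSpacingUs - 1) / frameSpacingUs;
}

// The example from the comment: 20 ms away from polling at 0.4 ms per frame.
static_assert(requiredRxFrames(20000, 400) == 50, "20 ms at 0.4 ms per frame");
```

The result would then be passed to NMEA2000.SetN2kCANReceiveFrameBufSize() before NMEA2000.Open(), with some margin added if memory allows.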

speters commented 5 years ago

Thx for this advice.

I tried it out, but the symptoms stay the same. Buffers were set to

  NMEA2000.SetN2kCANMsgBufSize(15);
  NMEA2000.SetN2kCANReceiveFrameBufSize(70);

As soon as a second device (e.g. the WindMonitor example, which I consider to be a low bus load) is attached, the controller gets saturated by the interrupt routine, which is re-entered immediately.

Now I forcefully cleared the CANINTF.MCP_RX0IF | CANINTF.MCP_RX1IF flags in the MCP2515 inside the frame skip routine:

if ( (frame=pRxBuffer->GetWriteFrame())!=0 ) {
...
} else { // Buffer full, skip frame
  tCANFrame FrameToSkip;
  byte ext,rtr;
  N2kCAN.readMsgBufID(status,&(FrameToSkip.id),&ext,&rtr,&(FrameToSkip.len),FrameToSkip.buf);

  N2kCAN.clearBufferReceiveIfFlags(MCP_RX0IF| MCP_RX1IF); // Clear MCP2515 Rx interrupt flags
}

With this change the controller is no longer locked up by the interrupt routine. But no receiving takes place either: the Rx buffer fills immediately, so frames get skipped.
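The looping discussed in this thread can be illustrated with a toy model. This is not the library's code: it only assumes that the controller keeps its interrupt line asserted for as long as a receive flag is set, and that reading the frame clears that flag. All names here (FakeController, RxRing, handlerPass, passesUntilQuiet) are hypothetical.

```cpp
#include <cstddef>

// Toy stand-in for the CAN controller: INT stays asserted while rxFlag is set.
struct FakeController {
    bool rxFlag = false;
    void frameArrives() { rxFlag = true; }
    void readFrame() { rxFlag = false; }  // reading the frame clears the flag
    bool intAsserted() const { return rxFlag; }
};

// Toy stand-in for the driver's receive ring buffer.
struct RxRing {
    std::size_t capacity;
    std::size_t used;
    explicit RxRing(std::size_t cap) : capacity(cap), used(0) {}
    bool tryStore() {
        if (used == capacity) return false;  // full: nothing stored
        ++used;
        return true;
    }
};

// One pass of a simplified handler. With discardWhenFull == false, a frame
// that cannot be stored is left in the controller and the flag stays set.
void handlerPass(FakeController& can, RxRing& ring, bool discardWhenFull) {
    if (!can.intAsserted()) return;
    if (ring.tryStore()) {
        can.readFrame();   // normal path: frame moved into the ring
    } else if (discardWhenFull) {
        can.readFrame();   // ring full: read the frame and drop it
    }
}

// Count handler passes until INT deasserts, giving up after maxPasses.
int passesUntilQuiet(bool discardWhenFull, int maxPasses) {
    FakeController can;
    RxRing ring(2);
    ring.tryStore();
    ring.tryStore();       // ring is full; the main loop has not drained it
    can.frameArrives();
    int passes = 0;
    while (can.intAsserted() && passes < maxPasses) {
        handlerPass(can, ring, discardWhenFull);
        ++passes;
    }
    return passes;
}
```

In this model, passesUntilQuiet(false, 1000) runs into the pass limit, which corresponds to the handler being re-entered without pause, while passesUntilQuiet(true, 1000) finishes after a single pass at the cost of a dropped frame.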

I think I will ditch the experiment with the MCP2515. It might not be worth the effort of digging through layers of unknown code when an easy solution is to simply use different (and equally cheap) hardware.

Thx for your support and for your great work on the NMEA2000 libraries!

mpadwick commented 3 years ago

Hi, I believe I have the same issue. I do not have a scope, but the INT pin measures constantly high on a multimeter on both the sending and the receiving MCP2515, and of course no data is being received.

Sending: ESP8266-F with MCP2515, 8MHz crystal and tja1050 CAN transceiver. Running the WindMonitor example sketch. Output on the serial port:

  1549856 : Pri:2 PGN:130306 Source:23 Dest:255 Len:8 Data:1,6,4,17,22,2,FF,FF
  1550857 : Pri:2 PGN:130306 Source:23 Dest:255 Len:8 Data:1,6,4,17,22,2,FF,FF
  1551858 : Pri:2 PGN:130306 Source:23 Dest:255 Len:8 Data:1,6,4,17,22,2,FF,FF
  1552859 : Pri:2 PGN:130306 Source:23 Dest:255 Len:8 Data:1,6,4,17,22,2,FF,FF

Receiving: Wemos lolin32 with MCP2515, 8MHz crystal and tja1050 CAN transceiver. Running the DataDisplay2 example sketch. Output on the serial port: n/a

I have tried the setup using the MCP shield's "receive_check" and "send" sketches, with success. Data was seen being sent over the CAN bus, so I know the hardware is OK and wired correctly. I have double- and triple-checked termination and wiring. I have a 120 ohm terminating resistor at each end of the bus. Between CAN_H and CAN_L I measure 60 ohm, so termination should not be an issue.

I have up-to-date libraries downloaded from https://github.com/ttlappalainen and have only added the following lines as documented in the library (of course the pins are adjusted for the different boards).

#define USE_N2K_CAN 1
#define N2k_SPI_CS_PIN 17
#define N2k_CAN_INT_PIN 16
#define USE_MCP_CAN_CLOCK_SET 8

Now here is the kicker: I have had this working intermittently with NMEA2000 messages, but for the past 3 months I have not been able to get a single NMEA2000 message through. Regular CAN works fine.

Any help or advice would be greatly appreciated.

ttlappalainen commented 3 years ago

Do you have those defines before the #include?

Have you added NMEA2000.SetN2kCANReceiveFrameBufSize(70); before NMEA2000.Open();?

Try leaving the interrupt pin undefined to work without the interrupt. So comment out #define N2k_CAN_INT_PIN 16

mpadwick commented 3 years ago

Thanks ttlappalainen for the quick reply.

Yes, the defines are before the #include. Adding NMEA2000.SetN2kCANReceiveFrameBufSize(70); before NMEA2000.Open(); made no difference. Nor did commenting out #define N2k_CAN_INT_PIN 16 on the receiving ESP32 with the MCP shield.

However, I took a trip down to the boat today and, just for kicks and giggles, connected my ESPs, expecting no positive outcome.

What I connected to my Seatalk NG network: ESP8266-F with MCP2515 8MHz crystal and tja1050 can transceiver. Running WindMonitor example sketch.

Wemos lolin32 with MCP2515 8MHz crystal and tja1050 can transceiver. Running MessageSender example sketch.

Wemos lolin32 with MCP2515 8MHz crystal and tja1050 can transceiver. Running DataDisplay2 example sketch.

My Raymarine MFD and ST60 instruments all picked up the data being sent from the WindMonitor and MessageSender example sketches. But the DataDisplay2 example sketch picked up nothing: nothing from WindMonitor, MessageSender or any of the Raymarine equipment transmitting on the network.

So it is definitely something on the receiving side. I'm going back to the boat tomorrow to see if the WindMonitor and MessageSender sketches stop working when adding NMEA2000.SetMode(tNMEA2000::N2km_ListenAndNode,);

I have also tried adding NMEA2000.SetMode(tNMEA2000::N2km_ListenOnly,32); before NMEA2000.Open(); in the DataDisplay2 example sketch; this made no difference.

I'm definitely running low on ideas.

ttlappalainen commented 3 years ago

Why do you use MCP2515 with Wemos lolin32? It is based on ESP32-WROOM and has internal CAN controller. Then you just need MCP2562 or ISO1050 for isolated systems.

Have you tried enabling bus data printing on WindMonitor with the following?

  NMEA2000.SetForwardStream(&Serial);
  NMEA2000.SetForwardType(tNMEA2000::fwdt_Text);
  NMEA2000.EnableForward(true);

I do not use mcp anymore in my devices. The ones I used had 16MHz crystals and I never had any problems with them. People have lots of problems with these 8MHz crystals. I do not know why, since it should not make any difference.

mpadwick commented 3 years ago

I'm using the MCP2515 with the lolin32 as that just happened to be what I had.

Yes, I have tried enabling bus data printing on WindMonitor. It prints out the PGNs nicely. Sending data to the CAN network (using NMEA2000 PGNs) works nicely. My MFD can see all data that I send using the MCP2515. It is receiving that is the issue, or rather receiving when using NMEA2000_MCP.h. When using the MCP_CAN library's examples, receiving works nicely.

This weekend I tried merging NMEA2000ToWiFiAsSeaSmart and WindMonitor, so that NMEA2000ToWiFiAsSeaSmart not only listened but also sent something to the CAN bus. The MFD picked up the wind data being sent, but NMEA2000ToWiFiAsSeaSmart received nothing from the network (on the network there is a temperature sensor, depth sensor, AIS, speed sensor, and more). This proves at least that the TX part of the MCP2515 is working as expected.

I've been scratching my head all day trying to understand the layers. To me it seems that NMEA2000_MCP.h is a driver that calls functions in the class of the MCP_CAN library. Which makes my very novice brain wonder: why the added layer of NMEA2000_MCP.h?

I've been reading the data sheet for the MCP2515. If I have understood the information correctly, the crystal is only used as a synchronization clock on incoming messages. If that is the case, could there be something wrong in the drivers (MCP_CAN.h, NMEA2000_MCP.h) or in the crystal itself? However, other than setting the CAN clock, I do not see how NMEA2000_MCP.h handles 8MHz crystals differently from 16MHz crystals. So I do not think the problem resides here.

Then, as receiving works nicely with the MCP_CAN library's examples, that would indicate that the crystal is working too.

I've tried to use MCP_CAN.h from Seeed-Studio, with the same result.

But the MCP_CAN.h from ttlappalainen must work with 8MHz crystals at 250 kbit/s, as I can receive when using the example sketches from that library.

I'm so puzzled!!!

Is it worth trying a 16 MHz crystal? From what I have managed to read, the crystal should be matched to the desired bit rate. I have not been able to find any recommended frequency for the MCP2515 at the 250 kbit/s that NMEA 2000 uses.
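On the crystal question, the MCP2515 data sheet gives the time quantum as TQ = 2 * (BRP + 1) / F_OSC, where BRP is the baud rate prescaler; the same bit timing is used for transmit and receive. The sketch below only evaluates that formula to show how many time quanta fit into one bit at 250 kbit/s; the function name quantaPerBit is hypothetical and the code does not touch any driver.

```cpp
// Time quanta per CAN bit for an MCP2515, from TQ = 2 * (BRP + 1) / F_OSC.
// Returns 0 when the oscillator does not divide into a whole number of quanta.
constexpr long quantaPerBit(long oscHz, long brp, long bitRate) {
    return (oscHz % (2 * (brp + 1) * bitRate) == 0)
               ? oscHz / (2 * (brp + 1) * bitRate)
               : 0;
}

// 8 MHz crystal with BRP = 0 and 16 MHz crystal with BRP = 1 both give
// 16 quanta per bit at 250 kbit/s.
static_assert(quantaPerBit(8000000L, 0, 250000L) == 16, "8 MHz, BRP 0");
static_assert(quantaPerBit(16000000L, 1, 250000L) == 16, "16 MHz, BRP 1");
```

Under this formula both crystal frequencies can produce the same bit timing, which is consistent with the expectation elsewhere in this thread that the crystal choice alone should not make a difference, provided the driver is told which crystal is fitted.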

ttlappalainen commented 3 years ago

Does WindMonitor print also received data or only sent data?

Also take care that you do not have any other mcp_can library installed at the same time. Only the one from my git.

As I said, I do not use mcp_can anymore and have never tested it with the ESP32, since it has its own internal CAN. I prefer to use the ESP32's internal CAN or Teensies.

mpadwick commented 3 years ago

WindMonitor only prints data like below:

  1549856 : Pri:2 PGN:130306 Source:23 Dest:255 Len:8 Data:1,6,4,17,22,2,FF,FF
  1550857 : Pri:2 PGN:130306 Source:23 Dest:255 Len:8 Data:1,6,4,17,22,2,FF,FF
  1551858 : Pri:2 PGN:130306 Source:23 Dest:255 Len:8 Data:1,6,4,17,22,2,FF,FF
  1552859 : Pri:2 PGN:130306 Source:23 Dest:255 Len:8 Data:1,6,4,17,22,2,FF,FF

So it does not look like it receives either. I'm quite sure that I only have your library for the MCP2515. Just to be sure, I'll remove all libraries and start over.

ttlappalainen commented 3 years ago

I hooked up my MCP2515+MCP2551 to an ESP32. I fed it with 5V and added a voltage divider to the MISO line. I did not add the interrupt line, since it was messy enough. So I have

#define USE_N2K_CAN 1
#define N2k_SPI_CS_PIN 5
#include

at the beginning, and it seems to catch messages fine.

mpadwick commented 3 years ago

Hi Timo, thanks for your patience and efforts to help me. I have made some progress. My MCP shields have the tja1050 chip, where yours has the mcp2551, as you stated. Today I tried replacing the 8MHz crystal with a 16MHz one. This made no difference. I then soldered wires onto the tja1050's RX and TX pins and connected them to the ESP32. After downloading the NMEA2000_esp library and recompiling, I started to receive PGNs from WindMonitor (uses an MCP2515 with a 16MHz crystal) and MessageSender (uses an MCP2515 with an 8MHz crystal). So this proves that sending at least works with both crystal frequencies. Unfortunately I did not consistently receive PGNs across power cycles of the board running DataDisplay2, but still something. Sometimes I received nothing, other times I got a handful of PGNs before the tja1050 stopped receiving. The only consistent behavior was that the board running DataDisplay2 stopped receiving after about a second.

So either the tja1050 is not compatible with the NMEA protocol or I have counterfeit chips on my MCP2515 shields. I will replace the tja1050 with mcp2551 chips and hope for the best.

Once again, a massive thank you for your time and, not least, for an excellent library.

ttlappalainen commented 3 years ago

You cannot use the MCP2551, since it is a 5V device and ESPs can only handle 3.3V. Voltage dividers on the inputs are just for testing, not a permanent solution. You can use the MCP2562, which has one pin for defining the I/O level.

mpadwick commented 3 years ago

Hi, I've got a quick update on my issues with the MCP2551 module. I got the module running with the code below placed before the line #include <NMEA2000_CAN.h>:

#define USE_N2K_CAN 1 // Force mcp_can
#define N2k_SPI_CS_PIN 17 // Pin for SPI CAN Select
#define USE_MCP_CAN_CLOCK_SET 8 // possible values 8 for 8Mhz and 16 for 16 Mhz clock

But it only worked with NMEA2000.SetMode(tNMEA2000::N2km_ListenOnly,NodeAddress);. The drawbacks are that it does not announce itself on the NMEA2k bus, nor can it send data to the NMEA2k bus. If I used NMEA2000.SetMode(tNMEA2000::N2km_ListenAndNode,NodeAddress); the device dropped off the NMEA2000 bus almost immediately.

After a lot of digging through the GitHub issues pages I found the magic line, hiding in a dark corner, that gets ListenAndNode to work with my cheap China MCP2515 module. After adding this section to the top of my code, it all works nicely:

#define USE_N2K_CAN 1 // Force mcp_can
#define N2k_SPI_CS_PIN 17 // Pin for SPI CAN Select
#define USE_MCP_CAN_CLOCK_SET 8 // possible values 8 for 8Mhz and 16 for 16 Mhz clock
#define MCP_CAN_RX_BUFFER_SIZE 50

Thanks, Timo, for an excellent library.

ttlappalainen commented 3 years ago

I prefer to also use the interrupt, which will automatically fill the rx buffer even if you have some delay in polling.

#define N2k_CAN_INT_PIN