Open mariopaja opened 8 months ago
CC @decsny @lmajewski This driver seems to have been written by @lmajewski but he does not seem to be a Zephyr collaborator so I cannot assign it to him
If you can narrow down if it is an LPSPI issue or LAN865 issue that would be helpful because I don't have a LAN865, and the author of the driver most likely does not have a platform with LPSPI. Since you said it works on other platforms, maybe you can compare the SPI signals with a signal analyzer between those platforms and the RT1064 to see where is the difference as your next step in debugging.
@mariopaja - I only had the Nucleo Devel board for T1S development. I've tried to use the Zephyr kernel's well defined APIs, so I assume that semaphores shall work in the same way cross platform.
From the above comment - XIAO ESP32C3
is a RISC-V SoC, so it looks like at least on other SoC it works.
Could you share which HW revision of LAN8651 do your have? (B0 or B1) With B0 for example I had issues with SW reset - hence the GPIO reset is used (and recommended). As advised by @decsny - comparing the SPI signals would be a good starting point.
Just a side note - I shall have come back to this driver in short time, as the full duplex transmission needs to be finished (now it is a very simple, working half-duplex driver).
@lmajewski
with RISK-V yes. I would definitely have to do a second test round with all the MCUs. We are using B0 variant. And btw thank you a lot for the support of this driver :)
@decsny
I am also suspicious on the driver itself because I have faced issues with the chip select, and there are other developers that are mentioning that they are facing difficulties like https://github.com/zephyrproject-rtos/zephyr/discussions/69692#discussion-6300901.
These made me a bit suspicious and confused because these two issues are not directly related to each other. I will be able to provide the SPI signals for STM, NXP, & ESP32-S3/C3 by the end of the next working week.
I am also suspicious on the driver itself because I have faced issues with the chip select, and there are other developers that are mentioning that they are facing difficulties like #69692 (comment).
There is this issue with the LPSPI CS tracked here: https://github.com/zephyrproject-rtos/zephyr/issues/16544 , maybe consider if this is a cause
@lmajewski lan865x driver seems to be fine. I was able to make it work with also with xiao_esp32s3 (XTENSA).
@decsny I checked the rt1170 and the signal seems kind of inverted. Maybe it is not a bug, maybe I have configured the SPI wrong :/ At the moment I am using the following configuration:
&lpspi1 {
status = "okay";
cs-gpios = <&gpio3 28 GPIO_ACTIVE_LOW>; /* D10 */
eth0: lan865x@0{
status = "okay";
compatible = "microchip,lan865x";
reg = <0>;
int-gpios = <&gpio3 6 GPIO_ACTIVE_LOW>; /* D9 */
rst-gpios = <&gpio3 0 GPIO_ACTIVE_LOW>; /* D8 */
local-mac-address = [00 19 05 00 00 04];
spi-max-frequency = <20000000>;
plca-enable;
plca-node-id = <3>;
plca-node-count = <8>;
plca-burst-count = <0x0>;
plca-burst-timer = <0x80>;
plca-to-timer = <0x20>;
};
};
@lmajewski
I was able to point the error to: https://github.com/zephyrproject-rtos/zephyr/blob/ca527d9f12bceb23d8ace958233edcdd63de3db7/drivers/ethernet/oa_tc6.c#L309
before getting the following error: spi_mcux_lpspi: Transfer could not start on spi@40394000: 4
and is is either transmitting the first 3 bytes of the header and then the SPI stops transmitting or the SPI doesnt start the transmittion at all like the image below:
@decsny It also seems like the SPI is transmitting 16 bytes at a time without any delay between each byte transmission (image below). Maybe this is creating a bottleneck.
Is there a possibility to lower the transmission speed between the bytes and make it look more like the behaviour of STM32 H563 (image below)?
@lmajewski for further testing I added the following code to https://github.com/zephyrproject-rtos/zephyr/blob/ca527d9f12bceb23d8ace958233edcdd63de3db7/drivers/ethernet/oa_tc6.c#L272 and only then I was able to transmit the spi chunk and deassert the interrupt.
...
#ifdef CONFIG_BOARD_MIMXRT1064_EVK
int ret;
uint8_t tx_cmd[OA_TC6_HDR_SIZE+64]={0};
hdr = sys_cpu_to_be32(hdr);
//LOG_ERR("hdr: %x", hdr);
//tx_cmd[0]=hdr[0];
tx_cmd[3]=(hdr >> 24) & 0xFF;
tx_cmd[2]=(hdr >> 16) & 0xFF;
tx_cmd[1]=(hdr >> 8) & 0xFF;
tx_cmd[0]=(hdr) & 0xFF;
if(buf_tx!=NULL){
//LOG_ERR("buf_tx!=NULL");
//TODO: Do smth
for(int i=4;i<68;i++){
//LOG_ERR("buf_tx: 0x%x",*buf_tx);
tx_cmd[i]=*buf_tx;
buf_tx++;
}
}
struct spi_buf tx_buf={.buf = &tx_cmd, .len = ARRAY_SIZE(tx_cmd),};
struct spi_buf_set tx = {.buffers = &tx_buf, .count = 1,};
uint8_t rx_cmd[64+OA_TC6_FTR_SIZE]={0};
if(buf_rx!=NULL){
LOG_ERR("buf_rx!=NULL");
//TODO: Do smth
}
const struct spi_buf rx_buf = {.buf = rx_cmd, .len = ARRAY_SIZE(rx_cmd),};
const struct spi_buf_set rx = {.buffers = &rx_buf, .count = 1,};
ret = spi_transceive_dt(tc6->spi, &tx, &rx);
if (ret < 0) {
return ret;
}
//LOG_ERR("rx_cmd[64]: %x", rx_cmd[64]);
uint32_t data;
data=rx_cmd[67];
data|=rx_cmd[66]<<8;
data|=rx_cmd[65]<<16;
data|=rx_cmd[64]<<24;
*ftr=data;
#else
...
@lmajewski I finished the patch today for the rt1064 and I could get the expected behaviour and transmit & recevie packets.
@decsny Is there any particular reason why it would work with one and not with the other?
int oa_tc6_chunk_spi_transfer(struct oa_tc6 *tc6, uint8_t *buf_rx, uint8_t *buf_tx, uint32_t hdr,
uint32_t *ftr)
{
// Correct behaviour
#ifdef CONFIG_BOARD_MIMXRT1064_EVK
int ret;
uint8_t tx_cmd[OA_TC6_HDR_SIZE + 64] = {0};
hdr = sys_cpu_to_be32(hdr);
// LOG_ERR("hdr: %x", hdr);
// tx_cmd[0]=hdr[0];
tx_cmd[3] = (hdr >> 24) & 0xFF;
tx_cmd[2] = (hdr >> 16) & 0xFF;
tx_cmd[1] = (hdr >> 8) & 0xFF;
tx_cmd[0] = (hdr) & 0xFF;
if (buf_tx != NULL) {
for (int i = 4; i < 68; i++) {
tx_cmd[i] = *buf_tx;
buf_tx++;
}
}
struct spi_buf tx_buf = {
.buf = &tx_cmd,
.len = ARRAY_SIZE(tx_cmd),
};
struct spi_buf_set tx = {
.buffers = &tx_buf,
.count = 1,
};
uint8_t rx_cmd[64 + OA_TC6_FTR_SIZE] = {0};
const struct spi_buf rx_buf = {
.buf = rx_cmd,
.len = ARRAY_SIZE(rx_cmd),
};
const struct spi_buf_set rx = {
.buffers = &rx_buf,
.count = 1,
};
ret = spi_transceive_dt(tc6->spi, &tx, &rx);
if (ret < 0) {
return ret;
}
for (int i = 0; i < 64; i++) {
*buf_rx = rx_cmd[i];
buf_rx++;
}
uint32_t data;
data = rx_cmd[67];
data |= rx_cmd[66] << 8;
data |= rx_cmd[65] << 16;
data |= rx_cmd[64] << 24;
*ftr = data;
#else
struct spi_buf tx_buf[2];
struct spi_buf rx_buf[2];
struct spi_buf_set tx;
struct spi_buf_set rx;
int ret;
hdr = sys_cpu_to_be32(hdr);
tx_buf[0].buf = &hdr;
tx_buf[0].len = sizeof(hdr);
LOG_ERR("sizeof(hdr): %d", sizeof(hdr));
tx_buf[1].buf = buf_tx;
tx_buf[1].len = tc6->cps;
LOG_ERR("tc6->cps: %d", tc6->cps);
tx.buffers = tx_buf;
tx.count = ARRAY_SIZE(tx_buf);
rx_buf[0].buf = buf_rx;
rx_buf[0].len = tc6->cps;
rx_buf[1].buf = ftr;
rx_buf[1].len = sizeof(*ftr);
rx.buffers = rx_buf;
rx.count = ARRAY_SIZE(rx_buf);
ret = spi_transceive_dt(tc6->spi, &tx, &rx);
if (ret < 0) {
return ret;
}
*ftr = sys_be32_to_cpu(*ftr);
#endif
return oa_tc6_update_status(tc6, *ftr);
}
@mariopaja - Am I correct, that you need to delay the transmission start between sending first byte and the CS being low? As well as your's CPU (to talk with LAN965x) requires some time between sending each byte?
Is there a possibility to lower the transmission speed between the bytes
The SPI transmission shall not impact the TC6 driver - it shall be tuned in the CPU setup code. It looks like the issue is either with the way LAN8651 is communicating via SPI or with the NXP's RT1170 HAL's SPI driver.
Could you check if LAN8651 documentation has any requirements for it? Maybe you's SPI setup (like polarity or CLK frequency) is different than that required?
Could you also check if lowering the SPI frequency helps with the communication stability on your setup?
@lmajewski I think it has to do with the MCUX SPI HAL, because the issue arises as TC6 tries to transmit 4 bytes of the header.
It looks like the issue is either with the way LAN8651 is communicating via SPI or with the NXP's RT1170 HAL's SPI driver.
For this reason I circumvented this issue with the patch here: https://github.com/zephyrproject-rtos/zephyr/issues/69658#issuecomment-2048975684. Instead of providing two buffers at different lengths I give one big buffer to SPI. As a solution it may be enough for our case but it is at no means a good solution.
@mariopaja - I do have an impression that the patch you mentioned would we OK only for some short time.... In the long term IMHO it would be better to have this issue investigated and correctly fixed in the HAL. Maybe NXP's maintainers (especially for HAL) shall be informed about the issue? Maybe they could look on this problem?
This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.
@mariopaja I have been looking into the lpspi driver lately because of various other issues, and I have a feeling they are related to your issue with this. For example, the chip select is not remaining asserted throughout the whole transfer, that could be the issue, or a host of other ones that I am finding. Basically, the lpspi driver is not even implementing the zephyr api correctly in my opinion at all. I am doing a major rework of this driver that probably will not show up until after LTS because it is too major of a change for this current release cycle.
This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.
This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.
We have been testing the LAN865x driver and we have had problems with RT1064 and RT1170.
Our main focus was NXP RT1064 but we were not able to implement LAN865x and therefore ended up testing it with several MCU to see what is going wrong.
From the table below it seems that LAN865x works totally fine with other ARM, RISK-V & XTENSA MCUs but not with RT1064 & RT1170
XIAO ESP32C3
XIAO ESP32S3
NUCLEO H563
RT1170 LAN865x
RT1064
Additionally a test with W5500 module RT1170 W5500