Open satva opened 5 years ago
Hello. You're going to have to give a bit more specific information. I don't recognize the output messages with asterisks -- I presume these are printks you've added? If so, what do the numbers mean? Some other helpful things would be to configure the kernel with timestamps on printk and full debug information as well as all symbols (if your board has the memory). I hope this is easily reproducible and changing those doesn't "fix" it (meaning it's undefined behaviour somewhere).
You have either changed the sources or are using a commit other than the HEAD because I don't have a BUG_ON on line 1105 of mcp2210.h (I just pushed a change, but that's in the userspace tool.)
At the very least, I need to know the real line of code where it blew up, since you've modified the driver. I suggest you push your modifications to a public repo somewhere if you want me to try to help.
OK, so I have a sinking suspicion that you've forgotten to initialize the tx_buf of your struct spi_transfer. I just pushed a commit that better validates this -- I had validation prior, but it was only enabled if CONFIG_MCP2210_DEBUG was enabled and it was wrong anyway, because it only failed when both tx and rx buffers were NULL.
Do a pull and give it a try. You should get a BUG_ON(!xfer->tx_buf) in queue_msg this time. Please let me know.
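The difference between the two validations described above can be sketched in plain C. This is only an illustration: struct xfer_bufs and both function names are hypothetical stand-ins, not code from the driver, and the predicates are written from the description in this comment rather than copied from the source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two buffer pointers of struct spi_transfer. */
struct xfer_bufs {
    const void *tx_buf;
    void *rx_buf;
};

/* Old check, as described: rejects only when BOTH buffers are NULL. */
static bool old_check_rejects(const struct xfer_bufs *x)
{
    return x->tx_buf == NULL && x->rx_buf == NULL;
}

/* New check, as described: rejects whenever tx_buf is NULL. */
static bool new_check_rejects(const struct xfer_bufs *x)
{
    return x->tx_buf == NULL;
}
```

Under these assumptions, a transfer that sets only rx_buf passes the old check but is rejected by the new one, while a transfer with both pointers NULL is rejected by both.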
Thanks Daniel for the response.
Exact lines of code triggering the kernel crash:
    printk(KERN_ERR "******%s,%d,%d,%d,CMD:%x******\n",__FUNCTION__,__LINE__,body_size,sizeof(body_size),cmd);
    if(__builtin_constant_p(body_size))
        BUILD_BUG_ON(body_size > sizeof(msg->body.raw));
    else
        BUG_ON(body_size > sizeof(msg->body.raw));
    if (body){
        printk(KERN_ERR "******%s,%d,%d,%d******\n",__FUNCTION__,__LINE__,body_size,sizeof(body_size));
        memcpy(msg->body.raw, body, body_size);
    }
    else if (body_size){
        printk(KERN_ERR "******%s,%d,%d,%d******\n",__FUNCTION__,__LINE__,body_size,sizeof(body_size));
        BUG();
        printk(KERN_ERR "******%s,%d******\n",__FUNCTION__,__LINE__);
    }
BUG() was getting triggered in the "else if (body_size)" branch of mcp2210_init_msg(). This was happening even with CONFIG_MCP2210_DEBUG=y (and CONFIG_MCP2210=m).
Please let me know if I need to provide any more information!
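To make the failing condition easier to reason about, here is a user-space sketch of the same branch structure. Everything in it is an assumption for illustration: mock_init_msg, struct mock_msg, the enum values and the 60-byte buffer size are hypothetical, and the BUG_ON()/BUG() calls are replaced by return codes so the paths can be exercised outside the kernel.

```c
#include <stddef.h>
#include <string.h>

#define RAW_BODY_SIZE 60 /* assumed size, chosen only for this sketch */

enum init_result { INIT_OK, INIT_TOO_LARGE, INIT_NULL_BODY };

struct mock_msg {
    unsigned char raw[RAW_BODY_SIZE];
};

static enum init_result mock_init_msg(struct mock_msg *msg,
                                      const void *body, size_t body_size)
{
    /* Corresponds to BUG_ON(body_size > sizeof(msg->body.raw)). */
    if (body_size > sizeof(msg->raw))
        return INIT_TOO_LARGE;

    if (body) {
        memcpy(msg->raw, body, body_size);
        return INIT_OK;
    }

    /* Corresponds to the BUG() branch: a size was given but no data. */
    if (body_size)
        return INIT_NULL_BODY;

    return INIT_OK;
}
```

In this sketch the INIT_NULL_BODY path is reached only when body is NULL and body_size is non-zero, which matches the combination seen in the extra prints.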
Thanks for the information. Please do a git fetch and rebase your changes on top of the new upstream master branch. I think you'll get a different BUG instead, in queue_msg on line 414 of mcp2210-spi.c, this one: BUG_ON(!xfer->tx_buf). If I'm right then it will help you find your problem more easily, but if I'm wrong then let me know so I can look deeper.
I just saw your second message. I guess you deleted it? Anyway, this is a BUG instead of returning an error code so that it gets discovered during development and not after deployment, since this shouldn't happen through any normal usage of the driver.
It's hard to always be certain how to treat each exceptional condition, but when it's certain that it can only happen due to a bug somewhere, then BUG/BUG_ON are appropriate. But if you look at queue_msg you'll see that it checks for a lot of conditions that it can't handle and returns an error code along with printing a message to the kernel log instead of a BUG_ON -- these are generally conditions that can happen through normal use, although there may be a few that I should also use BUG_ON for, like an empty transfer list, etc.
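The split between the two kinds of failure can be sketched in user-space C, with assert() standing in for BUG_ON() and fprintf() standing in for a kernel log message. The function name, the length limit and the specific conditions are hypothetical and are not taken from queue_msg.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_XFER_LEN 64u /* hypothetical limit, for illustration only */

static int check_xfer_len(const size_t *len)
{
    /* Can only be NULL through a coding mistake: fail loudly,
     * which is the role BUG_ON() plays in the kernel. */
    assert(len != NULL);

    /* Reachable through normal use: log it and return an error code. */
    if (*len > MAX_XFER_LEN) {
        fprintf(stderr, "transfer of %zu bytes exceeds limit of %u\n",
                *len, MAX_XFER_LEN);
        return -EINVAL;
    }
    return 0;
}
```

The caller can recover from the -EINVAL case, whereas the assert() case is meant to be caught during development.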
Thanks Daniel for the response. I deleted the second message because I wanted to give myself more time to understand the context better.
Basically, it was hitting BUG_ON(!xfer->tx_buf); on the second iteration over the linked list.
I was trying to read the JEDEC ID of the chip, and hence two transfers were appended to the message.
Snippet from drivers/spi/spi.c (spi_write_then_read):
    spi_message_init(&message);
    memset(x, 0, sizeof(x));
    if (n_tx) {
        x[0].len = n_tx;
        spi_message_add_tail(&x[0], &message);
    }
    if (n_rx) {
        x[1].len = n_rx;
        spi_message_add_tail(&x[1], &message);
    }
    memcpy(local_buf, txbuf, n_tx);
    x[0].tx_buf = local_buf;
    x[1].rx_buf = local_buf + n_tx;
I am sending a read ID command of size 1 byte (address, here 0x9f) and expecting a response of size 6 bytes (max size of the JEDEC ID). Expecting tx_buf to be non-NULL on the first iteration of the linked list makes sense, but on the second iteration it was NULL, as expected, and only rx_buf is set. Not sure what I am missing here!
Does the mcp2210 driver handle spi_write_then_read() calls, where the first transfer is meant for transmit and the second for receive? Another point to mention: the message goes through __spi_validate as part of the function flow, and no error is returned.
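The buffer layout produced by the quoted snippet can be reproduced in user space. This is a simplified sketch: struct mock_transfer and mock_write_then_read_setup are hypothetical names, the struct keeps only three fields, and the spi_message list handling is left out.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct spi_transfer (only these fields assumed). */
struct mock_transfer {
    const void *tx_buf;
    void *rx_buf;
    size_t len;
};

/* Fill x[0] and x[1] the way the quoted spi_write_then_read() snippet does. */
static void mock_write_then_read_setup(struct mock_transfer x[2],
                                       unsigned char *local_buf,
                                       const void *txbuf,
                                       size_t n_tx, size_t n_rx)
{
    memset(x, 0, 2 * sizeof(x[0]));
    if (n_tx)
        x[0].len = n_tx;
    if (n_rx)
        x[1].len = n_rx;

    memcpy(local_buf, txbuf, n_tx);
    x[0].tx_buf = local_buf;        /* first transfer: transmit only */
    x[1].rx_buf = local_buf + n_tx; /* second transfer: receive only */
}
```

With n_tx = 1 and n_rx = 6, as in the JEDEC ID read, the second transfer ends up with tx_buf left NULL and only rx_buf set, which is the case that trips a check on !xfer->tx_buf.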
Hi,
mcp2210 was trying to do a simple register read (triggered by spi_nor_read_id), which resulted in a kernel crash as shown below. The driver was processing command 0x42 (SPI_DATA_TRANSFER) and hit this issue at repeat count #3. Going by the trace, assert(0) was triggered because the variable "body_size" was > 0 but "body" was NULL (as can be seen in the extra prints below). Any pointers, please, on which scenario leads to this case? I am a bit new to SPI itself.
(Kernel version - 4.1.35-rt41)