bluekitchen / btstack

Dual-mode Bluetooth stack, with small memory footprint.
http://bluekitchen-gmbh.com

[feature request] ipm hci transport support #219

Closed Lefix2 closed 5 years ago

Lefix2 commented 5 years ago

Hello. For internal tests, we're trying to port BTstack to a wide-band SoC device. We're running into a limitation of the HCI transport between the two CPUs. This is not critical, but we were wondering if you plan to implement an inter-processor mailbox (IPM) HCI transport layer in addition to the existing H4/H5 layers. This would be interesting for SoC devices such as some ESP/nRF parts or the new STM32WB.

Best regards.

Edit: After some research, we found an example implementation for Zephyr OS and the STM32WB: https://github.com/zephyrproject-rtos/zephyr/pull/14188/commits/a89ba3d27e272cad17a0970d14000204b6d6ccc5

mringwal commented 5 years ago

Hi. Interesting idea. The existing hci_transport interface only requires that an implementation can send an HCI packet to the Controller and expects to receive HCI packets on its own thread.

Sending a single packet should be possible on all platforms, without any queues. For the receive side, it depends.

For the nRF/Zephyr port, we rely on the software HCI to queue packets to us: https://github.com/bluekitchen/btstack/tree/master/port/nrf5-zephyr

For ESP32, the packets are delivered from interrupt context. Here, we've enabled Host Controller to Host Flow Control and use a large (10 kB) ring buffer to queue packets. https://github.com/bluekitchen/btstack/tree/master/port/esp32/components/btstack/main.c
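The receive-side queueing described here can be sketched as a small ring buffer that the interrupt fills and the run loop thread drains. This is only an illustration under stated assumptions: one producer, one consumer, and hypothetical names (`rx_ring_t`, `rx_ring_write`, `rx_ring_read`) that are not part of BTstack or of the ESP32 port. A real port would also need whatever memory barriers or critical sections its platform requires.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative size (must be a power of two for this sketch);
   the ESP32 setup described above uses about 10 kB. */
#define RX_RING_SIZE 1024u

typedef struct {
    uint8_t storage[RX_RING_SIZE];
    volatile uint32_t head; /* total bytes written, advanced only by the producer */
    volatile uint32_t tail; /* total bytes read, advanced only by the consumer */
} rx_ring_t;

uint32_t rx_ring_bytes_used(const rx_ring_t *ring) {
    return ring->head - ring->tail;
}

/* Producer (interrupt context): store one packet as [len_lo, len_hi, payload...].
   Returns false and stores nothing if the packet does not fit. */
bool rx_ring_write(rx_ring_t *ring, const uint8_t *packet, uint16_t len) {
    uint32_t needed = (uint32_t)len + 2u;
    if (needed > RX_RING_SIZE - rx_ring_bytes_used(ring)) return false;
    uint32_t head = ring->head;
    ring->storage[head++ % RX_RING_SIZE] = (uint8_t)(len & 0xffu);
    ring->storage[head++ % RX_RING_SIZE] = (uint8_t)(len >> 8);
    for (uint16_t i = 0; i < len; i++) {
        ring->storage[head++ % RX_RING_SIZE] = packet[i];
    }
    ring->head = head; /* publish only after the whole packet is stored */
    return true;
}

/* Consumer (run loop thread): copy the oldest packet into out.
   Returns its length, or -1 if the ring is empty or out is too small. */
int rx_ring_read(rx_ring_t *ring, uint8_t *out, uint16_t out_size) {
    if (rx_ring_bytes_used(ring) < 2u) return -1;
    uint32_t tail = ring->tail;
    uint16_t len = ring->storage[tail % RX_RING_SIZE];
    len |= (uint16_t)(ring->storage[(tail + 1u) % RX_RING_SIZE] << 8);
    if (len > out_size) return -1;
    tail += 2u;
    for (uint16_t i = 0; i < len; i++) {
        out[i] = ring->storage[tail++ % RX_RING_SIZE];
    }
    ring->tail = tail;
    return (int)len;
}
```

With Host Controller to Host Flow Control enabled, the Controller is told how many packets the Host can buffer, so a ring sized for that budget should not overflow.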

Thanks for pointing to the STM32WB, I wasn't aware it had a full internal HCI layer. Guess I should order one, just in case. The new Dialog DA1469x also has a full HCI and an example app that routes it to the UART. I guess, but didn't try, that it could be used as well.

So, back to your question. If there are a few potential platforms, it might make sense to check if there are reusable parts that can be provided by the stack. For these two, I don't think there is much shared.

Lefix2 commented 5 years ago

I'm still a bit confused by the interrupt/callback mechanisms of the WB and BTstack, but the two examples you gave seem to be part of a solution! The WB uses a hardware mailbox (interrupt generation on both sides plus shared RAM). Software defines two channels: one for SHCI (system HCI, for commands like starting the stack) and one for HCI (split into a BLE buffer and an ACL buffer). Moreover, ST provides in its Cube environment a reduced stack exposing only HCI (2x smaller than the original stack). I'll try to use your examples to get BTstack working on the WB.

mringwal commented 5 years ago

Sounds good. For BTstack, the hci_transport needs to deal with the run loop environment.

What run loop do you have?

Embedded: If ACL and Events are delivered via shared RAM and an IRQ, you just set a flag and process it from a polling data source, pretty much like btstack_uart_block_embedded.

FreeRTOS: set a flag and trigger the run loop, like on esp32, or see btstack_uart_block_freertos.

For outgoing, you just put the data in the WB buffer and signal them (however that works). When you get the 'done' interrupt, you do the same as for receiving something.
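The "set a flag in the IRQ, process it from the run loop" pattern for both directions can be sketched as follows. All names (`ipm_rx_isr`, `ipm_tx_done_isr`, `ipm_transport_process`, the buffer variables) are hypothetical stand-ins, not BTstack or ST APIs. In a real embedded port, `ipm_transport_process` would be called from a polling data source; on FreeRTOS the ISRs would additionally wake the run loop task.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback type: receives one complete HCI packet on the run loop thread. */
typedef void (*ipm_packet_handler_t)(uint8_t packet_type, const uint8_t *packet, uint16_t size);

static volatile bool ipm_rx_pending;      /* set in IRQ, cleared on the run loop */
static volatile bool ipm_tx_done_pending; /* set by the 'done' IRQ */
static ipm_packet_handler_t ipm_handler;

/* Stand-ins for the shared-RAM mailbox buffer; a real port reads the vendor's buffers. */
static uint8_t  ipm_rx_type;
static uint8_t  ipm_rx_buffer[260];
static uint16_t ipm_rx_len;

void ipm_register_packet_handler(ipm_packet_handler_t handler) {
    ipm_handler = handler;
}

/* IRQ context: no stack work here, only record that something happened. */
void ipm_rx_isr(void)      { ipm_rx_pending = true; }
void ipm_tx_done_isr(void) { ipm_tx_done_pending = true; }

/* Run loop context: called on every loop iteration.
   Returns the number of pending flags that were handled. */
int ipm_transport_process(void) {
    int handled = 0;
    if (ipm_rx_pending) {
        ipm_rx_pending = false;
        if (ipm_handler) ipm_handler(ipm_rx_type, ipm_rx_buffer, ipm_rx_len);
        handled++;
    }
    if (ipm_tx_done_pending) {
        ipm_tx_done_pending = false;
        /* a real port would now tell the stack that it may send again */
        handled++;
    }
    return handled;
}
```

Keeping the ISRs this small means the stack's packet handler only ever runs on the run loop thread, which is what the hci_transport interface expects.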

Does this make the BTstack side clear? (I didn't have a chance to look at the ST docs, but the Zephyr code looks good/helpful.)

Lefix2 commented 5 years ago

I'm used to working with FreeRTOS, so I'll go with the esp32 example and btstack_uart_block_freertos. I think the BTstack side is clear! :) The ST side is... not documented well enough, but I'll find my way with the Zephyr code!

mringwal commented 5 years ago

Please report your progress. I'd love to have/add a port for it.

Lefix2 commented 5 years ago

I'm stuck on something... Are HCI command packets half-duplex? The WB raises an interrupt when an ACL packet has been received, but that's not the case for HCI commands. The WB is based on a command-response principle. Can an HCI command be sent without waiting for the response to the previous one?

If so, is it necessary to implement the can_send_packet_now member of hci_transport?

mringwal commented 5 years ago

Well, "kind of"

The Controller reports the number of HCI Commands that the Host may send (Num_HCI_Command_Packets). However, I chose to only send one at a time.

Each HCI Command triggers an HCI Event. You either get a Command Complete Event or a Command Status Event that contains the Opcode of the previous HCI Command.

So, BTstack uses HCI Commands effectively in half-duplex mode. Does this align with the WB command-response principle?
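To make the command/event pairing concrete, here is a small sketch that pulls the opcode out of a Command Complete (0x0E) or Command Status (0x0F) event, following the event layouts in the Bluetooth Core Specification. The function and macro names are made up for this example; BTstack has its own helpers and constants for this.

```c
#include <stdbool.h>
#include <stdint.h>

#define EVT_COMMAND_COMPLETE 0x0Eu
#define EVT_COMMAND_STATUS   0x0Fu

/* Extracts the opcode that a Command Complete or Command Status event answers.
   Command Complete: [0x0E, len, num_cmd_packets, opcode_lo, opcode_hi, return params...]
   Command Status:   [0x0F, len, status, num_cmd_packets, opcode_lo, opcode_hi]
   Returns false for any other event or for a truncated packet. */
bool example_event_command_opcode(const uint8_t *event, uint16_t size, uint16_t *opcode) {
    if (size < 2u) return false;
    if ((uint32_t)event[1] + 2u > size) return false; /* declared length exceeds buffer */
    switch (event[0]) {
        case EVT_COMMAND_COMPLETE:
            if (event[1] < 3u) return false;
            *opcode = (uint16_t)(event[3] | (event[4] << 8));
            return true;
        case EVT_COMMAND_STATUS:
            if (event[1] < 4u) return false;
            *opcode = (uint16_t)(event[4] | (event[5] << 8));
            return true;
        default:
            return false;
    }
}
```

For an HCI Reset (opcode 0x0C03), the matching Command Complete event is `0E 04 01 03 0C 00`, and the helper returns 0x0C03 for it.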

Lefix2 commented 5 years ago

Yes, it seems so! If indeed "Each HCI Command triggers an HCI Event", this should be OK! So can_send_packet_now is not necessary in my case if I use a semaphore (released at ACK) when sending an ACL packet?

mringwal commented 5 years ago

Hm... for HCI Commands, BTstack will not send a second HCI Command before it gets the HCI Command Complete or Status. In that case, you can emit a TRANSPORT SENT event the moment you send it to the WB. can_send_packet_now can then always return true for HCI Command packets.

For ACL, you can use a single flag. (Probably what you've suggested, I only understand you now.)
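A minimal sketch of that scheme, assuming the mailbox acknowledges each ACL buffer with an interrupt. The `wb_*` names and the packet-type macros are hypothetical and defined here only so the example is self-contained; 0x6E is the event code BTstack uses for its internal "transport packet sent" event.

```c
#include <stdbool.h>
#include <stdint.h>

#define PKT_TYPE_COMMAND          0x01u
#define PKT_TYPE_ACL              0x02u
#define EVT_TRANSPORT_PACKET_SENT 0x6Eu

typedef void (*wb_event_handler_t)(const uint8_t *event, uint16_t size);

static volatile bool wb_acl_in_flight; /* the single flag for ACL */
static wb_event_handler_t wb_event_handler;
unsigned wb_sent_events;               /* counter, only for illustration */

void wb_set_event_handler(wb_event_handler_t handler) {
    wb_event_handler = handler;
}

/* Tells the stack that the last packet left the transport. */
static void wb_emit_transport_sent(void) {
    static const uint8_t event[] = { EVT_TRANSPORT_PACKET_SENT, 0x00 };
    wb_sent_events++;
    if (wb_event_handler) wb_event_handler(event, sizeof(event));
}

bool wb_can_send_packet_now(uint8_t packet_type) {
    if (packet_type == PKT_TYPE_ACL) return !wb_acl_in_flight;
    return true; /* commands: the stack already waits for Command Complete/Status */
}

int wb_send_packet(uint8_t packet_type, const uint8_t *packet, uint16_t size) {
    (void)packet; /* a real port copies the data into the mailbox buffer here */
    (void)size;
    if (!wb_can_send_packet_now(packet_type)) return -1;
    if (packet_type == PKT_TYPE_ACL) {
        wb_acl_in_flight = true; /* cleared when the mailbox acknowledges the buffer */
        return 0;
    }
    wb_emit_transport_sent();    /* commands: report 'sent' right away */
    return 0;
}

/* Called when the mailbox acknowledges the ACL buffer
   (deferred to the run loop thread in a real port). */
void wb_acl_ack(void) {
    wb_acl_in_flight = false;
    wb_emit_transport_sent();
}
```

The point of the design is that every packet handed to the transport is eventually followed by exactly one "sent" event: immediately for commands, and after the acknowledgement for ACL data.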

Lefix2 commented 5 years ago

I managed to get some code running, but BTstack never sends BTSTACK_EVENT_STATE. It seems that it doesn't send any HCI packets at all... Are these minimal calls enough to set up a discoverable device with a battery service?

int gatt_client_setup(void)
{
    att_server_init(profile_data, NULL, NULL);
    //att_server_register_packet_handler(hci_packet_handler);

    // setup advertisements
    uint16_t adv_int_min = 0x0030;
    uint16_t adv_int_max = 0x0030;
    uint8_t adv_type = 0;
    bd_addr_t null_addr;
    memset(null_addr, 0, 6);
    gap_advertisements_set_params(adv_int_min, adv_int_max, adv_type, 0, null_addr, 0x07, 0x00);
    gap_advertisements_set_data(adv_data_len, (uint8_t*) adv_data);

    return 0;
}

int btstack_main(int argc, const char * argv[])
{
    l2cap_init();

    security_manager_setup();

    gatt_client_setup();

    // setup battery service
    battery_service_server_init(0);

    // turn on!
    hci_power_control(HCI_POWER_ON);

    gap_advertisements_enable(1);

    return 0;
}

mringwal commented 5 years ago

Hi. Could you call hci_dump_open(..) to enable HCI logging? The hci_power_control(HCI_POWER_ON) is sufficient to get started and check if everything works.

Lefix2 commented 5 years ago

I already call it in app_main:

int app_main(void const* args){

    printf("BTstack: setup\n");

    // enable packet logger
    hci_dump_open(NULL, HCI_DUMP_STDOUT);

    /// GET STARTED with BTstack ///
    btstack_memory_init();
    btstack_run_loop_init(btstack_run_loop_freertos_get_instance());

    // init HCI
    hci_init(transport_get_instance(), NULL);

    // inform about BTstack state
    hci_event_callback_registration.callback = &packet_handler;
    hci_add_event_handler(&hci_event_callback_registration);

    btstack_main(0, NULL);

    printf("BTstack: execute run loop\n");
    btstack_run_loop_execute();
    return 0;
}

Here are all the logs I get:

boot_main:384: Starting os...
BTstack: setup
[00:00:00.065] LOG -- btstack_run_loop_freertos.c.260: run loop task 536874896
[00:00:00.072] LOG -- btstack_run_loop_freertos.c.263: run loop init, queue item size 8
[00:00:00.081] LOG -- hci.c.2931: hci_power_control: 1, current mode 0

mringwal commented 5 years ago

OK. Then also add a few log_info calls to can_send_now of hci_transport_t (and its return value), to send, and to where you get a response. BTstack should try to send an HCI Reset first.

Lefix2 commented 5 years ago

Execution never calls can_send_now or send_packet... I know that I haven't implemented a flash_bank instance; perhaps it comes from there?

mringwal commented 5 years ago

Unless the flash crashes, it is not relevant yet. Can you single-step into hci_power_control(1)? :) (What dev kit do you have? Most STM dev kits allow you to flash a Segger J-Link OB, and then you can comfortably use Segger Ozone in evaluation mode for a quick check.)

Lefix2 commented 5 years ago

I spoke too early; I think log_info doesn't display anything (printf does).

I'm using a WB Nucleo with OpenOCD (I can single-step). I'll come back to you when I have more logs and have done the hci_power_control check!

Lefix2 commented 5 years ago

OK, it was a conflict between my old ISR priority and FreeRTOS MAXSysInterrupt. I get more logs now, but the device doesn't start completely:

boot_main:387: Starting os...
[00:00:00.063] LOG -- btstack_run_loop_freertos.c.260: run loop task 536874896
[00:00:00.071] LOG -- btstack_run_loop_freertos.c.263: run loop init, queue item size 8
[00:00:00.079] LOG -- main.c.355: transport_register_packet_handler
[00:00:00.086] LOG -- sm.c.3854: sm: generate new ec key
[00:00:00.091] LOG -- main.c.363: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.100] LOG -- main.c.363: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.108] LOG -- hci.c.2931: hci_power_control: 1, current mode 0
[00:00:00.115] LOG -- main.c.275: transport_init
[00:00:00.120] LOG -- main.c.280: shared SRAM2 buffers
[00:00:00.125] LOG -- main.c.281:  *BleCmdBuffer          : 0x2003009C
[00:00:00.132] LOG -- main.c.282:  *HciAclDataBuffer      : 0x200301C8
[00:00:00.138] LOG -- main.c.283:  *SystemCmdBuffer       : 0x200302D0
[00:00:00.145] LOG -- main.c.284:  *EvtPool               : 0x200303DC
[00:00:00.152] LOG -- main.c.285:  *SystemSpareEvtBuffer  : 0x20030918
[00:00:00.159] LOG -- main.c.286:  *BleSpareEvtBuffer     : 0x20030A24
[00:00:00.165] LOG -- main.c.331: transport_open
[00:00:00.170] LOG -- main.c.335: BLE stack on CPU 2 running
[00:00:00.176] LOG -- hci.c.4063: BTSTACK_EVENT_STATE 1
[00:00:00.181] EVT <= 60 01 01 
[00:00:00.184] LOG -- btstack_crypto.c.948: BTSTACK_EVENT_STATE
[00:00:00.190] LOG -- main.c.363: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.199] LOG -- main.c.363: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.207] LOG -- hci.c.1203: hci_initializing_run: substate 0, can send 1
[00:00:00.215] LOG -- main.c.363: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.223] CMD => 03 0C 00 
[00:00:00.227] LOG -- main.c.379: transport_send_packet 0x01  size:3
[00:00:00.233] LOG -- main.c.457: btstack executing run loop...
[00:00:00.239] LOG -- btstack_run_loop_freertos.c.180: RL: execute
[00:00:00.245] EVT <= 6E 00 
[00:00:00.248] LOG -- main.c.363: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1

mringwal commented 5 years ago

Well, the first HCI Reset is sent (03 0C 00). Now, we're missing the response to that. And without that, transport_can_send_packet_now probably should not return true. So, maybe we still need a bit of deep packet inspection for this. Can you return true only once and try to receive the HCI Command Complete Event?

Lefix2 commented 5 years ago

[00:00:00.181] EVT <= 60 01 01
[00:00:00.184] LOG -- btstack_crypto.c.948: BTSTACK_EVENT_STATE
[00:00:00.190] LOG -- main.c.370: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.199] LOG -- main.c.370: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.207] LOG -- hci.c.1203: hci_initializing_run: substate 0, can send 1
[00:00:00.215] LOG -- main.c.370: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.223] CMD => 03 0C 00 
[00:00:00.226] LOG -- main.c.386: transport_send_packet 0x01  size:3
[00:00:00.233] LOG -- main.c.464: btstack executing run loop...
[00:00:00.239] LOG -- btstack_run_loop_freertos.c.180: RL: execute
[00:00:00.245] EVT <= 6E 00 
[00:00:00.248] LOG -- main.c.370: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 0

So I understand that the controller doesn't send any HCI event? I'll look into this!

Lefix2 commented 5 years ago

After changing the WB stack from HCI-only to the complete stack, it seems to respond... Since ST hasn't released any documentation on it, I'll consider using the complete stack by default. I've made some progress, but the stack still does not start completely. Do you have any idea?

[00:00:00.379] LOG -- hci.c.4135: BTSTACK_EVENT_STATE 1
[00:00:00.385] EVT <= 60 01 01 
[00:00:00.388] LOG -- btstack_crypto.c.948: BTSTACK_EVENT_STATE
[00:00:00.394] LOG -- main.c.366: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.402] LOG -- main.c.366: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.411] LOG -- hci.c.1209: hci_initializing_run: substate 0, can send 1
[00:00:00.418] LOG -- main.c.366: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.427] CMD => 03 0C 00 
[00:00:00.430] LOG -- main.c.382: transport_send_packet 0x01  size:3
[00:00:00.437] LOG -- main.c.460: btstack executing run loop...
[00:00:00.443] LOG -- btstack_run_loop_freertos.c.180: RL: execute
[00:00:00.449] EVT <= 6E 00 
[00:00:00.452] LOG -- main.c.366: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.460] EVT <= 0E 04 01 03 0C 00 
[00:00:00.464] LOG -- hci.c.1538: Command complete for expected opcode 0c03 at substate 1
[00:00:00.473] LOG -- main.c.366: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.481] LOG -- main.c.366: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.490] LOG -- hci.c.1209: hci_initializing_run: substate 2, can send 1
[00:00:00.497] LOG -- main.c.366: transport_can_send_packet_now HCI_COMMAND_DATA_PACKET 1
[00:00:00.506] CMD => 01 10 00 
[00:00:00.509] LOG -- main.c.382: transport_send_packet 0x01  size:3
[00:00:00.516] EVT <= 0E 0C 01 01 10 00 09 28 00 09 30 00 28 21 
[00:00:00.522] LOG -- hci.c.1992: Manufacturer: 0x0030
[00:00:00.527] LOG -- hci.c.1538: Command complete for expected opcode 1001 at substate 3

mringwal commented 5 years ago

No idea about HCI only vs. full stack. Full stack should not provide you with ACL packets in most setups (just some idea, I've never tried).

In the log, you didn't send the Transport Sent event 6E 00. It's there after the HCI Reset but not after the Read Local Version Information command. If you send that as well, it might work...

Lefix2 commented 5 years ago

Just right! :) Do we need to send the SENT_EVENT for ACL packets too?

For now the device is discoverable, I just don't see the battery service. I think it may come from a bad size computation for the controller... Is there any function in BTstack that returns the number of GATT attributes?

I also wanted to know how you usually write your ports. I understand you used a different HAL from STMicroelectronics. Must the port file contain all the hardware initialization? I forked develop to make the port; can I publish it on my account when finished?

mringwal commented 5 years ago

The Transport Sent event is required after every HCI packet (incl. ACL). If the stack starts up fully, the GATT Service should work, too. (Post logs...) There are a few experimental functions to query our own GATT DB, but usually it's fixed, so you know how many attributes there are.

Ports depend on the 'default' environment. I've used STM32Cube for STM32 MCUs. You're free to fork in general - obviously, the copyright needs to stay intact. Then, you can also add a port.

Lefix2 commented 5 years ago

It starts up fully, but I can't see the battery service when connected (in some cases one service is shown in the advertising data). I also get disconnected after some time. I'm pretty sure I've already seen this bug with an ST controller in the past; I'll explore that. How does BTstack manage the DB space usage? The ST stack is the first stack I've used where we have to calculate the number of GATT attributes by hand (which is tedious, by the way).

OK, so just including the HAL headers without bringing in its sources will be OK? My other question: are ports standalone files or more like examples? I mean, do I have to register an HCI packet handler, or just call btstack_main and let the user call the GATT, L2CAP, etc. functions? Of course I'll keep the copyright; I just want to know whether you want ready-to-merge code or some code to get inspiration from.

EDIT: I think a good example to follow for the port is the stm32-l073rz-nucleo-em9304 folder!?

mringwal commented 5 years ago

The task of the port file is to properly setup HCI - HCI transport and chipset driver - and also setup persistent storage (mainly TLV now). Then, all examples should work.

If it starts up (state = working), all examples should work. You can try the le_counter and check the HCI log in Wireshark.

GATT DB: in most cases, incl. all examples, the binary representation is created during the build and can be stored in flash (att_db_util is optional). I don't understand why you want to calculate the number of GATT attributes. Why do you need that?

The stm32-l073rz-nucleo-em9304 on develop is the latest one I did. Yes, it is suitable as a starting point. How's the STM Flash API? Same as for L0 or F4, or different again?

With that one, I've added the minimal set of HAL files to make it compile. Not sure if that's a good idea though, as HALs / SDKs tend to get large recently (aside from NXP, which provides customized SDKs per MCU).

Lefix2 commented 5 years ago

I've set up port to be compatible with your examples, you can check the current branch on: https://github.com/Lefix2/btstack/tree/stm32wb_port

To answer your first question: the start procedure of the WB M0 asks for the number of GATT attributes, memory to allocate, etc. I thought this call was responsible for the bug, but it seems that in "HCI stack mode" these parameters are not used (I set them to 0 and the behavior is the same).

The flash API is the same, so for now I've duplicated hal_flash_bank_stm32.c but not implemented it. This gives me:

[00:00:00.700] LOG -- sm.c.3004: Persistent IR not set with sm_set_ir. Use of private addresses will cause pairing issues
[00:00:00.711] LOG -- sm.c.3008: Persistent ER not set with sm_set_er. Legacy Pairing LTK is not secure
[00:00:00.720] LOG -- sm.c.3011: Please configure btstack_tlv to let BTstack setup ER and IR keys

Then, at connection:

[00:00:25.486] LOG -- att_server.c.297: SM_EVENT_IDENTITY_RESOLVING_FAILED
[00:00:25.545] LOG -- hci.c.460: ACL classic buffers: 0 used of 0

Concerning the HAL, I think (given the number of examples you have) you made the right choice. ST brings a new HAL for each family (L0, L4, WB, F7...), so it would add a lot of code.

mringwal commented 5 years ago

Thanks for sharing your code.

Yupp. The HCI implementation should not be bothered with the number of GATT Characteristics. It only sends/receives ACL packets.

Without a flash implementation, only legacy pairing (with the restrictions listed in the log) is available; LE Secure Connections is not possible. Also, the ATT Server is supposed to store GATT Subscriptions.

If there are more STM ports, it would make sense to place the STM HAL (L0, L4, ..) in e.g. 3rd-party or platform/stm.

I'll order an STM32WB board and try it when I have some time.

Lefix2 commented 5 years ago

I'm still searching for the origin of the problem, without success... For now I'm implementing a hal_flash for the WB because it's different, as you said (no sectors, only page erase and double-word program).

Lefix2 commented 5 years ago

Same problem with the TLV implementation! I've dumped two HCI logs; you'll probably have a better understanding of the situation than me.

Each time, a disconnection happened within 30 seconds.

mringwal commented 5 years ago

Hi. Please enable log_info and process the logs with tool/create_packet_log.py. You can then open them with Wireshark.

If there's a disconnect after 30 seconds, either an ATT or SM request wasn't answered. If you run le_counter, att should be answered, and there should be no pairing. Alternatively, try sm_pairing_peripheral (it has different options in the code to enable Secure Connections).

Lefix2 commented 5 years ago

Hi, thanks for the answer! I sent .txt files because GitHub doesn't allow uploading pklg files. With log_info enabled, I see that the transaction ends when negotiating the MTU.

An MTU of 527 is requested and an MTU of 252 is returned in the response; after that, only a Disconnection Complete event happens.
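For reference, the outcome of an ATT MTU exchange is the smaller of the two receive MTUs, and never less than the LE default of 23. The helper below is a standalone illustration with a made-up name, not a BTstack function:

```c
#include <stdint.h>

#define EXAMPLE_ATT_DEFAULT_MTU 23u /* default (minimum) ATT_MTU on LE */

/* Effective ATT_MTU after an Exchange MTU Request/Response pair:
   the minimum of both sides' receive MTU, never below the default. */
uint16_t example_att_effective_mtu(uint16_t client_rx_mtu, uint16_t server_rx_mtu) {
    uint16_t mtu = client_rx_mtu < server_rx_mtu ? client_rx_mtu : server_rx_mtu;
    return mtu < EXAMPLE_ATT_DEFAULT_MTU ? (uint16_t)EXAMPLE_ATT_DEFAULT_MTU : mtu;
}
```

With the values from this log, `example_att_effective_mtu(527, 252)` gives 252, so the negotiated numbers themselves look plausible; the open question is whether the response actually left the device.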

connection_issue

Edit: on the PC side, I captured the HCI traffic with btmon: connection_issue_host

Is the ATT MTU request an ACL packet? Is the response an ACL packet too? Perhaps I made a mistake in sending ACL packets, then?

mringwal commented 5 years ago

Hi. You can upload zip files.

Hm. My best guess is that BTstack tries to send the ACL packet, but it is never actually sent. If you have some Nordic dev kits, you can try their or my LE Sniffer -> https://github.com/bluekitchen/raccoon

Lefix2 commented 5 years ago

I've edited my previous post. I think it is indeed an ACL packet that is never sent.

mringwal commented 5 years ago

Everything that is not an HCI Event (from Controller), or HCI Command (to Controller), is an ACL packet. So, yes, you've confirmed my suspicion. Now, you need to figure out why it is not sent. Do you get the 'ACL packet sent' callback?

Well, there's another HCI Command that should only be sent after the previous 'sent' was completed. So, you probably got the 'ACL sent' callback, but the packet wasn't actually sent.

Lefix2 commented 5 years ago

I finally got it working, thanks :) It was a shifted buffer when sending ACL packets. Moreover, the WB was responding with an "ACL ACK", which was a "lie". Without documentation it was hard to find the problem; fortunately, I re-checked the Zephyr code and saw this line:

https://github.com/erwango/zephyr/blob/376c85769ddd8d7f76c57a2c65ca28a5d4f00122/drivers/bluetooth/hci/ipm_stm32wb.c#L265

I'll push a fix commit. Do you accept pull requests? I'd also like to know what to write in the file headers to keep some "credits" as author while keeping your license.