gartnera / headunit

Headunit for Android Auto
GNU Affero General Public License v3.0
347 stars 90 forks source link

FW v70 libusb issue #147

Open Trevelopment opened 5 years ago

Trevelopment commented 5 years ago

I have been testing on v70.00.100 without the upgraded USB hub and I narrowed down the issue to an updated version of libusb (actually it changed to libusbx, a fork of libusb). Could someone with knowledge of libusb take a look at this? Or it could possibly be the ssl handshake that is failing? here is what happens when you try to connect on v70 (libusb_get_version: version: 1.0.15.10646):

W: /home/mazda/headunit/hu/hu_usb.cpp:382: Start : Opening device 0x1d6b : 0x0002
W: /home/mazda/headunit/hu/hu_usb.cpp:382: Start : Opening device 0x1d6b : 0x0002
W: /home/mazda/headunit/hu/hu_usb.cpp:382: Start : Opening device 0x0424 : 0x2640
W: /home/mazda/headunit/hu/hu_usb.cpp:382: Start : Opening device 0x1997 : 0x2433
W: /home/mazda/headunit/hu/hu_usb.cpp:382: Start : Opening device 0x18d1 : 0x4ee7
W: /home/mazda/headunit/hu/hu_usb.cpp:399: Start : Device 0x18d1 : 0x4ee7 responded with protocol ver 2
W: /home/mazda/headunit/hu/hu_usb.cpp:457: Start : Wating for the device to reconnect
W: /home/mazda/headunit/hu/hu_uti.cpp:375: wait_for_device_connection : udev device removeed | node:/dev/bus/usb/002/004, subsystem:usb, devtype:usb_device

W: /home/mazda/headunit/hu/hu_uti.cpp:375: wait_for_device_connection : udev device added | node:/dev/bus/usb/002/005, subsystem:usb, devtype:usb_device

W: /home/mazda/headunit/hu/hu_usb.cpp:478: Start : Found OAP Device
W: /home/mazda/headunit/hu/hu_usb.cpp:486: Start : OK libusb_claim_interface usb_err: 0 (Success)
W: /home/mazda/headunit/hu/hu_usb.cpp:500: Start : Done get_config_descriptor config: 0x674608  num_int: 2
W: /home/mazda/headunit/hu/hu_usb.cpp:525: Start : iusb_ep_in: 0x81
W: /home/mazda/headunit/hu/hu_usb.cpp:532: Start : iusb_ep_out: 0x01
W: /home/mazda/headunit/hu/hu_usb.cpp:252: libusb_callback : libusb_callback 5 0 6 # ( 5 = LIBUSB_TRANSFER_NO_DEVICE)
E: /home/mazda/headunit/hu/hu_usb.cpp:291: libusb_callback : libusb_callback: abort
W: /home/mazda/headunit/hu/hu_usb.cpp:305: libusb_callback_send : libusb_callback_send 5 0 6 # ( 5 = LIBUSB_TRANSFER_NO_DEVICE)
E: /home/mazda/headunit/hu/hu_usb.cpp:309: libusb_callback_send : libusb_callback: abort
W: /home/mazda/headunit/hu/hu_usb.cpp:231: usb_recv_thread_main : Requested to exit
W: /home/mazda/headunit/hu/hu_usb.cpp:240: usb_recv_thread_main : libusb_handle_events_completed: 2 (hu_STATE_STARTED)
W: /home/mazda/headunit/hu/hu_usb.cpp:242: usb_recv_thread_main : USB thread exit
E: /home/mazda/headunit/hu/hu_aap.cpp:99: hu_aap_tra_recv : errorf was signaled
E: /home/mazda/headunit/hu/hu_aap.cpp:1395: hu_aap_recv_process : Recv have_len: -1
libusbx: error [op_release_interface] release interface failed, error -1 errno 22
E: /home/mazda/headunit/hu/hu_usb.cpp:196: Stop : Done libusb_release_interface usb_err: -99 (Other error)
libusbx: error [op_release_interface] release interface failed, error -1 errno 22
E: main.cpp:236: main : Something bad happened

For reference this is the log when AA opens on any lower FW version and just to make sure I used v59.00.545 (libusb_get_version: version: 1.0.9.0):

W: /home/mazda/headunit/hu/hu_usb.cpp:382: Start : Opening device 0x1d6b : 0x0002
W: /home/mazda/headunit/hu/hu_usb.cpp:382: Start : Opening device 0x1d6b : 0x0002
W: /home/mazda/headunit/hu/hu_usb.cpp:382: Start : Opening device 0x0424 : 0x2640
W: /home/mazda/headunit/hu/hu_usb.cpp:382: Start : Opening device 0x0781 : 0x5583
W: /home/mazda/headunit/hu/hu_usb.cpp:382: Start : Opening device 0x18d1 : 0x4ee7
W: /home/mazda/headunit/hu/hu_usb.cpp:399: Start : Device 0x18d1 : 0x4ee7 responded with protocol ver 2
W: /home/mazda/headunit/hu/hu_usb.cpp:457: Start : Wating for the device to reconnect
W: /home/mazda/headunit/hu/hu_uti.cpp:375: wait_for_device_connection : udev device removeed | node:/dev/bus/usb/002/004, subsystem:usb, devtype:usb_device

W: /home/mazda/headunit/hu/hu_uti.cpp:375: wait_for_device_connection : udev device added | node:/dev/bus/usb/002/005, subsystem:usb, devtype:usb_device

W: /home/mazda/headunit/hu/hu_usb.cpp:478: Start : Found OAP Device
W: /home/mazda/headunit/hu/hu_usb.cpp:486: Start : OK libusb_claim_interface usb_err: 0 (Success)
W: /home/mazda/headunit/hu/hu_usb.cpp:500: Start : Done get_config_descriptor config: 0x690b60  num_int: 2
W: /home/mazda/headunit/hu/hu_usb.cpp:525: Start : iusb_ep_in: 0x81
W: /home/mazda/headunit/hu/hu_usb.cpp:532: Start : iusb_ep_out: 0x02
W: /home/mazda/headunit/hu/hu_ssl.cpp:209: hu_ssl_begin_handshake : SSL_do_handshake() ret: -1
W: /home/mazda/headunit/hu/hu_ssl.cpp:65: send_ssl_handshake_packet : BIO_read() HS client req ret: 307
W: /home/mazda/headunit/hu/hu_ssl.cpp:232: hu_handle_SSLHandshake : BIO_write() server rsp ret: 2311
W: /home/mazda/headunit/hu/hu_ssl.cpp:235: hu_handle_SSLHandshake : SSL_do_handshake() ret: -1
W: /home/mazda/headunit/hu/hu_ssl.cpp:65: send_ssl_handshake_packet : BIO_read() HS client req ret: 1224
W: /home/mazda/headunit/hu/hu_ssl.cpp:232: hu_handle_SSLHandshake : BIO_write() server rsp ret: 51
W: /home/mazda/headunit/hu/hu_ssl.cpp:235: hu_handle_SSLHandshake : SSL_do_handshake() ret: 1
W: /home/mazda/headunit/hu/hu_ssl.cpp:261: hu_handle_SSLHandshake :   SET: iaap_state: 2 (hu_STATE_STARTED)
W: /home/mazda/headunit/hu/hu_aap.cpp:1219: hu_aap_start : Starting HU thread
Starting Android Auto...
Trevelopment commented 5 years ago

Can anyone help me with this @mishan? @lmagder? @borconi? @mishaaq? Come on I need assistance from some C++ pros here please.

borconi commented 5 years ago

Sorry @trevelopment I don't know C++... :(

lmagder commented 5 years ago

I haven't tried on the new FW but just looking at your logs and the code maybe this is a clue:

The failing version is saying that it's trying to write to an invalid device, but I noticed that they pick different endpoints. Endpoints are like of like ports if we were doing TCP so the code is listening on 0x81 in both, but writing on 0x2 in the working one and 0x1 in the failing one (iusb_ep_in and iusb_ep_out). This is weird since the device (eg the phone) defines these, so they should be the same assuming you are using the same device each time. The code searches through all the ones it enumerates to find the correct one (the bulk transfer stream) at https://github.com/gartnera/headunit/blob/master/hu/hu_usb.cpp#L502

Maybe this code is buggy and the exact layout of enumeration list is different now so we are seeing the bug?

For one I could see that https://github.com/gartnera/headunit/blob/master/hu/hu_usb.cpp#L521 should be if ((ep_add & LIBUSB_ENDPOINT_DIR_MASK) == LIBUSB_ENDPOINT_IN ) and else if ((ep_add & LIBUSB_ENDPOINT_DIR_MASK) == LIBUSB_ENDPOINT_OUT)

Maybe libusbx adds more of these flags so we can't just test against 0?

Or maybe when it's looping over the libusb_interfaces in the outer loop we need to filter somehow and before we were just getting lucky that the right one was first?

Maybe dumping out the entire libusb_config_descriptor in both cases (or even just PC vs v70 since PC works) would reveal what we are doing wrong since it should be choosing the same 0x81/0x02 endpoint in both cases I think.

EDIT: Also we are always getting the 0th config, we could call libusb_get_active_config_descriptor instead or repeatedly call libusb_get_config_descriptor until it fails increasing indexes to get all of them. I'm not sure how to filter them though.

EDIT2: 😄 Wait I see now, we probably need to loop over the configs until we find the one with bDescriptorType == LIBUSB_DT_ENDPOINT, maybe it's not config 0 anymore.

borconi commented 5 years ago

By the way I haven't following development on this for a while but, for really good performance you should use 2 separate threads one which reads and one which writes. AA actually works on multithread so for a nice performance you should really use 2 separate threads. Not sure if this proves as already implemented or not.

lmagder commented 5 years ago

@borconi we actually do, there is usb_recv_thread_main which is a dedicated thread to receive the usb data and put in the pipe to the HU thread. I was hoping it would help with stuttering, but it didn't. 😢 But you are right it is more efficient and for the rest of the code reading from a pipe is simpler than using libusb (and also abstracts the difference between WIFI and USB)

borconi commented 5 years ago

@Imagder I'll post here my Java class I'm using. My latest headunit reloaded really flies. I use a main thread for reading and a few more others for processing and sending. So the main thread reads one message (abstract from TCP or USB) looks at the header and pass it to a spearate processor based on the header. For example if it's audio I have an audio processor, which sends back an acknowledgement to the phone and then hands over the audio data to another thread which is processing the audio, same with video data, video is processed on yet another thread, GPS is running on a system and I have a listener which then sends GPS data to car but again that's a separate thread, so idea is that each action has its own thread that way there is no shuttering what so ever. Hope it makes sense. If you want the full source code (pure java) send me an email to borconie@gmail.com happy to email it over for referencing (it's not on GitHub)

Peck07 commented 5 years ago

@lmagder I don't understand your logic here "For one I could see that https://github.com/gartnera/headunit/blob/master/hu/hu_usb.cpp#L521 should be if ((ep_add & LIBUSB_ENDPOINT_DIR_MASK) == LIBUSB_ENDPOINT_IN )" & is a bitwise operator, so unless LIBUSB_ENDPOINT_DIR_MASK and LIBUSB_ENDPOINT_IN are the same constants, the condition will never be true (edit: it seems like the two constants have the same value). With the same logic, it doesn't matter if libusbx adds more flags, as the code checks for one flag at a time (I also hate these little things in c++ ,but anything not 0 is true https://www.geeksforgeeks.org/bool-data-type-in-c/ )

GrayHatter commented 5 years ago

I'm having trouble following this issue... @Trevelopment what was changed?

Trevelopment commented 5 years ago

Hey this is all good stuff thanks @lmagder I knew you would point me in the right direction.

@GrayHatter On v70 FW AA doesn't start up it just crashes over and over again with the output in the first post. I think its a libusb issue Im gonna try what @lmagder said to do and see if that works.

@borconi is your headunit reloaded designed to run on another android device or a raspberry pi? How tough do you think it would be to rewrite it to work on the MZD CMU? I am pretty proficient in Java at least more than I am at C++ that's for sure and at this point I know the integration parts pretty well I may even be able to write them in Java from scratch. Anyways I set you an email.

GrayHatter commented 5 years ago

does java even run on the mazda IVI?

borconi commented 5 years ago

@GrayHatter No I don't think Mazda units run Java out of the box, and to answer @Trevelopment my app is for other Android devices. OpenAuto might be another good point of inspiration: https://github.com/f1xpl/openauto or CranckShaft https://github.com/opencardev/crankshaft both builds should run on Raspberry Pi, which means it should work on the Mazda as well.

Trevelopment commented 5 years ago

You can compile Java using the m3 toolchain POC is the Screencast Receiver so theoretically if you put in the CMU specific code you can compile and run headunit written in Java. I may explore that but it might still be a lot of work.

Trevelopment commented 5 years ago

So trip on this, I made a few changes based on your suggestions @lmagder and now it looks like it is about to almost work:

W: /home/mazda/headunit/hu/hu_usb.cpp:478: Start : Found OAP Device
W: /home/mazda/headunit/hu/hu_usb.cpp:486: Start : OK libusb_claim_interface usb_err: 0 (Success)
W: /home/mazda/headunit/hu/hu_usb.cpp:500: Start : Done get_config_descriptor config: 0x6db2d0  num_int: 2
W: /home/mazda/headunit/hu/hu_usb.cpp:506: Start : num_altsetting: 1
W: /home/mazda/headunit/hu/hu_usb.cpp:514: Start : num_eps: 2
W: /home/mazda/headunit/hu/hu_usb.cpp:523: Start : ep_add: 129 ep_mask (128)
W: /home/mazda/headunit/hu/hu_usb.cpp:528: Start : iusb_ep_in: 0x81
W: /home/mazda/headunit/hu/hu_usb.cpp:523: Start : ep_add: 1 ep_mask (0)
W: /home/mazda/headunit/hu/hu_usb.cpp:535: Start : iusb_ep_out: 0x01
W: /home/mazda/headunit/hu/hu_ssl.cpp:209: hu_ssl_begin_handshake : SSL_do_handshake() ret: -1
W: /home/mazda/headunit/hu/hu_ssl.cpp:65: send_ssl_handshake_packet : BIO_read() HS client req ret: 307
E: /home/mazda/headunit/hu/hu_usb.cpp:291: libusb_callback : libusb_callback: abort
W: /home/mazda/headunit/hu/hu_usb.cpp:231: usb_recv_thread_main : Requested to exit
W: /home/mazda/headunit/hu/hu_usb.cpp:240: usb_recv_thread_main : libusb_handle_events_completed: 2 (hu_STATE_STARTED)
W: /home/mazda/headunit/hu/hu_usb.cpp:242: usb_recv_thread_main : USB thread exit
E: /home/mazda/headunit/hu/hu_aap.cpp:99: hu_aap_tra_recv : errorf was signaled
E: /home/mazda/headunit/hu/hu_aap.cpp:1395: hu_aap_recv_process : Recv have_len: -1
libusbx: error [do_close] Device handle closed while transfer was still being processed, but the device is still connected as far as we know
libusbx: error [do_close] A cancellation hasn't even been scheduled on the transfer for which the device is closing
E: main.cpp:236: main : Something bad happened

So what changed? I am not getting the [op_release_interface] error anymore now I see the common [do_close] errors that I have seen before so the device was connected for a quick second. but even though iusb_ep_out didn't change it looks like it is about to start working so it seems like I am going in the right direction. I guess I just don't understand the endpoints completely but I feel like I am getting close. Also, I cant open an issue in the m3 toolchain repo to update libusb-1.0 to v1.0.15 (or maybe v1.0.16?) to match the version in v70 but I'm going to suggest that we try that. @lmagder I don't have a lot of experience with toolchains so I may need a little help with that one. I am going to try compiling a new libusb binary v 1.0.15 and see how that goes or can I just take the binary file straight from the CMU is that how it works?

lmagder commented 5 years ago

@Trevelopment I'm not sure that would help, we're not actually using the version of libusb in the m3 toolchain (btw I turned on issues, not sure what was up with that) at runtime. It uses the libusb.so on the car, we are just using the function definitions in libusb.h to build our code so as long as the function names, parameters, struct members, etc are the same it should work. Libusb is designed so that they don't randomly change stuff in incompatible ways since that would break programs on your Linux PC when you upgraded so they might add new stuff, but wouldn't change or remove old stuff. So like for example, that's why I was wondering about the (ep_add & LIBUSB_ENDPOINT_DIR_MASK) == LIBUSB_ENDPOINT_IN code. It's possible that the libusb on the car might have additional flags (that we don't care about) that it returns that aren't in our header or any header we have access to. The standard libusb (so far at least) defines

enum libusb_endpoint_direction { LIBUSB_ENDPOINT_IN = 0x80, LIBUSB_ENDPOINT_OUT = 0x00 }

but later they could add something like

enum libusb_endpoint_direction { LIBUSB_ENDPOINT_IN = 0x80, LIBUSB_ENDPOINT_OUT = 0x00 LIBUSB_ENDPOINT_SOMEOTHERTHING = 0x05, }

That would make our test against 0 (since LIBUSB_ENDPOINT_DIR_MASK == LIBUSB_ENDPOINT_IN as far as we know) which doesn't technically obey their API fail. There could be multiple issues like this which make the code fail. This kind of stuff is the same reason that program that totally worked on Windows XP stopped working on Win 10, it depended on stuff that was true on XP, but not part of what MS intended to make the API specify. 😛

EDIT: Sorry I got kind of rambling 😄 but my point was yeah I think your approach of going over the code carefully and making sure we handle all the weird cases and use the libusb API 100% correctly without assuming anything is probably the best approach to fix this but also keep it working on the old CMU version.

lmagder commented 5 years ago

Also, from your log, it seems like it's falling into the else case of https://github.com/gartnera/headunit/blob/master/hu/hu_usb.cpp#L254 when finishing the first read. Maybe the status is something other than LIBUSB_TRANSFER_COMPLETED or LIBUSB_TRANSFER_OVERFLOW and we need to handle that case too now

This might help explain more about the libusb transfer object and how it works: http://libusb.sourceforge.net/api-1.0/group__asyncio.html

Trevelopment commented 5 years ago

Thanks I was totally looking at that sourceforge page already for reference. The transfer errors that were coming up before were LIBUSB_TRANSFER_NO_DEVICE but now they are just LIBUSB_TRANSFER_ERROR so thats an improvement I guess but I will keep looking over the code trying different things.

mfarreras commented 5 years ago

Hello!

Today I did the serial procedure with also v70 and tried with Android Auto from AIO. Sound doesn't work but touch yes. A minor bug is that home button is on the right.

I'm a computer science engineer and I would love to help, but I need some guidance...

Thanks to the devs that made this possible!

szsong commented 4 years ago

Hello!

Today I did the serial procedure with also v70 and tried with Android Auto from AIO. Sound doesn't work but touch yes. A minor bug is that home button is on the right.

I'm a computer science engineer and I would love to help, but I need some guidance...

Thanks to the devs that made this possible!

Same here. Do you have the updated usb hub for AA/carplay? My car has the official AA upgrade but the AA does not allow touch screen and no AA-HUD integration either. Updated my firmware to v70.00.335 and it is the same.

Mibix commented 4 years ago

Wait you guys are able to get this to work at all on 70+?

Peck07 commented 4 years ago

So trip on this, I made a few changes based on your suggestions @lmagder and now it looks like it is about to almost work:

W: /home/mazda/headunit/hu/hu_usb.cpp:478: Start : Found OAP Device
W: /home/mazda/headunit/hu/hu_usb.cpp:486: Start : OK libusb_claim_interface usb_err: 0 (Success)
W: /home/mazda/headunit/hu/hu_usb.cpp:500: Start : Done get_config_descriptor config: 0x6db2d0  num_int: 2
W: /home/mazda/headunit/hu/hu_usb.cpp:506: Start : num_altsetting: 1
W: /home/mazda/headunit/hu/hu_usb.cpp:514: Start : num_eps: 2
W: /home/mazda/headunit/hu/hu_usb.cpp:523: Start : ep_add: 129 ep_mask (128)
W: /home/mazda/headunit/hu/hu_usb.cpp:528: Start : iusb_ep_in: 0x81
W: /home/mazda/headunit/hu/hu_usb.cpp:523: Start : ep_add: 1 ep_mask (0)
W: /home/mazda/headunit/hu/hu_usb.cpp:535: Start : iusb_ep_out: 0x01
W: /home/mazda/headunit/hu/hu_ssl.cpp:209: hu_ssl_begin_handshake : SSL_do_handshake() ret: -1
W: /home/mazda/headunit/hu/hu_ssl.cpp:65: send_ssl_handshake_packet : BIO_read() HS client req ret: 307
E: /home/mazda/headunit/hu/hu_usb.cpp:291: libusb_callback : libusb_callback: abort
W: /home/mazda/headunit/hu/hu_usb.cpp:231: usb_recv_thread_main : Requested to exit
W: /home/mazda/headunit/hu/hu_usb.cpp:240: usb_recv_thread_main : libusb_handle_events_completed: 2 (hu_STATE_STARTED)
W: /home/mazda/headunit/hu/hu_usb.cpp:242: usb_recv_thread_main : USB thread exit
E: /home/mazda/headunit/hu/hu_aap.cpp:99: hu_aap_tra_recv : errorf was signaled
E: /home/mazda/headunit/hu/hu_aap.cpp:1395: hu_aap_recv_process : Recv have_len: -1
libusbx: error [do_close] Device handle closed while transfer was still being processed, but the device is still connected as far as we know
libusbx: error [do_close] A cancellation hasn't even been scheduled on the transfer for which the device is closing
E: main.cpp:236: main : Something bad happened

So what changed? I am not getting the [op_release_interface] error anymore now I see the common [do_close] errors that I have seen before so the device was connected for a quick second. but even though iusb_ep_out didn't change it looks like it is about to start working so it seems like I am going in the right direction. I guess I just don't understand the endpoints completely but I feel like I am getting close. Also, I cant open an issue in the m3 toolchain repo to update libusb-1.0 to v1.0.15 (or maybe v1.0.16?) to match the version in v70 but I'm going to suggest that we try that. @lmagder I don't have a lot of experience with toolchains so I may need a little help with that one. I am going to try compiling a new libusb binary v 1.0.15 and see how that goes or can I just take the binary file straight from the CMU is that how it works?

Just went through the code again... but could you tell what were the changes that made the code to start the ssl handshake? Also I just realized the code finds the first iusb_ep_in and iusb_ep_out, maybe that's the problem with the new libusb. Just for the sake of a quick and dirty test, you could just set the iusb_ep_out to 0x02.

silverchris commented 3 years ago

Was there any code changes to fix this? I just built a copy, and statically linked in libusb 1.0.15.10646, and it seems to work. Might just need it upgraded in our toolchain. I am working on a toolchain that has an updated GCC, so it was easy to throw in a matching version of libusb

mellda commented 3 years ago

@silverchris , is the toolchain updated now? I can see this version in readme here

https://github.com/silverchris/mazda3-build-tools/commit/77e43bf56b7e923ec838bec2f173e83a303a664e

in mazda3-build-tools repo. Your commit on the same day as your comment here.