pylessard / python-udsoncan

Python implementation of UDS (ISO-14229) standard.
MIT License
586 stars 202 forks source link

Getting timeout while sending 4KB packet over normal CAN #43

Closed wanam closed 3 years ago

wanam commented 4 years ago

I am using this lib with the can-isotp kernel module, and I try to start a data transfer 0x36 request with max data of 4KB over normal CAN (or with CAN-FD while setting tx_dl to 8 bytes) and STmin = 10.

During my tests, sending this packet of multi-frame takes more than 5 seconds to complete.

So I had to increase the timeout (0.1) set bellow to 10 seconds to avoid a socket timeout: https://github.com/pylessard/python-udsoncan/blob/master/udsoncan/connections.py#L260

According to UDS standard, there are no limitations for the sending duration, and according to a comment you put here, this timeout was meant for receive only, but I don't understand how it you should be able to do that, the socket's timeout will be triggered for the sending as well if it passes the 100ms.

But increasing this timeout is not enough, because I am getting a P2 timeout, because somehow the socket->send reports (here) completed within 2 or 3 seconds, and it returns the full packet size 4KB, while the sending still continues at kernel module level! this causes the timer to start for the receive and raises a timeout while the ECU is still waiting for the remaining frames of the packet.

I debugged the can-istop kernel module to see if it is reporting a complete status before it ends sending all the consecutive frames of the 4K packet, but it seems the timing seems correct at this module level ( > 5 seconds):

[178158.310957] can-isotp: isotp_sendmsg: starting txtimer for fc [178159.310929] can-isotp: isotp_tx_timer_tsklet: we did not get FC frame in time. [178159.310949] can-isotp: isotp_sendmsg: starting txtimer for fc [178159.312847] can-isotp: isotp_rcv_fc: FC frame: FS 0, BS 0, STmin 0x0A, tx_gap 10000000 [178159.312849] can-isotp: isotp_rcv_fc: starting txtimer for sending [178159.322853] can-isotp: isotp_tx_timer_tsklet: next pdu to send. [178159.332868] can-isotp: isotp_tx_timer_tsklet: next pdu to send. ...... [178165.164485] can-isotp: isotp_tx_timer_tsklet: next pdu to send. [178165.174498] can-isotp: isotp_tx_timer_tsklet: next pdu to send. [178165.174505] can-isotp: isotp_tx_timer_tsklet: we are done

@pylessard @hartkopp could you help? Being new to Python and have a very limited knowledge on C++, I'm maybe missing something about how the socket works on both layers.

pylessard commented 4 years ago

Hi @Wanam, The timeout value you are changing is the timeout of the reception thread, it is not related to UDS P2 or P2* timeout. This is just the rate at which the thread polls the socket for new data. You shouldn't have to changes this value.

In your log, I doi not see the timeout error, but you mentioned that you got a P2 timeout which is defined by UDS, not Iso-TP. The UDS timeout is set here : https://github.com/pylessard/python-udsoncan/blob/master/udsoncan/client.py#L1488

Right now, the timeout to get a response from the server starts at the beginning of the transmission. Maybe this isn't correct, but I don't have the latest UDS standard at hand and timeout are defined in it. I have the 2006 version for now. I will double-check that as soon as I can.

You can overcome your limitation by modifying the UDS timeouts (there is 3 of them), refer to the documentation here : https://udsoncan.readthedocs.io/en/latest/udsoncan/client.html#request_timeout

Let me know if that helps.

pylessard commented 4 years ago

Actually, your issue is possibly a duplicate of this issue : https://github.com/pylessard/python-udsoncan/issues/42

Only reason why I do not work on it is the fact that I don't have the standard at hand yet.

wanam commented 4 years ago

Hi Pylessard,

I think this is a different issue, in this case we have a single ack frame on the response, and the main issue in my case is that the P2 timer starts before the send is actually complete.

The timeout value I pointed before is applied also to the socket not only the receive thread: https://github.com/pylessard/python-can-isotp/blob/master/isotp/tpsock/__init__.py#L76

I didn't include the socket send timeout error in my logs, I will add the logs later, but it's happening here:

https://github.com/pylessard/python-can-isotp/blob/master/isotp/tpsock/__init__.py#L81

On Mon, Jan 13, 2020, 18:43 Pier-Yves Lessard notifications@github.com wrote:

Actually, your issue is possibly a duplicate of this issue : pylessard/python-udsoncan#42 https://github.com/pylessard/python-udsoncan/issues/42

Only reason why I do not work on it is the fact that I don't have the standard at hand yet.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/pylessard/python-can-isotp/issues/21?email_source=notifications&email_token=AAFUPWZ7E2VRJAFXXHMUNVTQ5SR33A5CNFSM4KF7ZXNKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIZT63A#issuecomment-573783916, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFUPW7LFTYEK7Z77J53WMTQ5SR33ANCNFSM4KF7ZXNA .

pylessard commented 4 years ago

I moved the issue to udsoncan as this is where the issue reside.

The issue is a little different than #42 , but the fix should be the same. I will advise as soon as I can see what the standard has to say about timings.

wanam commented 4 years ago

Sorry for late reply, here are the logs of the send timeout (CAN-FD, stmin = 10, tx_dl=8, packet size = 4KB):

venv/lib/python3.7/site-packages/udsoncan/client.py:123: in decorated
    return func(self, *args, **kwargs)
venv/lib/python3.7/site-packages/udsoncan/client.py:802: in transfer_data
    response = self.send_request(request)
venv/lib/python3.7/site-packages/udsoncan/client.py:1469: in send_request
    self.conn.send(payload)
fota/fota_connections.py:354: in send
    self._connection.send(payload)
venv/lib/python3.7/site-packages/udsoncan/connections.py:52: in send
    self.specific_send(payload)
venv/lib/python3.7/site-packages/udsoncan/connections.py:303: in specific_send
    self.tpsock.send(payload)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <ISO-TP Socket: can0, [Normal_29bits - txid:0xXXXXXXXX, rxid:0xXXXXXXXX]>
args = (b'6\x01\xef\xbe\xad\xde\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x...\xe9\xf9\x00\x10Y\xc2\x04\x00\xe9\xf8\x01\x10\r\x00@\x02\x00\x00\x00\x00\x00\x00\x00\x80\r\x00@\x02\x00\x80\x0bT\x10',)
kwargs = {}

    def send(self, *args, **kwargs):
        if not self.bound:
            raise RuntimeError("bind() must be called before using the socket")
>       return self._socket.send(*args, **kwargs)
E       socket.timeout: timed out

venv/lib/python3.7/site-packages/isotp/tpsock/__init__.py:81: timeout
wanam commented 4 years ago

According to this preview of ISO 14229_1_2013 (page 8). P2 client starts at the confirmation of request complete.

pylessard commented 4 years ago

You are probably right, thank you. Still, this may take some time to complete as it will involve some modification to the underlying layers.

wanam commented 4 years ago

I tried to do some troubleshooting to understand this issue, changed the overall time to cover only the reception time: https://github.com/wanam/python-udsoncan/commit/6c3d06937f24710ddbd9fa928b285e15a3bd9538 , so that we avoid to report the P2 timeout while the client is still sending.

We still need to remove (or increase) the socket's timeout for send case, unfortunately it is still unclear for me why the socket returns early and reports sending all the 4kb data, while the data is still buffering on the underlying layers, as seen on the isotp kernel module.

I tried the select way (like on the socketcan here), but it didn't help.

pylessard commented 4 years ago

You don't need to play with the socket timeout. The socket is non-blocking. Everything relies on the UDS timeout which, as I said, start timing from the beginning of transmission. For now, the best you can do is make the UDS timeouts big enough

wanam commented 4 years ago

I'm surprised because if I recall during my previous debugging, I tried to force it to blocking and when I read the blocking flag of the socket I see it set to blocking while sending the data. Will increase the timeout for now, thank you for your support.

wanam commented 4 years ago

Hello @pylessard , wondering if you had time to take a look at this issue? We are still not able (without increasing sending timeout) to send 4k packets with 0x36 service over normal CAN (8 bytes).

pylessard commented 4 years ago

Hi @wanam , I did look. It is a bit tricky. I am willing to change that behaviour, but this will need to be done in all Connections. I cannot make this happens with the IsoTP Kernel driver unless @hartkopp update the driver to give a mean of knowing when the transmit buffer is empty (possibly with blocking sockets).

Right now I am keeping the project as it is, if I get the possibility to update the behaviour everywhere, I will do so.

Regards

hartkopp commented 4 years ago

Hi @pylessard and @wanam ,

I'm currently on vacation and rebuilding my kitchen - so my thoughts are currently far away from programming, which might have helped now ;-)

I added this patch https://github.com/hartkopp/can-isotp/commit/792bad1343244b8669d6434a8b22d65a15e233e6 on the 4.17+ branch which is intended for Linux mainline posting.

The patch should be applicable to the master branch too. I'll test it in the evening as I have to head back to my construction site now.

wanam commented 4 years ago

Hi @hartkopp ,

I have just tested your patch and it works fine, the patch is not fast-forward merging, I merged it on my fork: https://github.com/wanam/can-isotp/commit/216a4b184e0011f7664851ea5d6cd8138b77d23d

I had to make 2 changes on pythons libs:

And according to my tests it takes up to 6 seconds to complete the send of 4KB data over normal CAN with stmin = 10ms.

Thanks again to both of you.

hartkopp commented 4 years ago

That's great! Thanks @wanam for real world testing!!

@pylessard : would you like to do some testing too before I add this extension to the master branch?

pylessard commented 4 years ago

@hartkopp : Go ahead, don't wait on me. Thank you for your support on this!

@wanam : I will check to fix this as soon as I can. I don't have much free time ( I am not rebuilding my kitchen, but 2 kids that don't like to sleep and a job is quite enough).

wanam commented 4 years ago

@pylessard same situation with my 2 kids 😅, good luck.

pylessard commented 3 years ago

Can I close this?

wanam commented 3 years ago

Yes, it should be okay now.