christoph2 / pyxcp

ASAM XCP in Python
http://pyxcp.rtfd.org
GNU Lesser General Public License v3.0
197 stars 63 forks source link

errorhandler improvements: infinite loop due to preAction errors and … #122

Closed DevD-official closed 1 year ago

DevD-official commented 1 year ago

…complete clearing of handlerStack

Proposed changes

Types of changes

DevD-official commented 1 year ago

Please review my pull request.

DevD-official commented 1 year ago

Note: The problem mentioned in #108 will be addressed here.

christoph2 commented 1 year ago

Looks very promising! I was just testing around with some debug print statements, and I think the problem is, that every pre-condition needs its own context/stack-frame (represented by self.handlerStack)

Let's give it a try 😃 !

DevD-official commented 1 year ago

Yes. I agree with you. Let me know your feedback once you try out the changes. 😃

christoph2 commented 1 year ago

There is still a minimal problem (endless loop is back again), because on every iteration Repeater constructor is called again and repeat method (also containing a print statement) is seemingly never called...

ERROR:pyxcp.pyxcp.master.errorhandler:XcpTimeoutError [Response timed out (timeout=2.0s)]
                init REPEATER 2
DEBUG:pyxcp.transport.Base:SYNCH
DEBUG:pyxcp.transport.Base:-> [01 00 12 00 fc]
DEBUG:pyxcp.transport.Base:<- L2 C78 [fe 00]
DEBUG:pyxcp.transport.Base:DISCONNECT
DEBUG:pyxcp.transport.Base:-> [01 00 13 00 fe]
[ERROR (pyxcp.pyxcp.master.errorhandler)]: XcpTimeoutError [Response timed out (timeout=2.0s)]
[ERROR (pyxcp.pyxcp.master.errorhandler)]: XcpTimeoutError [Response timed out (timeout=2.0s)]
ERROR:pyxcp.pyxcp.master.errorhandler:XcpTimeoutError [Response timed out (timeout=2.0s)]
                init REPEATER 2
DEBUG:pyxcp.transport.Base:SYNCH
DEBUG:pyxcp.transport.Base:-> [01 00 14 00 fc]
DEBUG:pyxcp.transport.Base:<- L2 C79 [fe 00]
DEBUG:pyxcp.transport.Base:DISCONNECT
DEBUG:pyxcp.transport.Base:-> [01 00 15 00 fe]
[ERROR (pyxcp.pyxcp.master.errorhandler)]: XcpTimeoutError [Response timed out (timeout=2.0s)]
[ERROR (pyxcp.pyxcp.master.errorhandler)]: XcpTimeoutError [Response timed out (timeout=2.0s)]
ERROR:pyxcp.pyxcp.master.errorhandler:XcpTimeoutError [Response timed out (timeout=2.0s)]
                init REPEATER 2
DEBUG:pyxcp.transport.Base:SYNCH
DEBUG:pyxcp.transport.Base:-> [01 00 16 00 fc]
DEBUG:pyxcp.transport.Base:<- L2 C80 [fe 00]
DEBUG:pyxcp.transport.Base:DISCONNECT
DEBUG:pyxcp.transport.Base:-> [01 00 17 00 fe]
christoph2 commented 1 year ago

Hmm, I think we also need to stack repeater instances !?

DevD-official commented 1 year ago

When there is an error; the repeater instance is stored in the handler (that means it is part of the stack). But the scenario reported by you is an interesting one. Here preAction was successful. I think I can take a look at it.

christoph2 commented 1 year ago

Finally, errorhandler seens to work as expected -- incl. nested errors (I've also commented out positive response of SYNCH).

DevD-official commented 1 year ago

Thanks for the feedback. A positive response from the preAction is a valid scenario. In the upcoming PR, I will address this.

JavierCorado commented 11 months ago

Has there been any more testing or issues related to infinite loop?

I am currently using the library to connect through the eth transport layer via UDP. While testing a negative scenario of my code found out that while the ECU is asleep and this code gets executed e.g:

pyxcp = Master("eth", config={"HOST": self.ip, PORT": self.port, "PROTOCOL": self.eth_protocol}) pyxcp.connect()

An infinite loop is triggered.

It just loops, console output: ENTER handler: 0x7f10502fe580 EXEC 0x7f10502fe580 ERROR:pyxcp.errorhandler:XcpTimeoutError [Response timed out (timeout=2.0s)] ERR_TIMEOUT

christoph2 commented 11 months ago

This is intentional, according to XCP spec repeat ∞ times on CONNECT timeout.

tttech-miori commented 11 months ago

This is intentional, according to XCP spec repeat ∞ times on CONNECT timeout.

I agree that the standard says so, but I also reckon that real-life applications cannot block indenfinitely. IIRC it is all described here https://github.com/christoph2/pyxcp/issues/108.

My idea of reading the standard in that case is: "You cannot connect? You should try again" but we should handle that, specifically for connect(), as a user thing to do e.g. calling the connect() method re-trying that way instead of internally blocking forever.

Typical XCP use-case is getting or setting variables while testing an ECU automatically. If ECU fails to connect, the test executor hangs indefinitely without the possibility of e.g. writing the result to a report file.

Therefore, I suggest to let connect() method to "fail" returning (exception, statuc code ... the best that fits with the current API design) after the first attempt (or the 3 sync attempts, the standard is not so clear there, at least for me)

JavierCorado commented 11 months ago

An optional setting that triggers the handling @ttlorenz mentioned could be an interesting solution in the long run.

With the explanation that @christoph2 gave I now understand why it loops infinitely, but also in the specific context of using the eth transport layer with UDP, what would be the importance to loop over until it connects? At the end of the day, UDP is not expecting any response whatsoever.

Perhaps opening a specific discussion about this would be the best approach to start following this and link future PRs.

christoph2 commented 11 months ago

I think the right place for user-defined actions in the presence of errors is the new policy mechanism; i.e. on error invoke a call-out method, return code than determines how to continue.