ebroecker / canmatrix

Converting Can (Controller Area Network) Database Formats .arxml .dbc .dbf .kcd ...
BSD 2-Clause "Simplified" License

Speed Issue on loading Parsing ARXML CAN PDU #796

Closed xRowe closed 4 months ago

xRowe commented 5 months ago

I have two ARXML files, each containing both CAN and Ethernet PDUs/messages.

The first is v1.arxml (23.4 MB); loading the CAN part takes about 2 s.

2024-06-02 20:15:16,814 - canmatrix.formats.arxml - DEBUG - Read arxml ...
2024-06-02 20:15:17,491 - canmatrix.formats.arxml - DEBUG - 58 frames in arxml...
2024-06-02 20:15:17,512 - canmatrix.formats.arxml - DEBUG - 58 can-frame-triggering in arxml... 
2024-06-02 20:15:17,513 - canmatrix.formats.arxml - DEBUG - 0 SIGNAL-TO-PDU-MAPPINGS in arxml...
2024-06-02 20:15:17,536 - canmatrix.formats.arxml - DEBUG - 1825 I-SIGNAL-TO-I-PDU-MAPPING in arxml...

The second is v2.arxml (33.2 MB); loading the CAN part takes about 10 s.

2024-06-02 20:00:54,715 - canmatrix.formats.arxml - DEBUG - Read arxml ...
2024-06-02 20:00:55,747 - canmatrix.formats.arxml - DEBUG - 173 frames in arxml...
2024-06-02 20:00:55,777 - canmatrix.formats.arxml - DEBUG - 173 can-frame-triggering in arxml...
2024-06-02 20:00:55,778 - canmatrix.formats.arxml - DEBUG - 0 SIGNAL-TO-PDU-MAPPINGS in arxml...
2024-06-02 20:00:55,811 - canmatrix.formats.arxml - DEBUG - 4542 I-SIGNAL-TO-I-PDU-MAPPING in arxml...

Below are the logs produced by cProfile for the two files: 23_4.log vs 33_2.log

It seems that the method get_short_name() takes too much time.
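For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of collecting a cProfile report sorted by cumulative time. The functions `slow_lookup`, `load_everything` and `profile_call` are hypothetical stand-ins invented for this example, not canmatrix code; the real ARXML load call would replace `load_everything`.

```python
import cProfile
import io
import pstats


def slow_lookup(items, key):
    # Hypothetical stand-in for a repeated linear search,
    # similar in spirit to an uncached short-name lookup.
    for name, value in items:
        if name == key:
            return value
    return ""


def load_everything(n=300):
    # Hypothetical stand-in for the ARXML load; swap in the real call when profiling.
    items = [("sig%d" % i, i) for i in range(n)]
    return [slow_lookup(items, "sig%d" % i) for i in range(n)]


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)  # show the 10 most expensive entries
    return result, stream.getvalue()


result, report = profile_call(load_everything)
print(report)
```

In a report like this, a function that is cheap per call but called very often shows up with a high call count and a large cumulative time, which is the pattern to look for with get_short_name().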

I am working on it but have made no progress. Could you help?

Thanks, Rowe

ebroecker commented 5 months ago

Hi @xRowe ,

We could also try to cache the short names. Filling an additional cache during fill_caches could help. I'd try something like the patch below. (Completely untested, may not work!)

--- a/src/canmatrix/formats/arxml.py
+++ b/src/canmatrix/formats/arxml.py
@@ -57,12 +57,14 @@ class Earxml:
     def __init__(self):
         self.xml_element_cache = dict()  # type: typing.Dict[str, _Element]
         self.path_cache = {}
+        self.sn_cache = {}

     def fill_caches(self, start_element=None, ar_path=""):
         if start_element is None:
             start_element = self.root
             self.path_cache = {}
         if start_element.tag == self.ns + "SHORT-NAME":
+            self.sn_cache[start_element.getparent()] = start_element.text
             return start_element.text
         for sub_element in start_element:
             text = sub_element.text
@@ -153,6 +155,7 @@ class Earxml:

     def get_short_name(self, element):
         # type: (_Element, str) -> str
         """Get element short name."""
+        return self.sn_cache.get(element, "")
         if element is None:
             return ""

xRowe commented 5 months ago

The above code does not work, and I don't see how it could. I will try it this way

ebroecker commented 5 months ago

Hi @xRowe ,

I tested my proposal and updated the patch above. For me it works now and speeds up my test by nearly 7 times!

Could you please try again? (The only change is the '.getparent()' in the cache-fill)
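Why the parent matters: the SHORT-NAME element is a child of the element it names, while get_short_name() is called with the owning element, so the cache has to be keyed on the parent. A minimal, self-contained sketch of that idea follows. It uses the standard library's xml.etree.ElementTree rather than lxml, has no XML namespace, and the snippet and the helper `build_short_name_cache` are invented for illustration; it is not the canmatrix implementation.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free stand-in for a piece of an ARXML file (assumption).
ARXML_SNIPPET = """
<AR-PACKAGE>
  <SHORT-NAME>Frames</SHORT-NAME>
  <ELEMENTS>
    <CAN-FRAME>
      <SHORT-NAME>EngineStatus</SHORT-NAME>
    </CAN-FRAME>
    <CAN-FRAME>
      <SHORT-NAME>BrakeStatus</SHORT-NAME>
    </CAN-FRAME>
  </ELEMENTS>
</AR-PACKAGE>
"""


def build_short_name_cache(root):
    """Walk the tree once and map each element to the text of its SHORT-NAME child."""
    cache = {}
    for parent in root.iter():
        for child in parent:
            if child.tag == "SHORT-NAME":
                # Key on the parent, because lookups pass the owning element.
                cache[parent] = child.text
    return cache


def get_short_name(cache, element):
    """Dictionary lookup instead of searching the XML tree on every call."""
    if element is None:
        return ""
    return cache.get(element, "")


root = ET.fromstring(ARXML_SNIPPET)
cache = build_short_name_cache(root)
frames = root.findall("./ELEMENTS/CAN-FRAME")
names = [get_short_name(cache, frame) for frame in frames]
print(names)
```

Keying on the SHORT-NAME element itself would leave every lookup with an owning element as a miss, returning the empty default.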

xRowe commented 5 months ago

Hi @ebroecker

Thank you. It is faster: the load time has now dropped to 4 s.

ebroecker commented 5 months ago

Hi @xRowe, thanks for testing. I'll integrate this small cache.