stevearc / godot_parser

Python library for parsing Godot scene files
MIT License

Parsing large scenes is slow. #4

Open rambda opened 3 years ago

rambda commented 3 years ago

Parsing small scenes works really well.

However, I have a 50k-line, 67 MB scene file converted from an .escn file imported from Blender. I wanted to shrink its size from the source. I wrote some regex-based scripts, but they were neither elegant nor convenient. Then I found this parser.

It took 11 minutes to parse this 67 MB scene. RAM usage increased very slowly, and the parsing used only 6% of the CPU on a Ryzen 2700X.

67 MB is definitely large, but I think 11 minutes is a bit much. Is this a normal speed for pyparsing, or does the script need deeper optimization? I don't know.

City.zip

stevearc commented 3 years ago

I just ran a profile:

         1854956389 function calls (1569503340 primitive calls) in 962.737 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
206097521/4  329.244    0.000  962.345  240.586 pyparsing.py:1647(_parseNoCache)
31634631/42532   87.595    0.000  960.311    0.023 pyparsing.py:4249(parseImpl)
126832269/126832267   68.611    0.000   75.893    0.000 pyparsing.py:554(__init__)
 47417969   56.957    0.000   73.587    0.000 pyparsing.py:2952(parseImpl)
142460040   51.347    0.000   62.615    0.000 pyparsing.py:1628(preParse)
 47373439   46.668    0.000  120.968    0.000 pyparsing.py:3339(parseImpl)
126832269   41.349    0.000   65.571    0.000 pyparsing.py:545(__new__)
 94893100   35.400    0.000   35.400    0.000 pyparsing.py:304(__init__)
15938450/4   33.649    0.000  962.344  240.586 pyparsing.py:4049(parseImpl)
 47449936   24.410    0.000   24.410    0.000 {method 'match' of 're.Pattern' objects}
238343406   21.462    0.000   21.462    0.000 {built-in method builtins.isinstance}
285099876/285099661   20.788    0.000   20.788    0.000 {built-in method builtins.len}
31838224/74564   20.082    0.000  961.463    0.013 pyparsing.py:4460(parseImpl)
 31662221   18.002    0.000   24.588    0.000 pyparsing.py:852(__iadd__)
  23784/4   12.531    0.001  962.344  240.586 pyparsing.py:4686(parseImpl)
 63527296   11.896    0.000   11.896    0.000 {built-in method __new__ of type object at 0x907780}
 15780946   11.660    0.000   26.752    0.000 pyparsing.py:5786(pa)
 15843988   11.516    0.000   18.215    0.000 pyparsing.py:3514(parseImpl)

Pretty much all the time is in pyparsing. I more or less expected this, as pyparsing is written in Python and its goal is to make it super easy to write a parser, not necessarily to build the fastest parsers. I think for these larger files you may be better off trying a different tool :/

You can always try using a tool script from within Godot!
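
For anyone who wants to produce a profile like the one above for their own scene, here is a minimal sketch using only the standard library's `cProfile` and `pstats`. The helper `profile_call` and the stand-in `parse_scene` are hypothetical names used for illustration; they are not part of godot_parser. Replace `parse_scene` with whatever call actually parses your file.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=20):
    """Run func(*args) under cProfile; return (result, report text).

    Hypothetical helper for illustration, not part of godot_parser.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sorting by "tottime" gives the "Ordered by: internal time" view,
    # i.e. time spent inside each function excluding its callees.
    stats.sort_stats("tottime").print_stats(top)
    return result, buffer.getvalue()


def parse_scene(text):
    # Stand-in for the real parsing call (an assumption): splits
    # "key=value" lines so the example runs without extra packages.
    return [line.split("=") for line in text.splitlines()]


if __name__ == "__main__":
    _, report = profile_call(parse_scene, "a=1\nb=2\n" * 1000)
    print(report)
```

Trying this on a small scene first is probably wise, since profiling adds overhead on top of an already slow parse.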