PixarAnimationStudios / OpenUSD

Universal Scene Description
http://www.openusd.org

Simplify grammar when parsing large strings #3293

Closed nvmkuruc closed 2 months ago

nvmkuruc commented 2 months ago

Description of Change(s)

A performance regression in parsing large strings was observed after #3005 was merged.

Parsing time for the schema registry layers (which have large documentation strings) went from ~4-5 ms to over 7 ms.

This was traced back to the String parsing grammar. One notable difference between the PEGTL implementation and the original implementation was that, even though the PEGTL grammar permissively let any character be "escaped" by \, it explicitly listed the escapes that would be considered valid for replacement by TfEscapeStringReplaceChar. This was unnecessary and appeared to slow down the parser. (Were the grammar ever to become less permissive, it would also have had a bug, as OctDigit was incorrectly specified.) These unnecessary checks and rules have now been removed.
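To illustrate why the explicit escape rules were redundant, here is a minimal sketch in Python. It uses regular expressions as a stand-in and is not the PEGTL grammar from this change: the pattern names, the set of escapes listed, and the helper functions are all hypothetical. It only shows that once a catch-all "backslash followed by any character" alternative exists, listing individual escape forms ahead of it does not change which strings are accepted.

```python
import re

# Hypothetical "explicit" string body: each recognized escape form gets its
# own alternative, followed by a catch-all so any other escaped character is
# still accepted. This is an illustration, not the actual USD grammar.
EXPLICIT_BODY = re.compile(
    r'''(?:
        [^"\\\n]                  # ordinary character
      | \\ [0-7]{1,3}             # octal escape
      | \\ x [0-9a-fA-F]{1,2}     # hex escape
      | \\ [abfnrtv\\'"]          # single-character escape
      | \\ .                      # catch-all: any other escaped character
    )*''',
    re.VERBOSE | re.DOTALL,
)

# Hypothetical "permissive" string body: a backslash escapes whatever
# character follows it, with no further classification.
PERMISSIVE_BODY = re.compile(r'(?:[^"\\\n]|\\.)*', re.DOTALL)


def accepts_explicit(body: str) -> bool:
    """Return True if the explicit pattern matches the whole string body."""
    return EXPLICIT_BODY.fullmatch(body) is not None


def accepts_permissive(body: str) -> bool:
    """Return True if the permissive pattern matches the whole string body."""
    return PERMISSIVE_BODY.fullmatch(body) is not None
```

Under these assumptions both functions accept exactly the same inputs, so the extra alternatives only add matching work. Interpreting the escapes can be left to a later step, which is the role the description above gives to TfEscapeStringReplaceChar.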

This change includes some additional optimizations and simplifications to the grammar.

env PXR_ENABLE_GLOBAL_TRACE=1 PXR_WORK_THREAD_LIMIT=1 python3 -c "from pxr import Usd; Usd.SchemaRegistry()"
    4.410 ms     4.410 ms      14 samples    |   | pxrInternal_v0_24_11__pxrReserved__::Sdf_ParseLayer

Fixes Issue(s)

-

jesschimein commented 2 months ago

Filed as internal issue #USD-10128

jesschimein commented 2 months ago

/AzurePipelines run

azure-pipelines[bot] commented 2 months ago
Azure Pipelines successfully started running 1 pipeline(s).