syoyo / tinyusdz

Tiny, dependency-free USDZ/USDA/USDC library written in C++14

[TODO] Optimize Ascii parser #164

Open syoyo opened 1 month ago

syoyo commented 1 month ago

USDA (ASCII) parsing is rather slow when a scene contains a lot of geometry or animation data (large floating-point arrays).

We can borrow ideas from simdjson and nanocsv to parse arrays faster.

simdjson: https://github.com/simdjson/simdjson
nanocsv: https://github.com/lighttransport/nanocsv
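As a starting point, here is a minimal sketch of parsing a float array straight from an in-memory buffer, with no `std::stringstream` and no per-token `std::string`. It is not simdjson's SIMD technique and not the current tinyusdz code: `ParseFloatArray` is a hypothetical helper, and it uses `std::strtof` from the standard library only to keep the example self-contained (the parser already has fast_float available, as the profile shows). It assumes the "C" locale and rejects a trailing comma.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper, not part of the tinyusdz API.
// Parses text such as "[1.0, 2.5, -3e2]" into `out`.
// Returns false on malformed input (including a trailing comma).
// Assumes the "C" locale, because std::strtof is locale-dependent.
static bool ParseFloatArray(const std::string &src, std::vector<float> *out) {
  const char *p = src.c_str();  // null-terminated, so strtof cannot overrun
  auto skip_ws = [&p]() {
    while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r') ++p;
  };

  out->clear();
  skip_ws();
  if (*p != '[') return false;
  ++p;
  skip_ws();
  if (*p == ']') return true;  // empty array

  for (;;) {
    char *end = nullptr;
    const float v = std::strtof(p, &end);
    if (end == p) return false;  // nothing was consumed: not a number
    out->push_back(v);
    p = end;
    skip_ws();
    if (*p == ',') {  // another element follows
      ++p;
      skip_ws();
      continue;
    }
    return *p == ']';  // anything other than ']' is an error
  }
}
```

Calling `ParseFloatArray("[1.0, 2.5, -3e2]", &values)` fills `values` with three floats. The only allocation is the growth of the output vector, which could be reduced further by counting commas first and calling `reserve`.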

Optionally, implement multi-threaded parsing of the text of large floating-point arrays.
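A rough sketch of how the multi-threaded variant could look, loosely following the "find delimiters first, then parse chunks in parallel" idea: split the array text into chunks that end just after a comma, parse each chunk in its own thread, then concatenate. `ParseFloatsParallel` is a hypothetical name, the input is assumed to be the well-formed text between `[` and `]`, and error handling is left out. `std::strtof` again stands in for the real float lexer.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper, not part of the tinyusdz API.
// `body` is the text between '[' and ']', e.g. "1.0, 2.5, -3e2".
// Assumes well-formed input and the "C" locale; errors are not reported.
static std::vector<float> ParseFloatsParallel(const std::string &body,
                                              unsigned num_threads) {
  if (num_threads == 0) num_threads = 1;

  // Pass 1: choose chunk boundaries, each moved forward to just past a
  // comma so that no number is split between two workers.
  std::vector<std::size_t> bounds;
  bounds.push_back(0);
  for (unsigned t = 1; t < num_threads; ++t) {
    std::size_t pos = body.size() * t / num_threads;
    if (pos < bounds.back()) pos = bounds.back();
    const std::size_t comma = body.find(',', pos);
    if (comma == std::string::npos) break;
    bounds.push_back(comma + 1);
  }
  bounds.push_back(body.size());

  // Pass 2: each worker parses its own chunk into its own vector.
  const std::size_t num_chunks = bounds.size() - 1;
  std::vector<std::vector<float>> partial(num_chunks);
  std::vector<std::thread> workers;
  for (std::size_t i = 0; i < num_chunks; ++i) {
    workers.emplace_back([&body, &bounds, &partial, i]() {
      const char *p = body.c_str() + bounds[i];
      const char *stop = body.c_str() + bounds[i + 1];
      while (p < stop) {
        char *end = nullptr;
        const float v = std::strtof(p, &end);
        if (end == p || end > stop) break;  // no number, or ran past chunk
        partial[i].push_back(v);
        p = end;
        // Skip the separator and any whitespace before the next number.
        while (p < stop && (*p == ',' || *p == ' ' || *p == '\t' ||
                            *p == '\n' || *p == '\r')) {
          ++p;
        }
      }
    });
  }
  for (auto &w : workers) w.join();

  // Pass 3: concatenate the per-chunk results in order.
  std::vector<float> out;
  for (const auto &part : partial) {
    out.insert(out.end(), part.begin(), part.end());
  }
  return out;
}
```

For example, `ParseFloatsParallel("1.0, 2.5, -3e2, 4", 2)` returns the same four values as a single-threaded parse. On Linux this typically needs `-pthread` at link time, and threading is only likely to pay off for very large arrays, so a size threshold would make sense in practice.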

Example

Dataset: Animated Knight

https://www.intel.com/content/www/us/en/developer/topic-technology/graphics-research/samples.html

Linux perf profile

The overhead comes from the use of std::iostream, std::stringstream, and std::string, so we should also reduce the use of the C++ STL in the parser 🙂

Samples: 54K of event 'cycles', Event count (approx.): 53753984268
Overhead  Command  Shared Object        Symbol
  14.59%  tusdcat  libstdc++.so.6.0.28  [.] __dynamic_cast
   7.25%  tusdcat  libstdc++.so.6.0.28  [.] std::__ostream_insert<char, std::char_traits<char> >
   5.83%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::LexFloat
   4.19%  tusdcat  libc-2.31.so         [.] __strcmp_avx2
   4.04%  tusdcat  tusdcat              [.] fast_float::from_chars_advanced<float, char>
   3.55%  tusdcat  libstdc++.so.6.0.28  [.] std::basic_streambuf<char, std::char_traits<char> >::xsputn
   3.51%  tusdcat  libstdc++.so.6.0.28  [.] __cxxabiv1::__vmi_class_type_info::__do_dyncast
   3.49%  tusdcat  libc-2.31.so         [.] malloc
   2.95%  tusdcat  libc-2.31.so         [.] _int_free
   2.86%  tusdcat  libstdc++.so.6.0.28  [.] std::ostream::sentry::sentry
   2.66%  tusdcat  libc-2.31.so         [.] __memmove_avx_unaligned_erms
   2.41%  tusdcat  libstdc++.so.6.0.28  [.] __cxxabiv1::__si_class_type_info::__do_dyncast
   2.17%  tusdcat  libc-2.31.so         [.] cfree@GLIBC_2.2.5
   2.06%  tusdcat  libstdc++.so.6.0.28  [.] std::locale::~locale
   1.87%  tusdcat  libstdc++.so.6.0.28  [.] std::has_facet<std::ctype<char> >
   1.81%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::CharN
   1.75%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::SepBy1BasicType<float>
   1.70%  tusdcat  libstdc++.so.6.0.28  [.] std::locale::locale
   1.32%  tusdcat  libstdc++.so.6.0.28  [.] std::locale::operator=
   1.23%  tusdcat  libstdc++.so.6.0.28  [.] std::locale::id::_M_id
   1.17%  tusdcat  libstdc++.so.6.0.28  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_repla
   1.12%  tusdcat  libstdc++.so.6.0.28  [.] std::use_facet<std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> >
   1.11%  tusdcat  tusdcat              [.] std::vector<char, std::allocator<char> >::operator=
   1.05%  tusdcat  libstdc++.so.6.0.28  [.] std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::ba
   1.04%  tusdcat  libstdc++.so.6.0.28  [.] std::use_facet<std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> >
   0.97%  tusdcat  libstdc++.so.6.0.28  [.] std::ios_base::_M_init
   0.97%  tusdcat  libstdc++.so.6.0.28  [.] std::ios_base::ios_base
   0.94%  tusdcat  libstdc++.so.6.0.28  [.] std::has_facet<std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> >
   0.94%  tusdcat  libstdc++.so.6.0.28  [.] std::basic_ios<char, std::char_traits<char> >::_M_cache_locale
   0.89%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::MaybeNonFinite<float>
   0.89%  tusdcat  libstdc++.so.6.0.28  [.] std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::overf
   0.84%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::SkipWhitespaceAndNewline
   0.73%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::SeekTo