Open syoyo opened 1 month ago
USDA ASCII parsing is rather slow when a scene contains a lot of geometry/animation data (floating-point arrays).
We can borrow ideas from simdjson and nanocsv to parse arrays faster.
simdjson: https://github.com/simdjson/simdjson
nanocsv: https://github.com/lighttransport/nanocsv
Optionally, implement multi-threaded parsing of the text of floating-point arrays.
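As a rough illustration of the multi-threaded idea, the sketch below splits the text of a float array at comma boundaries and parses each chunk on its own thread, then concatenates the results in order. `ParseChunk` and `ParseFloatsParallel` are hypothetical names, not TinyUSDZ API. `std::strtof` is only a stand-in for whatever number parser the real code uses, and the input is assumed to be the comma-separated contents of an array with the surrounding brackets already removed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: parse comma-separated floats in [begin, end).
// `end` must point at ',' or at the terminating NUL so that strtof
// cannot read into the next chunk. Repeated separators are tolerated.
static void ParseChunk(const char *begin, const char *end,
                       std::vector<float> *out) {
  const char *p = begin;
  while (p < end) {
    if (*p == ',' || *p == ' ' || *p == '\t' || *p == '\n' || *p == '\r') {
      ++p;  // skip separators and whitespace
      continue;
    }
    char *next = nullptr;
    float v = std::strtof(p, &next);  // stand-in number parser
    if (next == p) break;  // malformed token; a real parser would report it
    out->push_back(v);
    p = next;
  }
}

// Hypothetical entry point: split `text` into at most `num_threads` chunks,
// each starting at a ',' so that no number is cut in half.
std::vector<float> ParseFloatsParallel(const std::string &text,
                                       unsigned num_threads) {
  assert(num_threads > 0);
  const char *base = text.c_str();
  const std::size_t n = text.size();

  std::vector<std::size_t> cuts{0};
  for (unsigned i = 1; i < num_threads; ++i) {
    std::size_t pos = text.find(',', n * i / num_threads);
    if (pos == std::string::npos) break;
    if (pos > cuts.back()) cuts.push_back(pos);
  }
  cuts.push_back(n);

  const std::size_t chunks = cuts.size() - 1;
  std::vector<std::vector<float>> parts(chunks);  // one output per thread
  std::vector<std::thread> workers;
  for (std::size_t k = 0; k < chunks; ++k) {
    workers.emplace_back(ParseChunk, base + cuts[k], base + cuts[k + 1],
                         &parts[k]);
  }
  for (auto &w : workers) w.join();

  std::vector<float> result;
  for (const auto &part : parts) {
    result.insert(result.end(), part.begin(), part.end());
  }
  return result;
}
```

Each thread writes only to its own vector, so no locking is needed. Whether this pays off in practice depends on array size and on how cheaply the array's extent can be found up front.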
Dataset: Animated Knight
https://www.intel.com/content/www/us/en/developer/topic-technology/graphics-research/samples.html
Linux perf profile
Overhead comes from the use of std::iostream, std::stringstream and std::string, so we should also reduce the use of the C++ STL in the parser.
Samples: 54K of event 'cycles', Event count (approx.): 53753984268
Overhead  Command  Shared Object        Symbol
  14.59%  tusdcat  libstdc++.so.6.0.28  [.] __dynamic_cast
   7.25%  tusdcat  libstdc++.so.6.0.28  [.] std::__ostream_insert<char, std::char_traits<char> >
   5.83%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::LexFloat
   4.19%  tusdcat  libc-2.31.so         [.] __strcmp_avx2
   4.04%  tusdcat  tusdcat              [.] fast_float::from_chars_advanced<float, char>
   3.55%  tusdcat  libstdc++.so.6.0.28  [.] std::basic_streambuf<char, std::char_traits<char> >::xsputn
   3.51%  tusdcat  libstdc++.so.6.0.28  [.] __cxxabiv1::__vmi_class_type_info::__do_dyncast
   3.49%  tusdcat  libc-2.31.so         [.] malloc
   2.95%  tusdcat  libc-2.31.so         [.] _int_free
   2.86%  tusdcat  libstdc++.so.6.0.28  [.] std::ostream::sentry::sentry
   2.66%  tusdcat  libc-2.31.so         [.] __memmove_avx_unaligned_erms
   2.41%  tusdcat  libstdc++.so.6.0.28  [.] __cxxabiv1::__si_class_type_info::__do_dyncast
   2.17%  tusdcat  libc-2.31.so         [.] cfree@GLIBC_2.2.5
   2.06%  tusdcat  libstdc++.so.6.0.28  [.] std::locale::~locale
   1.87%  tusdcat  libstdc++.so.6.0.28  [.] std::has_facet<std::ctype<char> >
   1.81%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::CharN
   1.75%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::SepBy1BasicType<float>
   1.70%  tusdcat  libstdc++.so.6.0.28  [.] std::locale::locale
   1.32%  tusdcat  libstdc++.so.6.0.28  [.] std::locale::operator=
   1.23%  tusdcat  libstdc++.so.6.0.28  [.] std::locale::id::_M_id
   1.17%  tusdcat  libstdc++.so.6.0.28  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_repla
   1.12%  tusdcat  libstdc++.so.6.0.28  [.] std::use_facet<std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> >
   1.11%  tusdcat  tusdcat              [.] std::vector<char, std::allocator<char> >::operator=
   1.05%  tusdcat  libstdc++.so.6.0.28  [.] std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::ba
   1.04%  tusdcat  libstdc++.so.6.0.28  [.] std::use_facet<std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> >
   0.97%  tusdcat  libstdc++.so.6.0.28  [.] std::ios_base::_M_init
   0.97%  tusdcat  libstdc++.so.6.0.28  [.] std::ios_base::ios_base
   0.94%  tusdcat  libstdc++.so.6.0.28  [.] std::has_facet<std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> >
   0.94%  tusdcat  libstdc++.so.6.0.28  [.] std::basic_ios<char, std::char_traits<char> >::_M_cache_locale
   0.89%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::MaybeNonFinite<float>
   0.89%  tusdcat  libstdc++.so.6.0.28  [.] std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::overf
   0.84%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::SkipWhitespaceAndNewline
   0.73%  tusdcat  tusdcat              [.] tinyusdz::ascii::AsciiParser::SeekTo
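Many of the entries in the profile above are stream, locale and allocation functions. As a minimal sketch of what a stream-free inner loop could look like, the function below walks a raw character buffer and parses a bracketed float array without constructing a std::stringstream or a std::string per token. `ParseFloatArray` is a hypothetical name, not an existing TinyUSDZ function. `std::strtof` stands in for the `fast_float` call that already appears in the profile, and unlike `fast_float` it is locale-dependent and needs a NUL-terminated buffer.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical helper: parse "[1.0, 2.5, -3e2]" from a NUL-terminated buffer
// directly into `out`. Returns false on malformed input; `out` may then hold
// the values parsed before the error.
bool ParseFloatArray(const char *p, std::vector<float> *out) {
  assert(p != nullptr && out != nullptr);

  // Advance p past spaces, tabs and newlines.
  auto skip_ws = [&p]() {
    while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r') ++p;
  };

  skip_ws();
  if (*p != '[') return false;
  ++p;
  skip_ws();
  if (*p == ']') return true;  // empty array

  for (;;) {
    char *end = nullptr;
    float v = std::strtof(p, &end);  // stand-in for fast_float::from_chars
    if (end == p) return false;      // no number at this position
    out->push_back(v);
    p = end;
    skip_ws();
    if (*p == ',') {
      ++p;  // another element follows
      continue;
    }
    return *p == ']';  // anything other than ']' is a syntax error
  }
}
```

For example, `ParseFloatArray("[1.0, 2.5, -3e2]", &values)` appends three floats to `values` and returns true. The only allocation left is the growth of the output vector, which could be reduced further by reserving capacity when the element count is known or can be estimated.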