questdb / nodejs-questdb-client

QuestDB Node.js Client

chore(nodejs): regexp for table and column name validations #5

Closed glasstiger closed 2 years ago

glasstiger commented 2 years ago

Flame graphs show that validating and writing strings are the hot spots in the code: looping through the characters of a string is slow. Replacing the loops with regular expressions for table and column name validation and for character escaping improves performance significantly. However, this holds only when the code using the client does not generate garbage. When the client code is not GC-free, overall performance with regexp is about the same as with the for loops. In real-life use cases we are unlikely to see GC-free clients, so the 200k rows/s ingestion speed remains theoretical, but I think it still makes sense to use regexp instead of the for loops, since it can improve performance when the client code does not generate much garbage.
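To illustrate the two approaches being compared, here is a minimal sketch of a loop-based and a regexp-based version of name validation and character escaping. The function names, the set of forbidden characters, and the set of escaped characters are assumptions chosen for illustration; they are not taken from the client's source.

```javascript
// Hypothetical set of characters rejected in table/column names.
// The real client's rules may differ.
const FORBIDDEN = new Set(['.', '?', ',', ':', '\\', '/', '\n', '\r']);

// Loop-based validation: inspect every character individually.
function isValidNameLoop(name) {
  if (name.length === 0) return false;
  for (let i = 0; i < name.length; i++) {
    if (FORBIDDEN.has(name[i])) return false;
  }
  return true;
}

// Regexp-based validation: one precompiled character class, no explicit loop.
const FORBIDDEN_RE = /[.?,:\\\/\n\r]/;
function isValidNameRegex(name) {
  return name.length > 0 && !FORBIDDEN_RE.test(name);
}

// Loop-based escaping: prefix each special character with a backslash.
// The escaped set (space, comma, equals) is an illustrative assumption.
function escapeLoop(s) {
  let out = '';
  for (const ch of s) {
    if (ch === ' ' || ch === ',' || ch === '=') out += '\\';
    out += ch;
  }
  return out;
}

// Regexp-based escaping: "$&" in the replacement stands for the matched text.
function escapeRegex(s) {
  return s.replace(/[ ,=]/g, '\\$&');
}
```

Both variants of each function return the same result for the same input, so swapping one for the other changes only where the time is spent.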

regex, 300k rows, string constants (no garbage)
rss:58.45MB;heapTotal:23.31MB;heapUsed:7.78MB; took: 1.561, rate: 192184.49711723256 rows/s
rss:58.33MB;heapTotal:22.81MB;heapUsed:7.79MB; took: 1.559, rate: 192431.04554201412 rows/s
rss:59.81MB;heapTotal:23.56MB;heapUsed:7.8MB; took: 1.535, rate: 195439.7394136808 rows/s

for loops, 3m rows, string constants (no garbage)
rss:84.56MB;heapTotal:48.56MB;heapUsed:16.89MB; took: 35.096, rate: 85479.82676088443 rows/s
rss:75.73MB;heapTotal:41.56MB;heapUsed:20.52MB; took: 35.034, rate: 85631.10121596164 rows/s
rss:89.25MB;heapTotal:53.06MB;heapUsed:34.04MB; took: 34.888, rate: 85989.45196055951 rows/s

regex, 3m rows, string constants (no garbage)
rss:75.84MB;heapTotal:39.31MB;heapUsed:12.16MB; took: 14.069, rate: 213234.77148340322 rows/s
rss:74.95MB;heapTotal:39.81MB;heapUsed:11.31MB; took: 13.924, rate: 215455.32892846884 rows/s
rss:74.8MB;heapTotal:39.06MB;heapUsed:17.09MB; took: 13.785, rate: 217627.8563656148 rows/s


- When the client code generates garbage, overall performance is more or less the same.

for loops, 300k rows, unique strings (with garbage)
rss:50.7MB;heapTotal:15.31MB;heapUsed:10.16MB; took: 27.773, rate: 10801.857919562164 rows/s
rss:47.31MB;heapTotal:15.06MB;heapUsed:10.05MB; took: 27.871, rate: 10763.876430698576 rows/s
rss:49.22MB;heapTotal:15.31MB;heapUsed:7.1MB; took: 27.846, rate: 10773.540185304892 rows/s

regex, 300k rows, unique strings (with garbage)
rss:50.02MB;heapTotal:15.06MB;heapUsed:9.15MB; took: 27.264, rate: 11003.521126760563 rows/s
rss:49.45MB;heapTotal:15.06MB;heapUsed:8.26MB; took: 27.721, rate: 10822.120414126475 rows/s
rss:50MB;heapTotal:15.06MB;heapUsed:6.52MB; took: 27.493, rate: 10911.868475611975 rows/s

for loops, 500k rows, unique strings (with garbage)
rss:62.16MB;heapTotal:24.31MB;heapUsed:8.69MB; took: 102.829, rate: 4862.441529140612 rows/s
rss:61.53MB;heapTotal:24.31MB;heapUsed:10.96MB; took: 101.525, rate: 4924.895345973898 rows/s
rss:61.16MB;heapTotal:24.31MB;heapUsed:12.56MB; took: 101.551, rate: 4923.634429990842 rows/s

regex, 500k rows, unique strings (with garbage)
rss:61.55MB;heapTotal:24.56MB;heapUsed:9.03MB; took: 102.877, rate: 4860.172827745755 rows/s
rss:60.97MB;heapTotal:24.31MB;heapUsed:12.37MB; took: 101.236, rate: 4938.95452210676 rows/s
rss:61.97MB;heapTotal:24.06MB;heapUsed:11.4MB; took: 100.424, rate: 4978.889508484028 rows/s
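In each result line the rate is the row count divided by the elapsed seconds (for example, 300000 / 1.561 ≈ 192184 rows/s). The sketch below shows one way such a line could be produced in Node.js, and how "string constants" and "unique strings" workloads might differ. The helper names (`toMB`, `report`, `run`) and the workload are hypothetical; this is not the benchmark script used for the numbers above.

```javascript
// Format a byte count as megabytes with at most two decimals, e.g. "58.45MB".
function toMB(bytes) {
  return `${Math.round((bytes / 1024 / 1024) * 100) / 100}MB`;
}

// Build one result line from the row count and elapsed time in seconds.
function report(rows, elapsedSeconds) {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  const rate = rows / elapsedSeconds; // rows per second
  return `rss:${toMB(rss)};heapTotal:${toMB(heapTotal)};heapUsed:${toMB(heapUsed)}; ` +
    `took: ${elapsedSeconds}, rate: ${rate} rows/s`;
}

// Hypothetical workload: validate one name per row and time the whole loop.
function run(rows, makeName, validate) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < rows; i++) {
    validate(makeName(i));
  }
  const elapsedSeconds = Number(process.hrtime.bigint() - start) / 1e9;
  return report(rows, elapsedSeconds);
}

// Illustrative validator; the forbidden character set is an assumption.
const validate = (name) => !/[.?,:\\\/\n\r]/.test(name);

// String constants: the same string is reused, so no garbage is created per row.
console.log(run(300000, () => 'constant_name', validate));

// Unique strings: a new string is allocated for every row, creating garbage.
console.log(run(300000, (i) => `name_${i}`, validate));
```

A toy loop like this only exercises validation, so its rates will not be comparable to the figures above, which measure the client end to end.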

glasstiger commented 2 years ago

Closing this, as the conclusion of our discussions on Slack is that the change is unlikely to bring performance improvements in end-user use cases.