Closed Uzlopak closed 1 year ago
https://dev.to/samchon/good-bye-typescript-is-ancestor-of-typia-20000x-faster-validator-49fi
Around the same time, I had made another library named typescript-json.
It performs AoT (Ahead of Time) compilation like typescript-is, though not for runtime validation but for JSON schema generation. To boost JSON serialization, typescript-json utilized fast-json-stringify with the automatically generated JSON schema.
For reference, the purpose of typescript-json was to support nestia (described below), generating Swagger documents from pure TypeScript types.
Thanks for the notification. As typia started as a wrapper of fast-json-stringify (at that time, the package name was typescript-json), I'm still using the string serializer of fast-json-stringify. I'll update the string serializer to follow fast-json-stringify. Also, I'll run the benchmark again with the latest version of fast-json-stringify.
By the way, typia has four stringify functions, and assertStringify is similar to fast-json-stringify. The benchmark result shows all of those typia stringify functions (https://typia.io/docs/json/stringify/#performance), but as normal users don't know the difference, I can understand why you feel it is unfair. On this topic, how about adding a comment under the graph that fast-json-stringify is similar to typia.assertStringify()? If you want, you can edit https://github.com/samchon/typia/blob/master/website/pages/docs/json/stringify.mdx#L497-L507 and send a PR for that.
stringify
isStringify
assertStringify
validateStringify
Sure, I will provide a PR.
Please check whether the patch to the serializer for asString gives a perf boost. If you see a performance regression, then I would need to investigate.
You can benchmark the typia.stringify<T>() function with the commands below.
# PREVIOUS SMALL-STRING
git clone https://github.com/samchon/typia -b features/original typia@original
cd typia@original
npm install
npm run benchmark
# ADVANCED SMALL-STRING
cd ..
git clone https://github.com/samchon/typia -b features/stringify typia@advanced
cd typia@advanced
npm install
npm run benchmark
By the way, as my benchmark program for the stringify function handles composite types, it would be much better to make a new dedicated benchmark function just for your asStringSmall() function.
https://github.com/samchon/typia/blob/features/stringify/test/issues/736.ts
I made a dedicated benchmark, and you can run it through:
git clone https://github.com/samchon/typia -b features/stringify typia@stringify
cd typia@stringify
npm install
npm run issue 736
@Uzlopak With the length fixed at 41, the new algorithm became slower than before.
previous 6867.585931699803
advanced 6861.237658316788
With the length fixed at 5, the new algorithm is still slower.
previous 33501.367175249296
advanced 33126.283614757245
Yes, the number 42 is super arbitrary. I asked @mcollina why 42, but it seems it was a random number.
It should be investigated where the breaking point actually is.
Didn't you write that in a special case it was double the perf?
@Uzlopak Oops, I missed the STR_ESCAPE pattern check.
After adjusting that, the new algorithm became faster at length 41. However, it became slower at length 5.
The 2x speedup came from my mistake of skipping the length > 41 check.
# LENGTH 41
previous 5193.547705809435
advanced 5982.83813974943
# LENGTH 5
previous 23195.423057822234
advanced 20127.113157546624
Your code is different from ours.
Your code: if length > 42, call JSON.stringify; then do the STR_ESCAPE check and wrap in double quotes; then fall back to the simple case.
It has to be: if length < 42, do the simple case; then do the STR_ESCAPE check and wrap in double quotes; then fall back to JSON.stringify.
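The ordering described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual code of either library: the name serializeString, the constant SMALL_LIMIT, and the exact STR_ESCAPE pattern are assumptions made for this example.

```typescript
// Assumed escape pattern: control characters, double quote, backslash, surrogates.
const STR_ESCAPE = /[\u0000-\u001f\u0022\u005c\ud800-\udfff]/;

// Hypothetical name for the threshold discussed in the thread.
const SMALL_LIMIT = 42;

function serializeString(str: string): string {
  // 1. Short strings: the "simple case", a manual character loop.
  if (str.length < SMALL_LIMIT) {
    let result = "";
    let last = 0;
    for (let i = 0; i < str.length; i++) {
      const code = str.charCodeAt(i);
      if (code === 0x22 || code === 0x5c) {
        // Double quote or backslash: copy the pending slice and add a backslash.
        result += str.slice(last, i) + "\\";
        last = i;
      } else if (code < 0x20 || (code >= 0xd800 && code <= 0xdfff)) {
        // Control character or surrogate: let JSON.stringify handle it.
        return JSON.stringify(str);
      }
    }
    return '"' + result + str.slice(last) + '"';
  }
  // 2. Longer strings: one regex test, then wrap in double quotes if clean.
  if (STR_ESCAPE.test(str) === false) return '"' + str + '"';
  // 3. Everything else falls back to JSON.stringify.
  return JSON.stringify(str);
}
```

Whatever branch is taken, the result should be identical to JSON.stringify(str); only the cost of getting there differs.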
@Uzlopak Oops, I made a lot of mistakes. I repeated it carefully, and this may be the final result.
Sorry for the repeated mistakes. It was confusing because I was doing something else at the same time.
I tried changing the sequence of the if conditions, but it made no difference.
# LENGTH 41
previous 6739.206407670302
advanced 19771.155382046658
# LENGTH 5
previous 30564.183105977772
advanced 32337.116211885812
Can you investigate whether 42 is the optimal string length?
Well, as the regex condition was newly added, I should consider which strings to use.
Do you have any ideas about that?
Make a very long string matching this regex: /[a-z0-9]+/. At the end, put a double quote (").
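Test inputs of that shape could be generated along these lines. The helper names makePlain and makeQuoted are hypothetical and are not taken from the typia benchmark code.

```typescript
// Characters allowed by the regex /[a-z0-9]+/ mentioned in the thread.
const ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

// Builds a string of exactly `length` characters drawn from [a-z0-9].
function makePlain(length: number): string {
  let out = "";
  for (let i = 0; i < length; i++) {
    out += ALPHABET.charAt(Math.floor(Math.random() * ALPHABET.length));
  }
  return out;
}

// For length >= 1: same total length, but the final character is a double
// quote, so a serializer has to scan the whole string before it meets the
// one character that needs escaping.
function makeQuoted(length: number): string {
  return makePlain(Math.max(0, length - 1)) + '"';
}
```

Placing the quote at the very end is the least favorable position for a serializer that scans from the start, because the whole string is inspected before the escaping path is taken.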
typia@4.1.15 issue node test/issue 736-normal
Limit | Native | Optimized | Gap |
---|---|---|---|
#10 | 13,231 | 35,170 | 165.82 % |
#20 | 10,373 | 20,038 | 93.17 % |
#30 | 8,009 | 13,359 | 66.81 % |
#40 | 5,733 | 8,997 | 56.93 % |
#50 | 5,005 | 7,741 | 54.65 % |
#60 | 5,260 | 7,851 | 49.28 % |
#70 | 4,536 | 5,658 | 24.76 % |
#80 | 3,637 | 4,928 | 35.49 % |
#90 | 3,364 | 4,207 | 25.06 % |
#100 | 2,504 | 3,829 | 52.92 % |
#200 | 1,913 | 1,932 | 1.02 % |
#300 | 1,312 | 1,359 | 3.6 % |
#400 | 1,052 | 1,006 | -4.41 % |
#500 | 826 | 857 | 3.74 % |
#600 | 781 | 884 | 13.17 % |
#700 | 761 | 780 | 2.49 % |
#800 | 669 | 552 | -17.5 % |
#900 | 521 | 478 | -8.19 % |
#1,000 | 449 | 462 | 2.96 % |
" characters
typia@4.1.15 issue node test/issue 736-special
Limit | Native | Optimized | Gap |
---|---|---|---|
#10 | 11,640 | 14,012 | 20.38 % |
#20 | 9,499 | 14,581 | 53.5 % |
#30 | 6,159 | 10,490 | 70.31 % |
#40 | 6,104 | 9,964 | 63.24 % |
#50 | 5,640 | 8,743 | 55.03 % |
#60 | 5,454 | 7,924 | 45.29 % |
#70 | 4,794 | 6,160 | 28.48 % |
#80 | 3,978 | 4,740 | 19.15 % |
#90 | 3,949 | 5,202 | 31.71 % |
#100 | 3,170 | 4,363 | 37.65 % |
#200 | 1,983 | 2,673 | 34.78 % |
#300 | 1,323 | 1,569 | 18.64 % |
#400 | 1,062 | 1,355 | 27.56 % |
#500 | 907 | 1,151 | 26.99 % |
#600 | 814 | 963 | 18.29 % |
#700 | 694 | 847 | 22.17 % |
#800 | 642 | 746 | 16.12 % |
#900 | 592 | 652 | 10.15 % |
#1,000 | 509 | 575 | 13.12 % |
typia@4.1.15 issue node test/issue 736-regex
Limit | Native | Optimized | Gap |
---|---|---|---|
#10 | 12,645 | 28,811 | 127.84 % |
#20 | 10,155 | 26,702 | 162.93 % |
#30 | 8,256 | 22,639 | 174.21 % |
#40 | 6,658 | 23,138 | 247.53 % |
#50 | 5,996 | 19,395 | 223.45 % |
#60 | 5,562 | 15,214 | 173.54 % |
#70 | 4,349 | 13,053 | 200.17 % |
#80 | 4,028 | 11,180 | 177.55 % |
#90 | 3,683 | 9,873 | 168.04 % |
#100 | 3,027 | 9,763 | 222.49 % |
#200 | 1,904 | 5,624 | 195.35 % |
#300 | 1,467 | 4,105 | 179.73 % |
#400 | 1,102 | 3,325 | 201.69 % |
#500 | 886 | 2,721 | 207.22 % |
#600 | 773 | 2,318 | 199.76 % |
#700 | 687 | 2,056 | 199.52 % |
#800 | 610 | 1,751 | 187.34 % |
#900 | 570 | 1,239 | 117.42 % |
#1,000 | 483 | 1,752 | 262.94 % |
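
The Gap column in these tables appears to be the relative difference between the Optimized and Native figures, assuming both are throughput numbers where higher is better. A minimal sketch of that calculation follows; the function name gapPercent is hypothetical.

```typescript
// Relative gain of the optimized serializer over the native one, in percent.
// A positive value means the optimized figure is higher than the native one.
function gapPercent(native: number, optimized: number): number {
  return ((optimized - native) / native) * 100;
}

// First row of the 736-normal table: Native 13,231 and Optimized 35,170.
const firstRowGap = gapPercent(13231, 35170);
console.log(firstRowGap.toFixed(2) + " %");
```

For the first two rows of the 736-normal table this formula agrees with the published values (165.82 % and 93.17 %) to within rounding; some other rows differ slightly, presumably because the displayed figures are themselves rounded.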
You can also run that command and determine which length to use.
In my opinion, the length 42 is reasonable because the gains seem to diminish around 50.
Comparing the regex and special cases, the current code seems reasonable.
The gain from the manual serialization logic diminishes at about 40 to 50 characters.
Also, even when the target string is longer than 42 characters, the regex pattern check greatly reduces the serialization time.
Currently running the benchmarks:
I'm just confused by the regex-filtered case alone. It is even faster than the optimized case when no special character exists.
When a special character exists, the advanced manual stringify logic is faster, so I'm considering the implementation below.
What do you think about the code below, @Uzlopak?
export const $string = (str: string): string => {
  if (STR_ESCAPE.test(str) === false)
    return `"${str}"`;
  if (str.length > 41)
    return JSON.stringify(str);
  // ...OPTIMIZED LOGIC
};
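For experimenting with this regex-first ordering, a self-contained version might look like the sketch below. The thread does not show the optimized logic, so escapeShort is a hypothetical stand-in written for this example, and the STR_ESCAPE pattern and the name stringRegexFirst are likewise assumptions.

```typescript
// Assumed escape pattern: control characters, double quote, backslash, surrogates.
const STR_ESCAPE = /[\u0000-\u001f\u0022\u005c\ud800-\udfff]/;

// Hypothetical stand-in for the elided optimized logic: escapes double quotes
// and backslashes by hand, and defers to JSON.stringify for anything else.
function escapeShort(str: string): string {
  let result = "";
  let last = 0;
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    if (code === 0x22 || code === 0x5c) {
      result += str.slice(last, i) + "\\";
      last = i;
    } else if (code < 0x20 || (code >= 0xd800 && code <= 0xdfff)) {
      return JSON.stringify(str);
    }
  }
  return '"' + result + str.slice(last) + '"';
}

function stringRegexFirst(str: string): string {
  // 1. Regex first: clean strings of any length are just wrapped in quotes.
  if (STR_ESCAPE.test(str) === false) return '"' + str + '"';
  // 2. Long strings that need escaping go to JSON.stringify.
  if (str.length > 41) return JSON.stringify(str);
  // 3. Short strings that need escaping use the manual loop.
  return escapeShort(str);
}
```

With this ordering every input pays for the regex test, so a short string that contains a double quote is scanned twice: once by the regex and once by the loop.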
Currently running the benchmarks:
Great improvement on short strings, but performance on short strings containing a double quote decreased.
In my opinion:
The regex is theoretically always slower than processing every character in a for loop. Doing the regex first means that it becomes the general bottleneck, even though the optimized logic also handles the same Unicode characters, double quotes, etc.
The length check seems to be the cheapest operation of all. That's why I would do that one first.
Benchmarking your suggestion now.
It's always a tradeoff. The expectation should be that short strings without escape characters are the majority and strings with escape characters are the exception. Also, short strings are more common than huge strings.
So personally I think
export const $string = (str: string): string => {
  if (STR_ESCAPE.test(str) === false)
    return `"${str}"`;
  if (str.length > 41)
    return JSON.stringify(str);
  // ...OPTIMIZED LOGIC
};
is the better tradeoff regarding the benchmarks.
But somehow the benchmarks are counterintuitive.
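When results look counterintuitive, it can help to time one candidate at a time on a fixed input set. The harness below is a rough sketch for that purpose and is not the benchmark used in the thread; the names measure, regexFirst, and nativeOnly, as well as the STR_ESCAPE pattern, are assumptions for this example.

```typescript
// Assumed escape pattern: control characters, double quote, backslash, surrogates.
const STR_ESCAPE = /[\u0000-\u001f\u0022\u005c\ud800-\udfff]/;

type Serializer = (str: string) => string;

// Candidate A: regex test first, wrap in quotes when nothing needs escaping.
const regexFirst: Serializer = (str) =>
  STR_ESCAPE.test(str) === false ? '"' + str + '"' : JSON.stringify(str);

// Candidate B: plain JSON.stringify as the baseline.
const nativeOnly: Serializer = (str) => JSON.stringify(str);

// Returns calls per millisecond for one serializer over a fixed input set.
function measure(fn: Serializer, inputs: string[], rounds: number): number {
  let sink = 0; // accumulates output lengths so the calls are not dead code
  const start = Date.now();
  for (let r = 0; r < rounds; r++) {
    for (const input of inputs) sink += fn(input).length;
  }
  const elapsed = Math.max(1, Date.now() - start); // avoid division by zero
  if (sink < 0) throw new Error("unreachable");
  return (rounds * inputs.length) / elapsed;
}

const inputs = ["short", "with \"quote\"", "x".repeat(200)];
console.log("regexFirst", measure(regexFirst, inputs, 10000));
console.log("nativeOnly", measure(nativeOnly, inputs, 10000));
```

Absolute numbers from a harness like this depend heavily on the JavaScript engine, warm-up, and the input mix, so they are only meaningful when candidates are compared against each other in the same run.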
Can you please update the benchmarks?
Btw, I updated the serializer for asString, and I think you are using a similar serializer for strings. Maybe you want to adapt it for more performance. https://github.com/fastify/fast-json-stringify/commit/17bb4c2430c60a44079ef572766a8150ab70aefc#diff-8e3d45dd0e9ec504195499d7fafe3efc08756c349fce602ef6538c593aa563d8
Also, the benchmarks are a little bit unfair. E.g., you are benchmarking stringify, but fast-json-stringify is not only stringifying but also doing some assertions, like checking for required fields.