updated fast-json-stringify

Uzlopak commented 1 year ago

Can you please update the benchmarks?

Btw. I updated the serializer for asString and I think you are using a similar serializer for strings. Maybe you want to adapt it for more performance. https://github.com/fastify/fast-json-stringify/commit/17bb4c2430c60a44079ef572766a8150ab70aefc#diff-8e3d45dd0e9ec504195499d7fafe3efc08756c349fce602ef6538c593aa563d8

Also the benchmarks are a little bit unfair. E.g. you are benching for stringify but fast-json-stringify is not only stringifying but also does some assertions, like checking for required fields.

samchon commented 1 year ago

https://dev.to/samchon/good-bye-typescript-is-ancestor-of-typia-20000x-faster-validator-49fi

Around the same time, I had made another library named typescript-json.

It performs AoT (Ahead of Time) compliation like typescript-is, but it was not for runtime validation, but for JSON schema generation. About JSON serialization boosting, typescript-json had utilized fast-json-stringify by automatically generated JSON schema.

For reference, purpose of typescript-json was to accomplish below nestia, generating Swagger Documents with pure TypeScript type.

Thanks for notification. As typia had started from wrapper of fast-json-stringify (at that time, package name was typescript-json), I'm still using string serializer of fast-json-stringify. I'll update string serializer following fast-json-stringify. Also, I'll perform benchmark again with latest version of fast-json-stringify.

By the way, typia has four stringify functions and assertStringify is simliar with fast-json-stringify. Benchmark result is showing all of those typia stringify functions (https://typia.io/docs/json/stringify/#performance), but as normal users don't know the difference, I can understand why you feel it as unfair. About this topic, how about adding a comment that fast-json-stringify is similar with typia.assertStringify() under the graph? If you want, you can edit https://github.com/samchon/typia/blob/master/website/pages/docs/json/stringify.mdx#L497-L507, and send a PR for that.

stringify
isStringify
assertStringify
validateStringify

Uzlopak commented 1 year ago

Sure, I will provide a PR.

Please check if the patch for serializer for asString makes a perf boost. If you see a performance regression, than I would need to investigate.

samchon commented 1 year ago

You can measure benchmark of typia.stringify<T>() function by below commands.

# PREVIOUS SMALL-STRING
git clone https://github.com/samchon/typia -b features/original typia@original
cd typia@original
npm install
npm run benchmark

# ADVANCED SMALL-STRING
cd ..
git clone https://github.com/samchon/typia -b features/stringify typia@advanced
cd typia@advanced
npm install
npm run benchmark

By the way, as my benchmark program of stringify function handling composite types, it would much better to make a new dedicated benchmark function for only your asStringSmall() function.

samchon commented 1 year ago

https://github.com/samchon/typia/blob/features/stringify/test/issues/736.ts

I made a dedicated benchmark, and you can run it through:

git clone https://github.com/samchon/typia -b features/stringify typia@stringify
cd typia@stringify
npm install
npm run issue 736

samchon commented 1 year ago

@Uzlopak Fixed length to be 41, and new algorithm became slower than before

previous 6867.585931699803
advanced 6861.237658316788

When fix length to be 5, and new algorithm is still slower

previous 33501.367175249296
advanced 33126.283614757245

Uzlopak commented 1 year ago

Yes, the number 42 is super arbitrary. I asked @mcollina why 42, but it seems it was a random number.

It should be investigated when the breaking point is useful.

Uzlopak commented 1 year ago

didnt you write that in a special case it was double the perf?

samchon commented 1 year ago

@Uzlopak Oops, I missed the STR_ESCAPE pattern checking.

After adjusting that, new algorithm became faster when length 41. However, length 5 became slower.

The 2x faster was by my mistake that skipping length > 41 checking.

# LENGTH 41
previous 5193.547705809435
advanced 5982.83813974943

# LENGTH 5
previous 23195.423057822234
advanced 20127.113157546624

Uzlopak commented 1 year ago

Your code is different from ours.

Your code is length > 42 and call JSON.stringify, str escape check and wrap in double quotes, then fallback to simple case

It has to be length < 42 do simple case, str escape check and wrap in double quotes, fallback to JSON.stringify

samchon commented 1 year ago

@Uzlopak Oops, did lots of mistake. I repeated it carefully, and this may be the final result.

Sorry for repeated mistakes. It was so confusing because I did another at the same time.

Tried sequence of if conditions, but it was not a matter.

# LENGTH 41
previous 6739.206407670302
advanced 19771.155382046658

# LENGTH 5
previous 30564.183105977772
advanced 32337.116211885812

Uzlopak commented 1 year ago

Can you investigate if 42 as string length is optimal?

samchon commented 1 year ago

Well, as regex format condition newly added, I should consider which string to be used.

Do you have any idea about it?

Uzlopak commented 1 year ago

Make a very string matching this regex /[a-z0-9]/+. At the end put a double quote "

samchon commented 1 year ago

Normal characters

typia@4.1.15 issue node test/issue 736-normal

Limit	Native	Optimized	Gap
#10	13,231	35,170	165.82 %
#20	10,373	20,038	93.17 %
#30	8,009	13,359	66.81 %
#40	5,733	8,997	56.93 %
#50	5,005	7,741	54.65 %
#60	5,260	7,851	49.28 %
#70	4,536	5,658	24.76 %
#80	3,637	4,928	35.49 %
#90	3,364	4,207	25.06 %
#100	2,504	3,829	52.92 %
#200	1,913	1,932	1.02 %
#300	1,312	1,359	3.6 %
#400	1,052	1,006	-4.41 %
#500	826	857	3.74 %
#600	781	884	13.17 %
#700	761	780	2.49 %
#800	669	552	-17.5 %
#900	521	478	-8.19 %
#1,000	449	462	2.96 %

Surrounded by `"` characters

typia@4.1.15 issue node test/issue 736-special

Limit	Native	Optimized	Gap
#10	11,640	14,012	20.38 %
#20	9,499	14,581	53.5 %
#30	6,159	10,490	70.31 %
#40	6,104	9,964	63.24 %
#50	5,640	8,743	55.03 %
#60	5,454	7,924	45.29 %
#70	4,794	6,160	28.48 %
#80	3,978	4,740	19.15 %
#90	3,949	5,202	31.71 %
#100	3,170	4,363	37.65 %
#200	1,983	2,673	34.78 %
#300	1,323	1,569	18.64 %
#400	1,062	1,355	27.56 %
#500	907	1,151	26.99 %
#600	814	963	18.29 %
#700	694	847	22.17 %
#800	642	746	16.12 %
#900	592	652	10.15 %
#1,000	509	575	13.12 %

Only regex pattern

typia@4.1.15 issue node test/issue 736-regex

Limit	Native	Optimized	Gap
#10	12,645	28,811	127.84 %
#20	10,155	26,702	162.93 %
#30	8,256	22,639	174.21 %
#40	6,658	23,138	247.53 %
#50	5,996	19,395	223.45 %
#60	5,562	15,214	173.54 %
#70	4,349	13,053	200.17 %
#80	4,028	11,180	177.55 %
#90	3,683	9,873	168.04 %
#100	3,027	9,763	222.49 %
#200	1,904	5,624	195.35 %
#300	1,467	4,105	179.73 %
#400	1,102	3,325	201.69 %
#500	886	2,721	207.22 %
#600	773	2,318	199.76 %
#700	687	2,056	199.52 %
#800	610	1,751	187.34 %
#900	570	1,239	117.42 %
#1,000	483	1,752	262.94 %

samchon commented 1 year ago

You also run that command, and determine which length to be use.

In my opinion, the length 42 is reasonable because 50 seems like the diminishing margin.

samchon commented 1 year ago

Comparing regex and special case, current code seems reasonable.

Diminishing margin of manual serialization logic is about 40 to 50.

Also, even though target string over the 42 length, regex pattern extremely diminish the serialization time.

Uzlopak commented 1 year ago

Currently running the benchmarks:

https://github.com/fastify/fast-json-stringify/pull/637

samchon commented 1 year ago

I'm just confused by only regex filtered case. It is even faster than optimized case when no special character exists.

When special character exists, advanced manual stringify logic is faster, so I'm considering below implementation.

How do you think about below code, @Uzlopak ?

export const $string = (str: string): string => {
    if (STR_ESCAPE.test(str) === false) 
        return `"${str}"`;

    if (str.length > 41)
        return JSON.stringify(str);

    ...OPTIMIZED LOGIC
}

samchon commented 1 year ago

Currently running the benchmarks:

fastify/fast-json-stringify#637

Great enhancement on short string, but short string with double quote be decreased.

Uzlopak commented 1 year ago

In my opinion:

The regex is theoretically always slower than processing every character in a for loop. Doing first the regex, means that this is the geneeral bottleneck. even though the optimized logic is also handling the same unicoode and double quotes etc.

The length check seems to be the cheapest operation from all. So thats why I would do that one first.

Uzlopak commented 1 year ago

benchmarking now your consideration

https://github.com/fastify/fast-json-stringify/pull/637

Uzlopak commented 1 year ago

Always a tradeoff. The expectation should be actually that short string without escape characters are the majority and the strings with escape characters are exception. Also short strings are more common than huge strings

So personally I think

export const $string = (str: string): string => {
    if (STR_ESCAPE.test(str) === false) 
        return `"${str}"`;

    if (str.length > 41)
        return JSON.stringify(str);

    ...OPTIMIZED LOGIC
}

is the better tradeoff regarding the benchmarks.

But somehow the benchmarks are counter intuitive

samchon / typia