influxdata / line-protocol

MIT License

influxdata: check utf-8 validity of decoded input #40

Closed by rogpeppe 3 years ago

rogpeppe commented 3 years ago

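This slows decoding down quite a bit (between roughly 7% and 78% more time per operation in the benchmarks below; encoding time is unaffected), but it needs to be done.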

Also use the corpus directly from the line-protocol-corpus repository and update the associated code.
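
For readers unfamiliar with what a UTF-8 validity check involves, here is a minimal sketch using Go's standard unicode/utf8 package. It is not the code from this PR: the function name firstInvalidUTF8 is hypothetical, and the actual decoder may well validate per token rather than over a whole buffer. It only illustrates the kind of extra pass over the input that accounts for the slowdown.

```go
package main

import "unicode/utf8"

// firstInvalidUTF8 is a hypothetical helper (not part of line-protocol).
// It returns the byte offset of the first invalid UTF-8 sequence in data,
// or -1 if data is entirely valid UTF-8.
func firstInvalidUTF8(data []byte) int {
	// Fast path: utf8.Valid scans the whole slice once.
	if utf8.Valid(data) {
		return -1
	}
	// Slow path: decode rune by rune to locate the offending byte.
	for i := 0; i < len(data); {
		r, size := utf8.DecodeRune(data[i:])
		// An invalid encoding decodes as (RuneError, 1); a correctly
		// encoded U+FFFD decodes as (RuneError, 3) and is accepted.
		if r == utf8.RuneError && size == 1 {
			return i
		}
		i += size
	}
	return -1
}
```

Calling firstInvalidUTF8([]byte("cpu v=\xff")) reports offset 6, the position of the stray 0xff byte, while a well-formed line yields -1. Even the fast path touches every input byte, which is consistent with the cost showing up across all of the decode benchmarks.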

name                                                           old time/op    new time/op    delta
DecodeEntriesSkipping/long-lines-8                               21.9ms ± 1%    26.4ms ± 1%  +20.51%  (p=0.008 n=5+5)
DecodeEntriesSkipping/long-lines-with-escapes-8                  23.4ms ± 3%    30.1ms ± 0%  +28.28%  (p=0.016 n=5+4)
DecodeEntriesSkipping/single-short-line-8                         315ns ± 1%     425ns ± 0%  +34.70%  (p=0.008 n=5+5)
DecodeEntriesSkipping/single-short-line-with-escapes-8            318ns ± 1%     435ns ± 1%  +36.70%  (p=0.008 n=5+5)
DecodeEntriesSkipping/many-short-lines-8                          188ms ± 1%     281ms ± 1%  +49.35%  (p=0.008 n=5+5)
DecodeEntriesSkipping/field-key-escape-not-escapable-8            279ns ± 1%     397ns ± 5%  +42.18%  (p=0.008 n=5+5)
DecodeEntriesSkipping/tag-value-triple-escape-space-8             339ns ± 0%     484ns ± 5%  +42.92%  (p=0.008 n=5+5)
DecodeEntriesSkipping/procstat-8                                 3.26µs ± 0%    5.78µs ± 4%  +77.55%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-8                        25.4ms ± 0%    27.3ms ± 5%   +7.81%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8           97.4ms ± 0%   104.1ms ± 2%   +6.89%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/single-short-line-8                  341ns ± 1%     443ns ± 4%  +29.95%  (p=0.016 n=4+5)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8     368ns ± 6%     430ns ± 1%  +16.63%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/many-short-lines-8                   215ms ± 1%     268ms ± 1%  +24.90%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8     304ns ± 0%     368ns ± 0%  +21.05%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8      378ns ± 0%     462ns ± 0%  +22.46%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/procstat-8                          5.33µs ± 1%    8.21µs ± 0%  +54.14%  (p=0.008 n=5+5)
Encode/strict/100-points-8                                       86.7µs ±19%    89.7µs ±42%     ~     (p=0.690 n=5+5)
Encode/strict/1-point-8                                           759ns ± 0%     756ns ± 1%     ~     (p=0.310 n=5+5)
Encode/lax/100-points-8                                          57.0µs ± 1%    56.3µs ± 1%     ~     (p=0.057 n=4+4)
Encode/lax/1-point-8                                              575ns ± 1%     563ns ± 3%     ~     (p=0.095 n=5+5)

name                                                           old speed      new speed      delta
DecodeEntriesSkipping/long-lines-8                             1.20GB/s ± 1%  0.99GB/s ± 1%  -17.02%  (p=0.008 n=5+5)
DecodeEntriesSkipping/long-lines-with-escapes-8                1.12GB/s ± 3%  0.87GB/s ± 0%  -22.06%  (p=0.016 n=5+4)
DecodeEntriesSkipping/single-short-line-8                      91.9MB/s ± 1%  68.3MB/s ± 0%  -25.77%  (p=0.008 n=5+5)
DecodeEntriesSkipping/single-short-line-with-escapes-8          100MB/s ± 1%    74MB/s ± 1%  -26.84%  (p=0.008 n=5+5)
DecodeEntriesSkipping/many-short-lines-8                        139MB/s ± 1%    93MB/s ± 1%  -33.05%  (p=0.008 n=5+5)
DecodeEntriesSkipping/field-key-escape-not-escapable-8          118MB/s ± 1%    83MB/s ± 5%  -29.63%  (p=0.008 n=5+5)
DecodeEntriesSkipping/tag-value-triple-escape-space-8           148MB/s ± 0%   103MB/s ± 5%  -29.97%  (p=0.008 n=5+5)
DecodeEntriesSkipping/procstat-8                                407MB/s ± 0%   230MB/s ± 4%  -43.65%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-8                      1.03GB/s ± 0%  0.96GB/s ± 4%   -7.16%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8          269MB/s ± 0%   252MB/s ± 2%   -6.42%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/single-short-line-8               85.1MB/s ± 1%  65.5MB/s ± 4%  -22.98%  (p=0.016 n=4+5)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8  87.0MB/s ± 6%  74.5MB/s ± 1%  -14.34%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/many-short-lines-8                 122MB/s ± 1%    98MB/s ± 1%  -19.93%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8   108MB/s ± 0%    90MB/s ± 0%  -17.39%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8    132MB/s ± 0%   108MB/s ± 0%  -18.35%  (p=0.008 n=5+5)
DecodeEntriesWithoutSkipping/procstat-8                         249MB/s ± 1%   162MB/s ± 0%  -35.13%  (p=0.008 n=5+5)
Encode/strict/100-points-8                                      226MB/s ±17%   225MB/s ±33%     ~     (p=0.690 n=5+5)
Encode/strict/1-point-8                                         254MB/s ± 0%   255MB/s ± 1%     ~     (p=0.310 n=5+5)
Encode/lax/100-points-8                                         338MB/s ± 1%   343MB/s ± 1%     ~     (p=0.057 n=4+4)
Encode/lax/1-point-8                                            336MB/s ± 1%   343MB/s ± 3%     ~     (p=0.095 n=5+5)

name                                                           old alloc/op   new alloc/op   delta
DecodeEntriesSkipping/long-lines-8                                 512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesSkipping/long-lines-with-escapes-8                    512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesSkipping/single-short-line-8                          512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesSkipping/single-short-line-with-escapes-8             512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesSkipping/many-short-lines-8                           512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesSkipping/field-key-escape-not-escapable-8             512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesSkipping/tag-value-triple-escape-space-8              512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesSkipping/procstat-8                                   512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-8                          512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8           19.5kB ± 0%    19.5kB ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-8                   512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8      512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/many-short-lines-8                    512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8      512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8       512B ± 0%      512B ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/procstat-8                            512B ± 0%      512B ± 0%     ~     (all equal)
Encode/strict/100-points-8                                        109kB ±11%     110kB ±12%     ~     (p=0.881 n=5+5)
Encode/strict/1-point-8                                            979B ± 1%     1171B ± 0%  +19.61%  (p=0.008 n=5+5)
Encode/lax/100-points-8                                           111kB ± 0%     111kB ± 0%     ~     (p=0.905 n=4+5)
Encode/lax/1-point-8                                             1.12kB ± 0%    1.11kB ± 0%   -0.66%  (p=0.016 n=5+5)

name                                                           old allocs/op  new allocs/op  delta
DecodeEntriesSkipping/long-lines-8                                 1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesSkipping/long-lines-with-escapes-8                    1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesSkipping/single-short-line-8                          1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesSkipping/single-short-line-with-escapes-8             1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesSkipping/many-short-lines-8                           1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesSkipping/field-key-escape-not-escapable-8             1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesSkipping/tag-value-triple-escape-space-8              1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesSkipping/procstat-8                                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-8                          1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/long-lines-with-escapes-8             8.00 ± 0%      8.00 ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/single-short-line-with-escapes-8      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/many-short-lines-8                    1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/field-key-escape-not-escapable-8      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/tag-value-triple-escape-space-8       1.00 ± 0%      1.00 ± 0%     ~     (all equal)
DecodeEntriesWithoutSkipping/procstat-8                            1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Encode/strict/100-points-8                                         0.00           0.00          ~     (all equal)
Encode/strict/1-point-8                                            0.00           0.00          ~     (all equal)
Encode/lax/100-points-8                                            0.00           0.00          ~     (all equal)
Encode/lax/1-point-8                                               0.00           0.00          ~     (all equal)
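
As a rough guide to where the four tables above come from, the sketch below shows a Go benchmark that would produce the same columns. It is an assumption-laden illustration, not one of the benchmarks in this repository: benchValidate is a hypothetical name, and it times only utf8.Valid rather than the real decoder. Calling b.SetBytes is what makes the speed (MB/s) figures available, and b.ReportAllocs is what produces alloc/op and allocs/op.

```go
package main

import (
	"testing"
	"unicode/utf8"
)

// benchValidate is a hypothetical benchmark driver (not from line-protocol).
// It times utf8.Valid over input and returns the collected result.
func benchValidate(input []byte) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		// Bytes processed per iteration; needed for the MB/s column.
		b.SetBytes(int64(len(input)))
		// Record allocation statistics (alloc/op, allocs/op).
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if !utf8.Valid(input) {
				b.Fatal("unexpected invalid UTF-8")
			}
		}
	})
}
```

The returned testing.BenchmarkResult exposes NsPerOp, AllocedBytesPerOp and AllocsPerOp, which correspond to the time/op, alloc/op and allocs/op columns. Comparing several runs of an old and a new build with benchstat then yields the delta and p-value columns shown in the tables.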