tinylib / msgp

A Go code generator for MessagePack / msgpack.org[Go]
MIT License

Use non-pointer receiver for Marshal and Size #155

Closed hectorj closed 8 years ago

hectorj commented 8 years ago

I see from this part of the code that the generator prefers pointer receivers for structs with more than 3 fields and for arrays.

This seems unnecessary (these methods do not modify the data they operate on, so they do not require a pointer) and is inconvenient (I'd like to be able to Marshal my structs without referencing them, which possibly increases the generated garbage).

Folks at Easyjson accepted my PR after checking benchmarks. Would you accept something similar for the msgp generator?

Or is there some benchmark showing that the use of a pointer receiver actually improves performance?

philhofer commented 8 years ago

This has been tried before, and demonstrated worse performance. See https://github.com/tinylib/msgp/pull/72

You shouldn't have to address your locals to call a method; in the expression a.foo(), foo can take a as a pointer receiver. Conversely, if you're addressing the struct in order to cast it to an interface, it's already being boxed, so it will live on the heap regardless of the method receiver. In any case, escape analysis should rarely (perhaps never?) conclude that the MarshalMsg or Size methods cause the receiver to escape.
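
A minimal sketch of the first point, assuming a hypothetical `Point` type with a placeholder `Size` method (neither is msgp's generated code): a method with a pointer receiver can be called directly on an addressable local, because Go rewrites `p.Size()` as `(&p).Size()`.

```go
package main

import "fmt"

// Point stands in for a generated type; the name and fields are illustrative.
type Point struct{ X, Y, Z, W int }

// Size has a pointer receiver, like the methods discussed in this issue.
// The arithmetic is a placeholder, not msgp's real size computation.
func (p *Point) Size() int { return 1 + 4*9 }

// sizeOfLocal calls the pointer-receiver method on a plain local value.
func sizeOfLocal() int {
	var p Point     // addressable local; the caller never writes &p
	return p.Size() // shorthand for (&p).Size()
}

func main() {
	fmt.Println(sizeOfLocal())
}
```

The shorthand only applies to addressable operands such as variables; it does not apply to a value that has already been stored in an interface.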

hectorj commented 8 years ago

Your point about escape analysis seems right. So it is mostly about convenience: at some place in my code I take an i interface{}, and later have a type switch checking if i is msgp.Encodable.

With pointer receivers, if I pass just my struct, it is not; I have to pass a pointer to my struct, which doesn't fit my needs.
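
The behaviour described here can be reproduced with a small sketch. `Sizer` is a hypothetical stand-in for an interface such as msgp.Encodable, and `Point` is an illustrative type; with a pointer receiver, only `*Point` has the method in its method set, so a bare `Point` stored in an `interface{}` does not match.

```go
package main

import "fmt"

// Sizer is a hypothetical stand-in for an interface like msgp.Encodable.
type Sizer interface{ Size() int }

// Point is an illustrative type, not taken from msgp.
type Point struct{ X, Y int }

// Size is declared on *Point, so only *Point implements Sizer.
func (p *Point) Size() int { return 3 }

// isSizer reports whether the dynamic type of i implements Sizer.
func isSizer(i interface{}) bool {
	switch i.(type) {
	case Sizer:
		return true
	default:
		return false
	}
}

func main() {
	p := Point{1, 2}
	fmt.Println(isSizer(p))  // value: Size is not in Point's method set
	fmt.Println(isSizer(&p)) // pointer: *Point has Size
}
```

Declaring `Size` on the value type instead would make both calls match, since the method set of `*Point` includes the methods declared on `Point`.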

About performance, I see @zond said:

Never mind, I just made it all work with my branch, and my first trivial timing of the new performance showed worse performance.

But running the benchmarks on master and on the PR branch, performance does not seem to be affected (I only get some small variance going both ways, which was expected):

$ go version
go version go1.5.3 linux/amd64
# master
$ git rev-parse HEAD
cf4d6d402b01d9b359f52fc88be0f582402177c0
$ go install ./...
$ go generate ./...
======== MessagePack Code Generator =======
>>> Input: "defs_test.go"
>>> Wrote and formatted "defgen_test.go"
$ go test -v -cpu=2 ./... -bench .
# [All tests pass]
PASS
BenchmarkLocate-2               20000000            97.8 ns/op   531.83 MB/s           0 B/op          0 allocs/op
BenchmarkReadWriteFloat32-2     20000000            80.0 ns/op
BenchmarkReadWriteFloat64-2     20000000            82.3 ns/op
BenchmarkUnmarshalAsJSON-2       1000000          1698 ns/op      93.59 MB/s          16 B/op          1 allocs/op
BenchmarkCopyToJSON-2            1000000          1989 ns/op      79.92 MB/s          48 B/op          1 allocs/op
BenchmarkStdlibJSON-2             200000          5783 ns/op      29.40 MB/s         920 B/op         36 allocs/op
BenchmarkReadMapHeaderBytes-2   200000000            8.32 ns/op  360.43 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeaderBytes-2 200000000            7.61 ns/op  394.27 MB/s           0 B/op          0 allocs/op
BenchmarkReadNilByte-2          1000000000           2.72 ns/op  367.11 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64Bytes-2     200000000            9.20 ns/op  978.47 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32Bytes-2     300000000            5.27 ns/op  948.08 MB/s           0 B/op          0 allocs/op
BenchmarkReadBoolBytes-2        200000000            6.24 ns/op  160.23 MB/s           0 B/op          0 allocs/op
BenchmarkReadTimeBytes-2        100000000           16.3 ns/op   920.73 MB/s           0 B/op          0 allocs/op
BenchmarkSkipBytes-2            10000000           164 ns/op     908.50 MB/s           0 B/op          0 allocs/op
BenchmarkReadMapHeader-2        100000000           20.9 ns/op    95.71 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeader-2      100000000           20.6 ns/op    96.94 MB/s           0 B/op          0 allocs/op
BenchmarkReadNil-2              100000000           16.9 ns/op    59.09 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64-2          50000000            25.6 ns/op   351.97 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32-2          100000000           22.2 ns/op   225.27 MB/s           0 B/op          0 allocs/op
BenchmarkReadInt64-2            100000000           24.8 ns/op   161.34 MB/s           0 B/op          0 allocs/op
BenchmarkReadUint64-2           50000000            24.8 ns/op    80.63 MB/s           0 B/op          0 allocs/op
BenchmarkRead16Bytes-2          30000000            40.1 ns/op   448.85 MB/s           0 B/op          0 allocs/op
BenchmarkRead256Bytes-2         20000000           100 ns/op    2574.03 MB/s           0 B/op          0 allocs/op
BenchmarkRead2048Bytes-2         3000000           510 ns/op    4014.73 MB/s           0 B/op          0 allocs/op
BenchmarkRead16StringAsBytes-2  30000000            40.2 ns/op   422.87 MB/s           0 B/op          0 allocs/op
BenchmarkRead256StringAsBytes-2 20000000           107 ns/op    2418.58 MB/s           0 B/op          0 allocs/op
BenchmarkRead16String-2         10000000           127 ns/op     133.53 MB/s          16 B/op          1 allocs/op
BenchmarkRead256String-2         5000000           266 ns/op     971.61 MB/s         256 B/op          1 allocs/op
BenchmarkReadComplex64-2        50000000            31.5 ns/op   317.30 MB/s           0 B/op          0 allocs/op
BenchmarkReadComplex128-2       50000000            38.7 ns/op   464.71 MB/s           0 B/op          0 allocs/op
BenchmarkReadTime-2             30000000            38.9 ns/op   385.45 MB/s           0 B/op          0 allocs/op
BenchmarkSkip-2                  5000000           413 ns/op     360.76 MB/s           0 B/op          0 allocs/op
BenchmarkAppendMapHeader-2      200000000            8.02 ns/op        0 B/op          0 allocs/op
BenchmarkAppendArrayHeader-2    200000000            7.92 ns/op        0 B/op          0 allocs/op
BenchmarkAppendFloat64-2        100000000           12.2 ns/op   736.77 MB/s           0 B/op          0 allocs/op
BenchmarkAppendFloat32-2        100000000           10.4 ns/op   479.46 MB/s           0 B/op          0 allocs/op
BenchmarkAppendInt64-2          100000000           18.6 ns/op         0 B/op          0 allocs/op
BenchmarkAppendUint64-2         100000000           19.3 ns/op         0 B/op          0 allocs/op
BenchmarkAppend16Bytes-2        100000000           18.2 ns/op  1156.18 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256Bytes-2       50000000            27.2 ns/op  9584.15 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048Bytes-2      10000000           109 ns/op    18743.55 MB/s          0 B/op          0 allocs/op
BenchmarkAppend16String-2       100000000           16.7 ns/op  1254.41 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256String-2      50000000            28.5 ns/op  9154.87 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048String-2     20000000            81.2 ns/op  25293.70 MB/s          0 B/op          0 allocs/op
BenchmarkAppendBool-2           300000000            3.60 ns/op  277.61 MB/s           0 B/op          0 allocs/op
BenchmarkAppendTime-2           50000000            24.2 ns/op   620.99 MB/s           0 B/op          0 allocs/op
BenchmarkWriteMapHeader-2       200000000            8.64 ns/op        0 B/op          0 allocs/op
BenchmarkWriteArrayHeader-2     200000000            8.97 ns/op        0 B/op          0 allocs/op
BenchmarkWriteFloat64-2         100000000           14.4 ns/op   624.26 MB/s           0 B/op          0 allocs/op
BenchmarkWriteFloat32-2         100000000           12.2 ns/op   408.34 MB/s           0 B/op          0 allocs/op
BenchmarkWriteInt64-2           100000000           13.8 ns/op   652.07 MB/s           0 B/op          0 allocs/op
BenchmarkWriteUint64-2          100000000           13.9 ns/op   648.52 MB/s           0 B/op          0 allocs/op
BenchmarkWrite16Bytes-2         50000000            23.5 ns/op         0 B/op          0 allocs/op
BenchmarkWrite256Bytes-2        50000000            33.1 ns/op         0 B/op          0 allocs/op
BenchmarkWrite2048Bytes-2       20000000            86.8 ns/op         0 B/op          0 allocs/op
BenchmarkWriteTime-2            50000000            24.6 ns/op   610.03 MB/s           0 B/op          0 allocs/op
BenchmarkWriteReadFile-2         2000000           766 ns/op     105.71 MB/s
ok      github.com/tinylib/msgp/msgp    374.252s
# PR 156
$ git checkout avoid-pointers-receivers 
Switched to branch 'avoid-pointers-receivers'
Your branch is up-to-date with 'fork/avoid-pointers-receivers'.
$ git rev-parse HEAD
4416ec38a88dcd4b55b36ff34d92950d684edc1f
$ go install ./...
$ go generate ./...
======== MessagePack Code Generator =======
>>> Input: "defs_test.go"
>>> Wrote and formatted "defgen_test.go"
$ go test -v -cpu=2 ./... -bench .
# [All tests pass]
PASS
BenchmarkLocate-2               20000000            97.9 ns/op   531.23 MB/s           0 B/op          0 allocs/op
BenchmarkReadWriteFloat32-2     20000000            85.8 ns/op
BenchmarkReadWriteFloat64-2     20000000            81.5 ns/op
BenchmarkUnmarshalAsJSON-2       1000000          1891 ns/op      84.05 MB/s          16 B/op          1 allocs/op
BenchmarkCopyToJSON-2            1000000          2206 ns/op      72.05 MB/s          48 B/op          1 allocs/op
BenchmarkStdlibJSON-2             200000          7715 ns/op      22.03 MB/s         920 B/op         36 allocs/op
BenchmarkReadMapHeaderBytes-2   200000000            8.41 ns/op  356.77 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeaderBytes-2 200000000            7.77 ns/op  386.24 MB/s           0 B/op          0 allocs/op
BenchmarkReadNilByte-2          500000000            2.95 ns/op  339.48 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64Bytes-2     200000000            9.36 ns/op  961.76 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32Bytes-2     300000000            5.46 ns/op  916.55 MB/s           0 B/op          0 allocs/op
BenchmarkReadBoolBytes-2        200000000            6.66 ns/op  150.10 MB/s           0 B/op          0 allocs/op
BenchmarkReadTimeBytes-2        100000000           16.1 ns/op   930.48 MB/s           0 B/op          0 allocs/op
BenchmarkSkipBytes-2            10000000           171 ns/op     868.05 MB/s           0 B/op          0 allocs/op
BenchmarkReadMapHeader-2        100000000           20.3 ns/op    98.31 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeader-2      100000000           20.6 ns/op    96.97 MB/s           0 B/op          0 allocs/op
BenchmarkReadNil-2              100000000           16.6 ns/op    60.16 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64-2          50000000            27.5 ns/op   327.17 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32-2          50000000            24.0 ns/op   208.24 MB/s           0 B/op          0 allocs/op
BenchmarkReadInt64-2            50000000            35.9 ns/op   111.41 MB/s           0 B/op          0 allocs/op
BenchmarkReadUint64-2           50000000            27.7 ns/op    72.26 MB/s           0 B/op          0 allocs/op
BenchmarkRead16Bytes-2          30000000            49.9 ns/op   360.56 MB/s           0 B/op          0 allocs/op
BenchmarkRead256Bytes-2         10000000           137 ns/op    1878.40 MB/s           0 B/op          0 allocs/op
BenchmarkRead2048Bytes-2         2000000           547 ns/op    3743.05 MB/s           0 B/op          0 allocs/op
BenchmarkRead16StringAsBytes-2  30000000            46.2 ns/op   368.24 MB/s           0 B/op          0 allocs/op
BenchmarkRead256StringAsBytes-2 10000000           124 ns/op    2075.45 MB/s           0 B/op          0 allocs/op
BenchmarkRead16String-2         10000000           110 ns/op     154.20 MB/s          16 B/op          1 allocs/op
BenchmarkRead256String-2         5000000           256 ns/op    1010.92 MB/s         256 B/op          1 allocs/op
BenchmarkReadComplex64-2        50000000            37.3 ns/op   267.92 MB/s           0 B/op          0 allocs/op
BenchmarkReadComplex128-2       30000000            49.4 ns/op   364.74 MB/s           0 B/op          0 allocs/op
BenchmarkReadTime-2             50000000            40.7 ns/op   368.44 MB/s           0 B/op          0 allocs/op
BenchmarkSkip-2                  3000000           398 ns/op     374.23 MB/s           0 B/op          0 allocs/op
BenchmarkAppendMapHeader-2      200000000            7.87 ns/op        0 B/op          0 allocs/op
BenchmarkAppendArrayHeader-2    200000000            7.82 ns/op        0 B/op          0 allocs/op
BenchmarkAppendFloat64-2        100000000           11.7 ns/op   768.19 MB/s           0 B/op          0 allocs/op
BenchmarkAppendFloat32-2        200000000            9.69 ns/op  515.96 MB/s           0 B/op          0 allocs/op
BenchmarkAppendInt64-2          100000000           20.2 ns/op         0 B/op          0 allocs/op
BenchmarkAppendUint64-2         100000000           18.9 ns/op         0 B/op          0 allocs/op
BenchmarkAppend16Bytes-2        100000000           19.2 ns/op  1093.52 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256Bytes-2       50000000            27.5 ns/op  9494.18 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048Bytes-2      20000000           109 ns/op    18832.78 MB/s          0 B/op          0 allocs/op
BenchmarkAppend16String-2       100000000           16.5 ns/op  1275.48 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256String-2      50000000            26.6 ns/op  9803.78 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048String-2     20000000            84.7 ns/op  24227.11 MB/s          0 B/op          0 allocs/op
BenchmarkAppendBool-2           300000000            3.59 ns/op  278.51 MB/s           0 B/op          0 allocs/op
BenchmarkAppendTime-2           50000000            23.9 ns/op   626.37 MB/s           0 B/op          0 allocs/op
BenchmarkWriteMapHeader-2       200000000            8.52 ns/op        0 B/op          0 allocs/op
BenchmarkWriteArrayHeader-2     200000000            8.88 ns/op        0 B/op          0 allocs/op
BenchmarkWriteFloat64-2         100000000           13.9 ns/op   645.32 MB/s           0 B/op          0 allocs/op
BenchmarkWriteFloat32-2         100000000           11.3 ns/op   443.03 MB/s           0 B/op          0 allocs/op
BenchmarkWriteInt64-2           100000000           14.4 ns/op   624.81 MB/s           0 B/op          0 allocs/op
BenchmarkWriteUint64-2          100000000           13.1 ns/op   686.05 MB/s           0 B/op          0 allocs/op
BenchmarkWrite16Bytes-2         100000000           22.0 ns/op         0 B/op          0 allocs/op
BenchmarkWrite256Bytes-2        50000000            31.8 ns/op         0 B/op          0 allocs/op
BenchmarkWrite2048Bytes-2       20000000            84.9 ns/op         0 B/op          0 allocs/op
BenchmarkWriteTime-2            50000000            28.0 ns/op   535.39 MB/s           0 B/op          0 allocs/op
BenchmarkWriteReadFile-2         2000000           730 ns/op     110.88 MB/s
ok      github.com/tinylib/msgp/msgp    300.308s

hectorj commented 8 years ago

To show that the small differences between my two benchmark runs are just variance, I did a second run with #156:

PASS
BenchmarkLocate-2               20000000            97.2 ns/op   534.81 MB/s           0 B/op          0 allocs/op
BenchmarkReadWriteFloat32-2     20000000            80.6 ns/op
BenchmarkReadWriteFloat64-2     20000000            81.2 ns/op
BenchmarkUnmarshalAsJSON-2       1000000          1702 ns/op      93.37 MB/s          16 B/op          1 allocs/op
BenchmarkCopyToJSON-2            1000000          2000 ns/op      79.47 MB/s          48 B/op          1 allocs/op
BenchmarkStdlibJSON-2             200000          5663 ns/op      30.02 MB/s         920 B/op         36 allocs/op
BenchmarkReadMapHeaderBytes-2   200000000            7.80 ns/op  384.40 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeaderBytes-2 200000000            8.34 ns/op  359.69 MB/s           0 B/op          0 allocs/op
BenchmarkReadNilByte-2          1000000000           2.84 ns/op  352.63 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64Bytes-2     200000000            9.26 ns/op  971.95 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32Bytes-2     300000000            5.34 ns/op  936.68 MB/s           0 B/op          0 allocs/op
BenchmarkReadBoolBytes-2        200000000            6.70 ns/op  149.27 MB/s           0 B/op          0 allocs/op
BenchmarkReadTimeBytes-2        100000000           16.5 ns/op   908.66 MB/s           0 B/op          0 allocs/op
BenchmarkSkipBytes-2            10000000           168 ns/op     884.07 MB/s           0 B/op          0 allocs/op
BenchmarkReadMapHeader-2        100000000           21.2 ns/op    94.43 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeader-2      100000000           20.8 ns/op    96.05 MB/s           0 B/op          0 allocs/op
BenchmarkReadNil-2              100000000           17.1 ns/op    58.40 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64-2          50000000            35.4 ns/op   254.51 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32-2          50000000            35.3 ns/op   141.69 MB/s           0 B/op          0 allocs/op
BenchmarkReadInt64-2            50000000            23.2 ns/op   172.10 MB/s           0 B/op          0 allocs/op
BenchmarkReadUint64-2           100000000           25.0 ns/op    80.02 MB/s           0 B/op          0 allocs/op
BenchmarkRead16Bytes-2          20000000            52.1 ns/op   345.16 MB/s           0 B/op          0 allocs/op
BenchmarkRead256Bytes-2         20000000           113 ns/op    2271.98 MB/s           0 B/op          0 allocs/op
BenchmarkRead2048Bytes-2         3000000           468 ns/op    4379.13 MB/s           0 B/op          0 allocs/op
BenchmarkRead16StringAsBytes-2  30000000            46.2 ns/op   367.59 MB/s           0 B/op          0 allocs/op
BenchmarkRead256StringAsBytes-2 20000000           135 ns/op    1913.54 MB/s           0 B/op          0 allocs/op
BenchmarkRead16String-2         10000000           119 ns/op     142.13 MB/s          16 B/op          1 allocs/op
BenchmarkRead256String-2         5000000           312 ns/op     827.54 MB/s         256 B/op          1 allocs/op
BenchmarkReadComplex64-2        50000000            37.4 ns/op   267.07 MB/s           0 B/op          0 allocs/op
BenchmarkReadComplex128-2       50000000            42.5 ns/op   423.25 MB/s           0 B/op          0 allocs/op
BenchmarkReadTime-2             30000000            41.4 ns/op   362.65 MB/s           0 B/op          0 allocs/op
BenchmarkSkip-2                  3000000           435 ns/op     341.87 MB/s           0 B/op          0 allocs/op
BenchmarkAppendMapHeader-2      100000000           11.0 ns/op         0 B/op          0 allocs/op
BenchmarkAppendArrayHeader-2    100000000           11.7 ns/op         0 B/op          0 allocs/op
BenchmarkAppendFloat64-2        100000000           19.0 ns/op   472.99 MB/s           0 B/op          0 allocs/op
BenchmarkAppendFloat32-2        100000000           15.4 ns/op   325.56 MB/s           0 B/op          0 allocs/op
BenchmarkAppendInt64-2          50000000            33.7 ns/op         0 B/op          0 allocs/op
BenchmarkAppendUint64-2         50000000            29.4 ns/op         0 B/op          0 allocs/op
BenchmarkAppend16Bytes-2        100000000           21.1 ns/op   993.56 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256Bytes-2       50000000            27.7 ns/op  9420.12 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048Bytes-2      20000000           110 ns/op    18523.14 MB/s          0 B/op          0 allocs/op
BenchmarkAppend16String-2       100000000           25.3 ns/op   829.57 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256String-2      50000000            33.1 ns/op  7886.48 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048String-2     20000000            98.0 ns/op  20952.67 MB/s          0 B/op          0 allocs/op
BenchmarkAppendBool-2           300000000            4.31 ns/op  231.91 MB/s           0 B/op          0 allocs/op
BenchmarkAppendTime-2           50000000            26.1 ns/op   574.90 MB/s           0 B/op          0 allocs/op
BenchmarkWriteMapHeader-2       200000000            9.63 ns/op        0 B/op          0 allocs/op
BenchmarkWriteArrayHeader-2     200000000            9.30 ns/op        0 B/op          0 allocs/op
BenchmarkWriteFloat64-2         100000000           15.2 ns/op   593.26 MB/s           0 B/op          0 allocs/op
BenchmarkWriteFloat32-2         100000000           11.7 ns/op   427.58 MB/s           0 B/op          0 allocs/op
BenchmarkWriteInt64-2           100000000           15.0 ns/op   601.99 MB/s           0 B/op          0 allocs/op
BenchmarkWriteUint64-2          100000000           14.6 ns/op   615.61 MB/s           0 B/op          0 allocs/op
BenchmarkWrite16Bytes-2         100000000           23.5 ns/op         0 B/op          0 allocs/op
BenchmarkWrite256Bytes-2        50000000            36.7 ns/op         0 B/op          0 allocs/op
BenchmarkWrite2048Bytes-2       20000000            92.0 ns/op         0 B/op          0 allocs/op
BenchmarkWriteTime-2            50000000            29.4 ns/op   509.40 MB/s           0 B/op          0 allocs/op
BenchmarkWriteReadFile-2         2000000           822 ns/op      98.48 MB/s

philhofer commented 8 years ago

You need to benchmark the code in ./_generated explicitly; the go tool will ignore it otherwise. All of those benchmarks are for library support code, not generated methods.

philhofer commented 8 years ago

You'll also take a perf hit when you turn a value into interface{} because it will have to be both copied and boxed.
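
A small sketch of the copy half of this point, using a hypothetical `Point` type: converting a struct value to `interface{}` stores a copy, so later changes to the original are not visible through the interface, whereas a boxed pointer still aliases the original. Whether the copy is also heap-allocated depends on the compiler, so the sketch does not measure allocations.

```go
package main

import "fmt"

// Point is an illustrative type, not taken from msgp.
type Point struct{ X, Y, Z, W int }

// box converts a Point value to interface{}; the interface holds its own copy.
func box(p Point) interface{} { return p }

// boxedIsCopy reports whether a boxed value is unaffected by later writes
// to the variable it was copied from.
func boxedIsCopy() bool {
	p := Point{X: 1}
	i := box(p)
	p.X = 2
	return i.(Point).X == 1
}

// boxedPointerAliases reports whether a boxed pointer still observes writes
// made through the original variable.
func boxedPointerAliases() bool {
	p := Point{X: 1}
	var i interface{} = &p
	p.X = 2
	return i.(*Point).X == 2
}

func main() {
	fmt.Println(boxedIsCopy(), boxedPointerAliases())
}
```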

hectorj commented 8 years ago

Ok, the benchmarks in _generated/ indeed show a difference. One more allocation for:

I'll see if I can improve that.

zond commented 8 years ago

To be clear, my benchmark was of my own code, using my fork of msgp.

I was unable to compare my code with the fork vs my code with mainline msgp since I was unable to get my code working with mainline.

I didn't think to benchmark msgp in the fork on its own vs mainline.

hectorj commented 8 years ago

That's making me wonder if I missed something with https://github.com/mailru/easyjson/pull/15 or if some difference in the implementations makes it efficient with easyjson but not with msgp...

Gotta do some digging.

zond commented 8 years ago

I'm not sure I understand what you mean, but just to make sure I'll clarify even more :)

I benchmarked my own code using https://github.com/vmihailenco/msgpack vs my own code using a fork of msgp that just added shims to some more types.

This benchmark showed the unreasonable result that the code became slower with msgp.

This was unreasonable because msgpack used reflection, while msgp uses generated, hard-coded coders.

This made me give up and forget all about it.

TL;DR

I don't believe msgp is slower than msgpack, and I don't necessarily think more indirection via pointers or shims in msgp would make things relevantly slower.

hectorj commented 8 years ago

@zond: oh, thanks for the clarification. To clarify too, my last comment was not about your observations but about the results I get from my own benchmark runs.

The non-pointer-receiver way is less efficient for msgp, but it did not seem to be for easyjson (which does something similar to msgp, just for faster JSON marshaling/unmarshaling).

zond commented 8 years ago

@hectorj Ah, thanks!

Weird, please explain what caused it if you find out :)

hectorj commented 8 years ago

Closing for now, as I haven't been able to produce code with the same features and performance using non-pointer receivers, and I don't have enough time to keep trying.

Thanks all for your input.