francoispqt / gojay

high performance JSON encoder/decoder with stream API for Golang
MIT License
2.11k stars 112 forks source link

When reusing the decoder, gojay does not flush the buffer #108

Open xaionaro opened 5 years ago

xaionaro commented 5 years ago

Hello. I've tried to replace the standard JSON decoder to gojay's. The result:

Here's the code:

https://github.com/trafficstars/put2ch/blob/79f2e1d100bbd181df3cb54e8199995d5f868ded/input_raw_json.go

francoispqt commented 5 years ago

Gojay is used in production on a highly instrumented system handling thousands of requests per second. In the context it's being used it does not leak memory, if it would, we would have noticed. It could be linked with something you're doing specifically. Let me try to reproduce what you are noticing.

francoispqt commented 5 years ago

So the reason is you are reusing the same decoder. It's actually a duplicate of this issue: #79

At the time I suggested to add a reset method. But the OP implemented his own solution. As I want gojay to stick more to the behaviour of the standard API, I will implement the reset after each decode so that the buffer doesn't grow forever if you reuse the Decoder, in the meantime you can get a new decoder between each loop and test.

xaionaro commented 5 years ago

@francoispqt I cannot get new decoder because I have a byte stream of JSON messages and on dropping the old decoder I will crop a part of the next message (I've already tried that before creating the issue).

francoispqt commented 5 years ago

Yes, of course, because it reads the bytes available in the reader into the decoder's internal buffer.

So basically, what we'd need to do, is everytime an item is read, drop the bytes already read by resetting the cursor, this is actually what the stream api already does. But you're not using the stream api. I will implement that in the regular API and evaluate the perf impact.

xaionaro commented 5 years ago

But you're not using the stream api. I will implement that in the regular API and evaluate the perf impact.

In my quick-research I did not understand how to use the stream API for my case. If that is the API that I was supposed to use for my case then don't waste time on me: I will just try to use it.

So am I understand correctly that I just should use the stream API? :)

francoispqt commented 5 years ago

Yes, the best for you would be to use the stream api, it will basically push messages to a channel whenever it's there, so you just have to read from that channel.

But if you want, I have just implemented the reset to reuse the Decoder because I do think there is a usecase for that and I want it to be consistent with the standard json. I have tested and it works correctly with no impact on performance, I've added a test as well. You can try it in the feature/reset-decoder branch.

If it's ok for you, I will merge it to master and tag a new release tomorrow.

xaionaro commented 5 years ago

I will test it tomorrow, thank you very much :)

xaionaro commented 5 years ago

Well, I've switched the branch and it did not help.

gojay# git log | head -2
commit 1cb50f7b876bd5abaa4ffa66c9e28cbb0c3547fe
Author: francoispqt <francois@parquet.ninja>
# grep RSS /proc/`pgrep logger`/status
VmRSS:   3023848 kB                                                                                  

I will try to use the steam API. Anyway, thanks :)

francoispqt commented 5 years ago

Are you running it in a unit test (mocking your udp connection) or you're really testing your stream?

xaionaro commented 5 years ago

Are you running it in a unit test (mocking your udp connection) or you're really testing your stream?

The second one: I have a testing environment with real traffic. So I just run my application and watch how it works.

$ go version
go version go1.12 linux/amd64
francoispqt commented 5 years ago

Ok. Let me setup an environment quickly. It shouldn't grow like that. I know it doesn't with the stream api, as we are using it, and you can try the example in examples/websocket, just run go run main.go and you'll see it doesn't grow.

francoispqt commented 5 years ago

Ok, so I have pushed updates to the branch feature/reset-decoder.

I've added an example which replicates what you are doing, you can find it here: https://github.com/francoispqt/gojay/blob/9df89bce7ea024f84eb4d44efb460bbf755e7b58/examples/reuse-decoder/main.go

You can run the example doing: go run main.go The memory usage does not grow. Please let me know if it works in your case.