httprb / http

HTTP (The Gem! a.k.a. http.rb) - a fast Ruby HTTP client with a chainable API, streaming support, and timeouts
MIT License
3.01k stars 321 forks source link

Avoid force encoding on frozen strings #653

Closed bvicenzo closed 3 years ago

bvicenzo commented 3 years ago

using the gem Twitter, that uses this gem, we're receiving the bellow error: force_encoding

This would have happened on this PR when the strings inside of gem have became frozen also, and the problem was solved making test strings unfrozen.

This PR proposes to make the Response class deal with frozen strings.

ixti commented 3 years ago

thanks for the PR i'm gonna merge it together with the one that changes the api to be more io alike asap

caifara commented 3 years ago

We're running into this too, any chance on a quick release?

tarcieri commented 3 years ago

Can you confirm the problem is fixed? If so, I can cut a release

caifara commented 3 years ago

That's really great! I'm quite ashamed to first ask a fast release only to see the frozen version in our dependencies to be quite a bit older. That said, the mentioned error and changes in this PR point to the exact same problem and the source doesn't seem to have changed much on that point.

Another problem is that we see the error in production sporadically, but not in development. That means there's no straightforward way to be sure the problem is really fixed.

And yet another problem is that I don't really know why the solution proposed in this PR is really helping. Where is the bug in the original code? I can only trigger the bug locally by forcing @contents.freeze in the while loop. As you don't run a release immediately I'm guessing you too aren't entirely sure?

tarcieri commented 3 years ago

Correct, I'm not sure either, and this looks like it will allocate two strings for every read, which will increase GC pressure. In that regard it's probably not the greatest solution.

tarcieri commented 3 years ago

A better strategy is probably:

  1. Check if string matches @encoding. If it does, we're done.
  2. Check if string is frozen. If it isn't, use #force_encoding(@encoding)
  3. If the string is frozen, allocate a new string with the given encoding
cooljeanius commented 2 years ago

This caused sferik/t#452; how do I update its dependency on this to include the fix? I tried something in sferik/t#462 but I'm not sure if it's right... (I don't actually speak ruby)

tarcieri commented 2 years ago

It was released in v5.0.2 in August of last year.

You need to update your Gemfile.lock, e.g. bundle update

janko commented 10 months ago

Correct, I'm not sure either, and this looks like it will allocate two strings for every read, which will increase GC pressure. In that regard it's probably not the greatest solution.

Yeah, the benefit of the previous String#clear implementation was that it deallocates chunks immediately, instead of having to wait for the GC to collect them. So, after this change, HTTP::Response::Body#to_s allocates 2x more memory than it did previously for responses with large bodies (e.g. images).

We have a background job where we need to download many images in a row, and it's currently getting OOM-killed. Deallocating content immediately after its no longer would likely help prevent the memory usage from growing too rapidly, allowing time for GC. I wanted to use Body#to_s for downloading images, but now image chunks will remain in memory that I cannot String#clear, so I'll need to use Body#readpartial instead.