cbeer / solr_wrapper

Wrap your tests with Solr 5+
MIT License
23 stars 20 forks source link

"body has already been consumed" #140

Closed jrochkind closed 3 years ago

jrochkind commented 3 years ago

Today solr_wrapper started failing for me trying to download solr.

(Not sure when the last time I downloaded solr was, since I normally cache it, it's possible my caches were just busted today. So not sure when last known good working was).

I am curious if anyone else is running into this. It happens on my local machine, not just CI, and when pointing at princeton mirror, so it's not a rate-limiting issue.

          SolrWrapper::SolrWrapperError:
            Unable to download solr from http://lib-solr-mirror.princeton.edu/dist/lucene/solr/8.6.3/solr-8.6.3.zip
            body has already been consumed

solr_wrapper: 3.1.2 http gem: reproduces with http 5.0.1 or 4.4.1. At first I thought it might be caused by upgrading to http 5, but does not seem to be.

Anyone else running into this? I am trying to debug, but having trouble figuring out what's going on.

jrochkind commented 3 years ago

This is very odd. If I add debugging stuff, right after this line:

https://github.com/cbeer/solr_wrapper/blob/v3.1.2/lib/solr_wrapper/downloader.rb#L9

If I do response.body.stream!, I get a raise: HTTP::StateError Exception: body has already been consumed

But if in my own code separately I do a response = HTTP.follow.get(same_url), then subsequently response.body.stream! returns true with no raise.

Somehow in the context of solr_wrapper it's consuming all the response body and making it no longer streamable or something, I can't figure out why, my debugging seems to defy the laws of ruby.

jrochkind commented 3 years ago

Also in the context of debugging, the HTTP.follow.get(url) is very slow, it definitely is streaming everything... this could be an artifact of my debugging (debugger console triggering a stream), rather than the actual underlying issue, hmm. Although still having trouble reproducing in a separate console.

jrochkind commented 3 years ago

Aha. It's got something to do with webmock.

Webmock (3.13) is patching HTTP::Client.perform: https://github.com/bblimke/webmock/blob/b709ba22a2949dc3bfac662f3f4da88a21679c2e/lib/webmock/http_lib_adapters/http_rb/client.rb

Something about how it patches ends up calling response.body.to_s, which streams the whole response, and makes streaming impossible.

Things I still don't understand:

From a byebug inside SolrWrapper::Downloader:

HTTP::Client.instance_method(:perform).source_location
["/Users/jrochkind/.gem/ruby/2.6.6/gems/webmock-3.13.0/lib/webmock/http_lib_adapters/http_rb/client.rb", 5]

From a console in the same app:

HTTP::Client.instance_method(:perform).source_location
=> ["/Users/jrochkind/.gem/ruby/2.6.6/gems/http-4.4.1/lib/http/client.rb", 66]
jrochkind commented 3 years ago

Oho, I notice someone here had before noticed Webmock messing with http-rb streaming:

https://github.com/cbeer/solr_wrapper/blob/master/spec/lib/solr_wrapper/instance_spec.rb#L4

So why is it coming into play when I'm not running tests though? Ah, but of course I am running the tests, this is happening in CI and test running when trying to use solr_wrapper to download solr!

But why did this only start recently for me, what changed in my app?

If I put a WebMock.disable! into the relevant SolrBuilder code, it works again -- but why this necessary, and does that make sense as a PR?

jrochkind commented 3 years ago

OK. Yeah, you just really can't try to have SolrBuilder download a Solr if WebMock is active. (https://github.com/httprb/http/issues/212 ?) Possibly the README should warn hard about that, but i guess you can't warn on everything, I don't know.

I'm not sure what I changed in my app to trigger this problem recently. But I am doing some "odd" things with SolrBuilder, maybe.

I worked around it like this:

https://github.com/sciencehistory/scihist_digicoll/pull/1278