cbeer / solr_wrapper

Wrap your tests with Solr 5+
MIT License

should download solr.zip from a mirror #112

Closed jrochkind closed 6 years ago

jrochkind commented 6 years ago

The Solr project (Apache) does not want us downloading from the main Apache site; they want people to use mirrors. I think it could work fine hard-coded to a single mirror that has no rate limits we know of (such as Princeton's). A more complicated alternative, which is also possible, is auto-discovery and random selection of a mirror.

Screen-scraping apache.org is not a polite solution to the rate limiting.

jrochkind commented 6 years ago

Lists of mirrors:

http://www.apache.org/dyn/closer.lua/lucene/solr/

http://www.apache.org/dyn/closer.lua/lucene/solr/7.1.0

Princeton's mirror, which they set up for Samvera use and which I think is not actually listed in the master mirror list:

http://lib-solr-mirror.princeton.edu/dist/lucene/solr/

cbeer commented 6 years ago

solr_wrapper already downloads from a mirror, and uses Apache's own mirror selection script (https://github.com/cbeer/solr_wrapper/blob/818020b44c15620b9bf515755f2cf7dc9af37105/lib/solr_wrapper/configuration.rb#L92) to do so.
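As a rough illustration of what mirror selection can look like, here is a minimal Ruby sketch. It assumes the mirror script can return a JSON document with a `preferred` mirror base URL and a `path_info` path; those key names, the `mirror_download_url` helper, and the sample response are assumptions for illustration, not a copy of the linked code.

```ruby
require 'json'

# Hypothetical helper: build a download URL from a mirror-selection response.
# The "preferred" and "path_info" keys are assumptions about the JSON shape.
def mirror_download_url(json_body)
  doc = JSON.parse(json_body)
  preferred = doc.fetch('preferred')
  path_info = doc.fetch('path_info')
  # Join with exactly one slash between the mirror base and the path.
  "#{preferred.chomp('/')}/#{path_info.sub(%r{\A/}, '')}"
end

# Canned response for illustration only; not real output from apache.org.
sample = '{"preferred": "http://mirror.example.org/apache/", ' \
         '"path_info": "lucene/solr/7.1.0/solr-7.1.0.zip"}'
puts mirror_download_url(sample)
```

Using `fetch` rather than `[]` makes the sketch fail loudly if the response does not have the assumed shape.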

jrochkind commented 6 years ago

Ah, great. So are we no longer having problems with rate limiting?

jcoyne commented 6 years ago

@jrochkind the rate limiting was on our request to their svn server to see what the most recent release was; it was not on the download of Solr itself.

jrochkind commented 6 years ago

Ah, thanks for clarification. I think there has been some confusion about this in the community.

Is that problem solved, or does it still need a solution? I wonder if there's a non-svn, non-rate-limited way for the software to automatically find the latest version that isn't screen-scraping. Maven?

jcoyne commented 6 years ago

I think the REAL problem is that Apache offers no machine-readable way for us to query "What is the most recent released version of Solr?" Chris most recently patched this by screen-scraping their /index.html page.

jrochkind commented 6 years ago

Could Maven repos be a machine-readable way to find out the most recent released version of Solr? I tried looking into that, but wow, Maven is so much messier than rubygems/bundler.

Do we only need the latest version, or do we need all versions (or the latest within a spec like "5.x")?

If only the very latest, I might have found a hacky way.
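To make the "latest overall" versus "latest within a spec like 5.x" distinction concrete, here is a minimal Ruby sketch. It assumes a list of version strings has already been obtained somehow, here from a canned `maven-metadata.xml`-style snippet. The helper names and the sample XML are hypothetical, and this is not necessarily the hacky way mentioned above.

```ruby
require 'rubygems'

# Hypothetical helper: extract <version> entries from a
# maven-metadata.xml-style document using a simple regex.
def versions_from_metadata(xml)
  xml.scan(%r{<version>([^<]+)</version>}).flatten
end

# Hypothetical helper: newest version satisfying a RubyGems-style requirement.
# Returns nil when nothing matches.
def latest_version(versions, requirement = '>= 0')
  req = Gem::Requirement.new(requirement)
  versions.map { |v| Gem::Version.new(v) }
          .select { |v| req.satisfied_by?(v) }
          .max
end

# Sample data for illustration only; not fetched from any repository.
sample_xml = <<~XML
  <metadata>
    <versioning>
      <versions>
        <version>5.5.4</version>
        <version>6.6.2</version>
        <version>7.1.0</version>
      </versions>
    </versioning>
  </metadata>
XML

versions = versions_from_metadata(sample_xml)
puts latest_version(versions)            # latest overall
puts latest_version(versions, '~> 5.0')  # latest within 5.x
```

`Gem::Version` compares versions numerically by segment, so 5.10.0 sorts after 5.9.0, which a plain string sort would get wrong.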

cbeer commented 6 years ago

We need the latest released version. What we're doing now works fine, so until we have a reported problem again, I think it's good enough.

One problem with going to e.g. Maven is that there's often a lag between the Solr people cutting a new release and making a release announcement, updating docs, etc., so I think using their website (like we are now) is probably the best option.