jrochkind closed 7 years ago
Lists of mirrors:
http://www.apache.org/dyn/closer.lua/lucene/solr/
http://www.apache.org/dyn/closer.lua/lucene/solr/7.1.0
Princeton's mirror, which they set up for samvera use and which I think is not actually listed in the master mirror list:
solr_wrapper already downloads from a mirror, and uses Apache's own mirror selection script (https://github.com/cbeer/solr_wrapper/blob/818020b44c15620b9bf515755f2cf7dc9af37105/lib/solr_wrapper/configuration.rb#L92) to do so.
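For anyone following along, here is a minimal sketch of turning a mirror-selection response into a download URL. It is not the solr_wrapper code. The helper name `mirror_download_url`, the `preferred` and `path_info` keys, and the example.org mirror are all assumptions made for illustration.

```ruby
require "json"

# Hypothetical helper: build a download URL from the JSON body returned by a
# mirror selection script. The "preferred" and "path_info" keys are assumed,
# not confirmed anywhere in this thread.
def mirror_download_url(json_body)
  doc = JSON.parse(json_body)
  preferred = doc.fetch("preferred")  # base URL of the suggested mirror
  path = doc.fetch("path_info")       # path of the requested file
  # Join with exactly one slash between the mirror base and the path.
  "#{preferred.chomp('/')}/#{path.sub(%r{\A/}, '')}"
end

# Made-up response body, used only to exercise the helper.
sample = '{"preferred": "http://mirror.example.org/apache/", ' \
         '"path_info": "lucene/solr/7.1.0/solr-7.1.0.zip"}'
puts mirror_download_url(sample)
```

Using `fetch` rather than `[]` means a response without the assumed keys raises a `KeyError` instead of quietly producing a broken URL.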
Ah, great. So are we no longer having problems with rate limiting?
@jrochkind the rate limiting was on our request to their svn server to see what the most recent release was; it was not a limit on the download of Solr itself.
Ah, thanks for the clarification. I think there has been some confusion about this in the community.
Is that problem solved, or does it still need a solution? I wonder if there's a non-svn, non-rate-limited way for the software to automatically find the latest version that isn't screen-scraping? Maven?
I think the REAL problem is that Apache offers no machine-readable way for us to query "What is the most recent released version of Solr"... Chris most recently patched this by screen scraping their /index.html page
Could the maven repos be a machine-readable way to find out the most recent released version of Solr? I tried looking into that, but wow, maven is so much messier than rubygems/bundler.
Do we only need the latest version, or do we need all versions (or the latest within a spec like "5.x")?
If only the very latest, I might have found a hacky way.
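To make the "latest overall" versus "latest within 5.x" distinction concrete, here is a rough sketch. `latest_version` is a hypothetical helper, and the sample text is invented rather than a real Apache page. It leans on `Gem::Version` and `Gem::Requirement`, which ship with Ruby.

```ruby
require "rubygems" # Gem::Version and Gem::Requirement

# Hypothetical helper: pull x.y.z version numbers out of arbitrary text (for
# example a scraped page) and return the highest one as a string, optionally
# limited by a requirement such as "~> 5.0" for "latest 5.x".
# Returns nil when nothing matches.
def latest_version(text, requirement = ">= 0")
  req = Gem::Requirement.new(requirement)
  versions = text.scan(/\b\d+\.\d+\.\d+\b/).uniq.map { |v| Gem::Version.new(v) }
  versions.select { |v| req.satisfied_by?(v) }.max&.to_s
end

# Invented sample text, not a real page.
page = "Solr 7.1.0 released ... Solr 6.6.2 ... Solr 5.5.5 ... Solr 5.5.4"
puts latest_version(page)            # 7.1.0
puts latest_version(page, "~> 5.0")  # 5.5.5
```

The regex is deliberately naive and would pick up any three-part number in the text, so a real implementation would need to narrow where it looks.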
We need the latest released version. What we're doing now works fine, so until we have a reported problem again, I think it's good enough.
One problem with going to e.g. maven is there's often a lag between the Solr people cutting a new release and making a release announcement / updating docs / etc, so I think using their website (like we are now) is probably the best option.
Solr product/Apache does not want us downloading from the main Apache site; they want people to use mirrors. I think it could work fine hard-coded to a single mirror that has no rate limits we know of (such as Princeton's), or, with more complexity, with auto-discovery and random selection of a mirror, which is also possible.
Screen-scraping apache.org is not a polite solution to the rate limiting.
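The two mirror options mentioned above (one hard-coded mirror, or random selection from a discovered list) could be combined roughly like this. `pick_mirror` is a hypothetical helper and every URL is a placeholder; how the list would actually be discovered is left open.

```ruby
# Hypothetical helper: choose a download base URL. With a discovered list,
# pick one at random to spread load across mirrors; with an empty or missing
# list, fall back to a single configured mirror.
def pick_mirror(discovered, fallback, random: Random.new)
  candidates = Array(discovered).reject { |m| m.to_s.empty? }
  return fallback if candidates.empty?

  candidates.sample(random: random)
end

# Placeholder URLs for illustration only.
fallback = "http://mirror.example.edu/apache/"
mirrors  = ["http://mirror-a.example.org/apache/",
            "http://mirror-b.example.org/apache/"]

puts pick_mirror(mirrors, fallback)  # one of the two discovered mirrors
puts pick_mirror([], fallback)       # the hard-coded fallback
```

Passing a seeded `Random` makes the choice repeatable in tests while leaving it random in normal use.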