philbudne closed 9 months ago
NPR seems to use Akamai GHost (specifically Bot Manager), so my guess is that making small changes to the UA string will only be a temporary fix. Continuously fighting Akamai's detection would negate the original goal of honesty and being good internet citizens, so in my opinion we should reach out to whoever handles NPR's network security to see if the UMass IP can be put on an allowlist.
FYI, IA got back and the fetcher they're using for the Media Cloud URL feed identifies itself as Mozilla/5.0 (compatible; archive.org_bot +http://archive.org/details/archive.org_bot). They have a robust system for fetching from different IPs, but I don't remember the details.
I think @NullPxl makes a good case for keeping things as is for now (with this more standard-looking new UA). Let's hand off the idea of reaching out to NPR to the research team and see if they have the capacity/motivation to take it on. Closing for now. Please re-open if this UA behaves in any newly odd way.
It seems that NPR's CDN doesn't like the new UA string @NullPxl came up with, if it comes from our IP (at UMass). It looks like it works if I add something like " Firefox/47.0" to the end. This increases our level of deception (and might only work temporarily). Should we consider doing this after consulting with researchers? Is it worth reaching out to NPR web folk?
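For anyone who wants to reproduce the comparison, here is a rough sketch (not the fetcher's actual code) that requests one URL with the base UA and with each suffixed variant, then reports the HTTP status for each. The `BASE_UA` value is a hypothetical placeholder, since the real string isn't quoted in this thread, and the function names are made up for illustration.

```python
import urllib.error
import urllib.request

# Hypothetical placeholder: the real UA string is not quoted in this thread.
BASE_UA = "Mozilla/5.0 (compatible; example-fetcher/1.0; +https://example.org/bot)"


def candidate_user_agents(base, suffixes):
    """Return the base UA followed by one variant per suffix."""
    return [base] + [base + suffix for suffix in suffixes]


def probe(url, user_agent, timeout=10.0):
    """Fetch url with the given User-Agent and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # A refusal by the server or CDN arrives as an error status (e.g. 403).
        return err.code


def compare(url, base=BASE_UA, suffixes=(" Firefox/47.0",)):
    """Return (status, user_agent) pairs for the base UA and each variant."""
    return [(probe(url, ua), ua) for ua in candidate_user_agents(base, suffixes)]
```

Calling `compare(...)` with an NPR article URL from the affected IP should show whether only the suffixed variant gets through. A single run proves little, since the result may vary by IP and over time.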