theodi / open-orgn-services

Services that support ODI's operation as an open organisation.
MIT License

Customise the User-Agent string on the polling bot #47

Open JeniT opened 11 years ago

JeniT commented 11 years ago

A custom User-Agent lets site operators filter the bot out, find out what it is doing, and get in touch if it starts misbehaving, so it's good practice to set one. I think we should add this now, given that the same polling code may well be used to poll other services and websites in the future.

Best practice seems to be to have a string like:

[codename]/[version] ([human-readable name]; [url]; [email])

e.g.

AskAboutOil/0.06-rcp (Nutch; http://www.nutch.org/docs/en/bot.html; nutch-agent@askaboutoil.com)
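
As a rough sketch of the format above, the string could be assembled from its parts and attached to a request. The helper name `build_user_agent` and every value passed to it (codename, version, URL, email) are placeholders invented for illustration, not anything the polling code currently defines, and plain `Net::HTTP` is used here only because it ships with Ruby.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds a User-Agent string in the form
#   codename/version (human-readable name; url; email)
def build_user_agent(codename:, version:, name:, url:, email:)
  "#{codename}/#{version} (#{name}; #{url}; #{email})"
end

# All values below are placeholders for illustration only.
USER_AGENT = build_user_agent(
  codename: 'ExamplePoller',
  version:  '0.1',
  name:     'Example polling bot',
  url:      'http://example.org/bot.html',
  email:    'bot@example.org'
)

# Attach the header to a request object; nothing is sent over the network here.
request = Net::HTTP::Get.new(URI('http://example.org/feed'))
request['User-Agent'] = USER_AGENT
```

Gems that wrap their own HTTP client will not pick this header up automatically, so each one would need whatever configuration hook it happens to offer, if any.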

tomheath commented 11 years ago

@JeniT @Floppy @pikesley If we're making a broader polling framework, should we split it out into a new repo so people can fork just that if they want to?

Floppy commented 11 years ago

I don't think we're building a framework; we're just calling individual APIs from separate jobs, ideally using pre-existing gems. That will make setting the user-agent potentially a bit more complicated, but it's still a lot of time saved.

tomheath commented 11 years ago

+1 for reuse, but I just wonder if this thing will take on a life of its own and so deserve a dedicated repo. I can see heaps of other applications for it in stuff we wanna do. I'll have a better idea of the amount of custom code after doing the code review, though.

Floppy commented 11 years ago

We currently have four user agents represented in the cassettes:

The two OAuth ones are there simply because some cassettes are a bit older, recorded before we updated. Anyway, we'll have to work out how to change all of these to what we want. It may not be possible in all cases, but we can have a go.

There may also be different HTTP libraries in use within the different gems. I can see the following in the Gemfile on a cursory inspection:

JeniT commented 11 years ago

I'm happy dropping this from this sprint. It's a nice-to-have enhancement, not a vital feature.