oduwsdl / MemGator

A Memento Aggregator CLI and Server in Go
https://memgator.cs.odu.edu/api.html
MIT License

Add a flag to allow user-agent spoofing #62

Closed machawk1 closed 8 years ago

machawk1 commented 8 years ago

This would help keep archives/sources from blocking an instance. An IP address or some other identifier could still be used as a basis for blocking, but if the MemGator user-agent string is treated the way some services treat curl (i.e., they block script-based requests), allowing the user to easily set a spoof flag (e.g., -S) on startup would help prevent this.

ibnesayeed commented 8 years ago

Are we going to have a handful of well-known user-agent strings to randomly pick from? If so, would you mind providing some here?

ibnesayeed commented 8 years ago

I have added three popular browser user-agents, which can be modified later, and more can be added if necessary. The functionality has not yet been tested against an echo server, though.

machawk1 commented 8 years ago

From initial testing, the spoof flag does work, but I am getting the same user-agent with each request instead of a random one of the three supported.
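The thread does not say what caused this. One plausible explanation, offered purely as an assumption, is an unseeded random source: before Go 1.20, the top-level `math/rand` functions behaved as if seeded with 1 unless `rand.Seed` was called, so every fresh process made the same first pick. The sketch below reproduces that effect with an explicit fixed seed; `firstPick` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// firstPick simulates one process start: it builds a generator from seed
// and returns the first index it would choose from a pool of n agents.
func firstPick(seed int64, n int) int {
	return rand.New(rand.NewSource(seed)).Intn(n)
}

func main() {
	const pool = 3
	// Fixed seed: every simulated run picks the same index.
	fmt.Println(firstPick(1, pool), firstPick(1, pool), firstPick(1, pool))
	// Time-based seed: the pick can differ from run to run.
	fmt.Println(firstPick(time.Now().UnixNano(), pool))
}
```

With a fixed seed the three values on the first line are always equal to each other, which matches the symptom of seeing one user-agent across separate invocations.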

echoHeader.js

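var express = require('express');
var app = express();

// Log the headers of every incoming GET request, whatever the path.
// No response is sent, so the client waits until it times out or is interrupted.
app.get(/.*/, function (req, res) {
  console.log(req.headers);
});

// Listen on the port referenced in archives.json.
app.listen(3000);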

archives.json

[
  {
    "id": "uaTest",
    "name": "User-Agent test",
    "timemap": "http://localhost:3000/",
    "timegate": "http://localhost:3000/"
  }
]

MemGator

$ memgator -a archives.json http://matkelly.com
^C
$ memgator -a archives.json -S http://matkelly.com
^C
$ memgator -a archives.json -S http://matkelly.com
^C
$ memgator -a archives.json -S http://matkelly.com
^C
$ memgator -a archives.json -S http://matkelly.com
^C
$ memgator -a archives.json -S http://matkelly.com

Echo service

$ node echoHeader.js 
{ host: 'localhost:3000',
  'user-agent': 'MemGator:1.0-rc4 <@WebSciDL>',
  'accept-encoding': 'gzip' }
{ host: 'localhost:3000',
  'user-agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0',
  'accept-encoding': 'gzip' }
{ host: 'localhost:3000',
  'user-agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0',
  'accept-encoding': 'gzip' }
{ host: 'localhost:3000',
  'user-agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0',
  'accept-encoding': 'gzip' }
{ host: 'localhost:3000',
  'user-agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0',
  'accept-encoding': 'gzip' }
{ host: 'localhost:3000',
  'user-agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0',
  'accept-encoding': 'gzip' }

machawk1 commented 8 years ago

1c41460a6b06f83c418519e5871f7ea4baf66f48 fixed the aforementioned problem. 👍