belak / go-seabird

A simple IRC bot framework
MIT License

Archive plugin for news sites #182

Open seabird-bot opened 7 years ago

seabird-bot commented 7 years ago

Filed by djt in #main

djt-code commented 7 years ago

Multiple design options for this that I can think of:

  1. Allow users to type !archive to archive the last link in the channel. Allow users to type !archive <link> to archive a specific link.
  2. Have a list of websites to auto archive for, such as news sites, that would be configurable in the config file.
  3. Combination of above functionality.

djt-code commented 7 years ago

Implementation details

Archive.fo has a POST endpoint at https://archive.fo/submit/

It takes the following two parameters, sent with Content-Type application/x-www-form-urlencoded:

Possible response codes: 200, 307

We should send the archive URL to IRC in the query or channel the request came from. Archive links look like shortened URLs, for example: https://archive.fo/3AKV6

200 Response example for newly archived pages

Headers:

    Cache-Control: private, no-cache, no-store, must-revalidate, maxage=0
    Content-Encoding: gzip
    Content-Length: 243
    Content-Type: text/html;charset=utf-8
    Date: Sat, 18 Feb 2017 09:51:10 GMT
    Expires: Sat, 01 Jan 2000 00:00:00 GMT
    Pragma: no-cache
    Server: cloudflare-nginx
    X-Firefox-Spdy: h2
    refresh: 0;url=https://archive.fo/FuBOT

Body:

The body contains a JavaScript window.location redirect to the URL given in the refresh response header. No body example is given here, since the headers already carry everything needed to implement this.
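
Pulling the archive link out of the refresh header could look roughly like this. It is only a sketch: `parseRefresh` is a made-up helper name, and it assumes the header always has the `<delay>;url=<target>` shape seen in the example above. It uses `strings.Cut`, so it needs Go 1.18 or newer.

```go
package main

import (
	"errors"
	"strings"
)

// parseRefresh extracts the target URL from a refresh header value such as
// "0;url=https://archive.fo/FuBOT". The delay before the ";" is ignored.
func parseRefresh(value string) (string, error) {
	_, rest, found := strings.Cut(value, ";")
	if !found {
		return "", errors.New("refresh header has no target")
	}
	rest = strings.TrimSpace(rest)

	// Match the "url=" prefix case-insensitively.
	const prefix = "url="
	if len(rest) < len(prefix) || !strings.EqualFold(rest[:len(prefix)], prefix) {
		return "", errors.New("refresh header target does not start with url=")
	}

	target := strings.TrimSpace(rest[len(prefix):])
	if target == "" {
		return "", errors.New("refresh header target is empty")
	}
	return target, nil
}
```

In the bot this would be called with `resp.Header.Get("Refresh")`; Go canonicalizes header names on incoming responses, so the lowercase `refresh` sent by the server is still found.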

307 Response example for previously archived pages

Headers:

    Cache-Control: private, no-cache, no-store, must-revalidate, maxage=0
    Content-Length: 0
    Date: Sat, 18 Feb 2017 10:07:28 GMT
    Expires: Sat, 01 Jan 2000 00:00:00 GMT
    Location: https://archive.fo/3AKV6
    Pragma: no-cache
    Server: cloudflare-nginx
    X-Firefox-Spdy: h2
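
One thing to watch for with the 307: Go's `http.Client` follows redirects by default, so the Location header would never be seen unless redirects are turned off. Below is a minimal sketch of that, not a finished plugin. The names `result`, `existingArchive`, `newClient` and `submit` are invented for illustration, the 30 second timeout is arbitrary, and the form field name `url` is an assumption that should be checked against the real submit form before relying on it.

```go
package main

import (
	"net/http"
	"net/url"
	"time"
)

// submitEndpoint is the POST endpoint named in this issue.
const submitEndpoint = "https://archive.fo/submit/"

// result holds the parts of the submit response that matter to the bot.
type result struct {
	Status   int
	Location string // set on 307 (page was archived before)
	Refresh  string // set on 200 (page was newly archived)
}

// existingArchive returns the archive link when the response was a 307
// carrying a Location header, i.e. the page had been archived before.
func (r result) existingArchive() (string, bool) {
	if r.Status == http.StatusTemporaryRedirect && r.Location != "" {
		return r.Location, true
	}
	return "", false
}

// newClient returns a client that does not follow redirects, so the 307
// response itself is handed back to the caller.
func newClient() *http.Client {
	return &http.Client{
		Timeout: 30 * time.Second, // arbitrary choice for this sketch
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// submit posts pageURL to endpoint as application/x-www-form-urlencoded
// and records the status code plus the two headers of interest.
func submit(client *http.Client, endpoint, pageURL string) (result, error) {
	// ASSUMPTION: "url" is a guessed field name, not confirmed by this issue.
	form := url.Values{"url": {pageURL}}

	resp, err := client.PostForm(endpoint, form)
	if err != nil {
		return result{}, err
	}
	defer resp.Body.Close()

	return result{
		Status:   resp.StatusCode,
		Location: resp.Header.Get("Location"),
		Refresh:  resp.Header.Get("Refresh"),
	}, nil
}
```

A caller would run `submit(newClient(), submitEndpoint, link)`, try `existingArchive()` first, and fall back to the refresh header when the status is 200. Any other status is probably best reported to the channel as a failure.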