oduwsdl / MemGator

A Memento Aggregator CLI and Server in Go
https://memgator.cs.odu.edu/api.html
MIT License

Some mementos incorrectly parsed by MemGator #79

Closed: machawk1 closed this issue 8 years ago

machawk1 commented 8 years ago

Querying a locally deployed MemGator service (latest from master) for google.com mementos produces two malformed URI-Ms at 20071213220957 and 20071223171907, with the "URIs" being "1&open=" and "s=y", respectively. That is, one URI starts with "1" and the other with "s"; neither starts with "http". These mementos come from the Internet Archive, specifically:

http://web.archive.org/web/20071213220957/http://www.google.com/#garage=&showroom=new=0,1&open=

and

http://web.archive.org/web/20071223171907/http://www.google.com/#h=1063,k=active,s=y

at the above datetimes, as reported when obtaining the TimeMap (TM) directly from IA. MemGator seems to get confused while parsing and lists everything after the last comma in each URI as its own memento. I first observed this on the cdxj endpoint.
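
A minimal sketch of how a Link-format TimeMap could be split without tripping over commas inside a URI. This is not MemGator's actual parser, whose code is not shown in this issue; `splitLinks` and `linkURI` are hypothetical helper names, and the approach (treating commas inside `<...>` or inside double quotes as data rather than separators) is one possible fix under that assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLinks (hypothetical helper) splits a Link-format TimeMap body into
// individual link entries. Commas inside <...> (the URI) or inside "..."
// (for example the datetime value) are kept as data, not used as separators.
func splitLinks(body string) []string {
	var links []string
	var cur strings.Builder
	inURI, inQuote := false, false
	for _, r := range body {
		switch {
		case r == '<' && !inQuote:
			inURI = true
		case r == '>' && !inQuote:
			inURI = false
		case r == '"' && !inURI:
			inQuote = !inQuote
		}
		if r == ',' && !inURI && !inQuote {
			// A top-level comma ends the current link entry.
			if s := strings.TrimSpace(cur.String()); s != "" {
				links = append(links, s)
			}
			cur.Reset()
			continue
		}
		cur.WriteRune(r)
	}
	if s := strings.TrimSpace(cur.String()); s != "" {
		links = append(links, s)
	}
	return links
}

// linkURI (hypothetical helper) returns the text between the first '<'
// and the first '>' of a link entry, or "" if no such pair exists.
func linkURI(link string) string {
	start := strings.Index(link, "<")
	end := strings.Index(link, ">")
	if start < 0 || end < start {
		return ""
	}
	return link[start+1 : end]
}

func main() {
	body := `<http://web.archive.org/web/20071223171907/http://www.google.com/#h=1063,k=active,s=y>; rel="memento"; datetime="Sun, 23 Dec 2007 17:19:07 GMT",
<http://web.archive.org/web/20071225064615/http://www.google.com/>; rel="memento"; datetime="Tue, 25 Dec 2007 06:46:15 GMT",`
	for _, l := range splitLinks(body) {
		fmt.Println(linkURI(l))
	}
}
```

Run as is, this prints the two URI-Ms in full, including the `#h=1063,k=active,s=y` fragment, instead of yielding a separate "s=y" entry.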

/cc @phonedude We discussed this weirdness last Friday, which we can now attribute to MemGator rather than to an archive's ill-formed TimeMap.

machawk1 commented 8 years ago

```
<http://web.archive.org/web/20071213021252/http://www.google.com/>; rel="memento"; datetime="Thu, 13 Dec 2007 02:12:52 GMT",
<http://web.archive.org/web/20071213220957/http://www.google.com/#garage=&showroom=new=0,1&open=>; rel="memento"; datetime="Thu, 13 Dec 2007 22:09:57 GMT",
<http://web.archive.org/web/20071214014708/http://www.google.com/#forma>; rel="memento"; datetime="Fri, 14 Dec 2007 01:47:08 GMT",
```

and

```
<http://web.archive.org/web/20071221034642/http://www.google.com/#quality>; rel="memento"; datetime="Fri, 21 Dec 2007 03:46:42 GMT",
<http://web.archive.org/web/20071223171907/http://www.google.com/#h=1063,k=active,s=y>; rel="memento"; datetime="Sun, 23 Dec 2007 17:19:07 GMT",
<http://web.archive.org/web/20071225064615/http://www.google.com/>; rel="memento"; datetime="Tue, 25 Dec 2007 06:46:15 GMT",
```