internetarchive / wayback

IA's public Wayback Machine (moved from SourceForge)

java.lang.IllegalArgumentException for certain websites #192

Open comschmid opened 5 years ago

comschmid commented 5 years ago

For certain websites, the Wayback CDX Server API simply returns java.lang.IllegalArgumentException: fromKey > toKey

Example URL that shows this behaviour: https://web.archive.org/cdx/search/cdx?url=www.bs.ch&output=json&limit=1

The Availability API also seems to report this website as not available: https://archive.org/wayback/available?url=www.bs.ch

On the other hand, the web interface works fine: https://web.archive.org/web/*/https://www.bs.ch

Also, other websites don't show this error: https://web.archive.org/cdx/search/cdx?url=www.google.com&output=json&limit=1

This worked fine a few months ago, but I don't know exactly when the error started occurring.

Is the Wayback CDX Server API still supported, or should I switch to another API with similar functionality?

Thank you for any advice!
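
For anyone scripting against the CDX endpoint, here is a minimal Python sketch of how a client could tell the exception text apart from a normal JSON result. It assumes the server sends the exception message as the response body (the HTTP status it uses is not known here, so both cases are handled); the names `build_cdx_url`, `check_cdx_body`, `query_cdx` and `CdxServerError` are made up for illustration and are not part of any Wayback library.

```python
import json
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


class CdxServerError(RuntimeError):
    """Hypothetical error type: the body looked like a Java exception."""


def build_cdx_url(site, limit=1):
    # Same query parameters as the example URL in this report.
    query = urlencode({"url": site, "output": "json", "limit": limit})
    return CDX_ENDPOINT + "?" + query


def check_cdx_body(body):
    # Assumption: a failing request carries the exception text in its body.
    text = body.strip()
    if "IllegalArgumentException" in text:
        raise CdxServerError(text)
    # An empty body is treated as "no captures".
    return json.loads(text) if text else []


def query_cdx(site, limit=1, timeout=30):
    # Performs the network request; read the body on HTTP errors as well,
    # since the status code used for the exception is not known.
    try:
        with urlopen(build_cdx_url(site, limit), timeout=timeout) as resp:
            raw = resp.read()
    except HTTPError as err:
        raw = err.read()
    return check_cdx_body(raw.decode("utf-8", errors="replace"))
```

Calling `query_cdx("www.bs.ch")` would then raise `CdxServerError` while the problem persists, instead of failing later in JSON parsing.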

ghsnd commented 5 years ago

Hi, just to confirm that I have the same issue with .be domains (while .nl and .fr domains, for instance, seem to work fine).

Dezzign commented 5 years ago

Yes, same issue for virtually every type of extension, such as .com.au, .co.uk, .net.au, etc.

ghsnd commented 5 years ago

I know it's not the same, but their Memento service is still up and running. For example, requesting a timemap for a given website (http://www.vrt.be):

curl 'http://web.archive.org/web/timemap/json/http://www.vrt.be'

[["urlkey","timestamp","original","mimetype","statuscode","digest","redirect","robotflags","length","offset","filename"],
["be,vrt)/","19981212033925","http://www.vrt.be:80/","text/html","200","UER5RW3PHIZG7T4KCQD2P3CPDQPHOLJE","-","-","2668","2082807","slash-913417727-c/slash-913433960.arc.gz"],
["be,vrt)/","19990125092619","http://vrt.be:80/","text/html","200","IXFQBTN3IZTEDJTJC5UPANHSHV5LRWKG","-","-","2674","12619230","slash-913417727-c/slash_19990124232053-917256353.arc.gz"],
...
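
In case it helps as a stopgap, the timemap output above can be turned into records with a few lines of Python. This is only a sketch: it assumes the first row is always the header (as in the output shown) and that timestamps are 14-digit `YYYYMMDDhhmmss` strings; `parse_timemap` and the `captured_at` key are illustrative names, not something the service defines.

```python
import json
from datetime import datetime


def parse_timemap(text):
    """Turn the JSON timemap (header row + data rows) into a list of dicts."""
    rows = json.loads(text)
    if not rows:
        return []
    header, *records = rows
    captures = []
    for record in records:
        capture = dict(zip(header, record))
        # Assumption: timestamps look like 19981212033925 (YYYYMMDDhhmmss).
        capture["captured_at"] = datetime.strptime(
            capture["timestamp"], "%Y%m%d%H%M%S"
        )
        captures.append(capture)
    return captures
```

Each capture is then a plain dict keyed by the header names, e.g. `capture["original"]` or `capture["statuscode"]`.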

comschmid commented 5 years ago

It seems like it's working again!