iipc / openwayback

The OpenWayback Development
http://www.netpreserve.org/openwayback
Apache License 2.0

Resource Not In Archive #431

Closed: hasanli-orkhan closed this issue 4 years ago

hasanli-orkhan commented 4 years ago

Hello, dear developer! I have installed this app at http://tomcat.md7.info:8888/wayback/. When I try to search for something (for example, https://www.nytimes.com/), I get this:

Resource Not In Archive The Resource you requested is not in this archive.

However here is the snapshots of this resource: Wayback Machine

How can I fix this? Thanks in advance!

ldko commented 4 years ago

Hi @azerphoenix, the link to https://web.archive.org/web/*/https://www.nytimes.com/ points to snapshots held by the Internet Archive and served by their own instance of the Wayback Machine. It does not reflect what is available in your own deployed instance of OpenWayback. What configuration did you do when you set up OpenWayback? Did you provide WARC files, or an index to WARC files, that include a capture of www.nytimes.com?

hasanli-orkhan commented 4 years ago

Hello, dear developer! I have just installed OpenWayback following https://github.com/iipc/openwayback/wiki/How-to-install and edited these lines:

wayback.url.host.default=localhost
wayback.url.port.default=8080

Do I need to generate CDX files?

ldko commented 4 years ago

Generally, OpenWayback is used to provide access to WARC files you already have. With the default configuration, you can create the directories ${wayback.basedir}/files1/ and ${wayback.basedir}/files2/ on your system and place WARC files in them. They will be indexed automatically, and the URLs captured in your WARCs should then be findable in your instance of OpenWayback. Alternatively, you could modify the configuration to use one or more CDX files that you generate from your WARC files, together with a path-index.txt file that indicates where those WARC files are to be found.
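
To make the two options above concrete, here is a minimal Python sketch, not an official OpenWayback tool. It creates the files1/ and files2/ directories under a base directory and builds candidate path-index.txt lines for any archive files found there. The helper names (`ensure_watch_dirs`, `build_path_index`), the list of file suffixes, and the one-entry-per-line "file name, tab, absolute path" layout are assumptions for illustration; check the layout against your OpenWayback version before relying on it.

```python
from pathlib import Path

# Assumed set of archive file suffixes; adjust to match your own files.
WARC_SUFFIXES = (".warc", ".warc.gz", ".arc", ".arc.gz")


def ensure_watch_dirs(basedir):
    """Create files1/ and files2/ under basedir and return both paths."""
    dirs = [Path(basedir) / name for name in ("files1", "files2")]
    for d in dirs:
        d.mkdir(parents=True, exist_ok=True)
    return dirs


def build_path_index(warc_dirs):
    """Return sorted 'name<TAB>absolute path' lines for archive files.

    The tab-separated layout is an assumption about path-index.txt.
    """
    lines = []
    for d in warc_dirs:
        for p in Path(d).iterdir():
            if p.is_file() and p.name.endswith(WARC_SUFFIXES):
                lines.append(f"{p.name}\t{p.resolve()}")
    return sorted(lines)


if __name__ == "__main__":
    # Hypothetical location; substitute the value of wayback.basedir.
    basedir = Path("/tmp/openwayback")
    dirs = ensure_watch_dirs(basedir)
    index_lines = build_path_index(dirs)
    (basedir / "path-index.txt").write_text(
        "".join(line + "\n" for line in index_lines)
    )
    print(f"{len(index_lines)} archive file(s) listed")
```

With the default configuration only the directory step matters, since files placed in files1/ and files2/ are indexed automatically. The path-index.txt output is relevant only if you switch to the CDX-based setup.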

hasanli-orkhan commented 4 years ago

Oh, thank you so much. Have a nice day!