ikreymer / webarchiveplayer

NOTE: This project is no longer being actively developed. Check out Webrecorder Player for the latest player (https://github.com/webrecorder/webrecorderplayer-electron). Legacy: desktop application for browsing web archives (WARC and ARC).
GNU General Public License v3.0

coursera warc files not really showing content #36

Closed: nurp closed this issue 6 years ago

nurp commented 6 years ago
  1. I downloaded and extracted the last file in this list: https://gist.github.com/marai2/b548ce70b6af4789522c6ef5e54c6bbf The file is: https://archive.org/download/archiveteam_coursera_20160709070402/coursera_20160709070402.megawarc.warc.gz

  2. Downloaded and ran Web Archive Player 1.4.7 (pywb 0.33.1). It reported: Archive Player Server running at: http://localhost:8090/

  3. Opened the WARC file with Web Archive Player. Chrome opened a page listing 11548 links.

  4. I clicked on tissue101 whose link is: http://localhost:8090/20160629170142/https://www.coursera.org/course/tissue101

  5. It shows an empty Coursera page, or it loads forever. The same happens with other course links.

I am using macOS 10.13.4 High Sierra. What can I do to use these WARC files? Is there a way to see the content as folders? I want to see if there are any video files at least. Thanks

ikreymer commented 6 years ago

Sorry, this project is no longer being maintained, but you have two options:

1) Run the latest Webrecorder Player from: https://github.com/webrecorder/webrecorderplayer-electron/releases

Webrecorder Player is a desktop app that makes it simpler to open your own WARCs, and has replaced Web Archive Player.

2) You can run your own pywb instance using the latest version of pywb: http://pywb.readthedocs.io/en/latest/ This will allow you to set up your own server, similar to above, but with the latest version.

If you are still seeing the issue, please open an issue on the pywb GitHub.
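
On the question above about checking whether the WARC contains any video files: a rough way to find out without replaying anything is to scan the record headers directly. The sketch below is only an illustration, not part of Web Archive Player or pywb. It uses the Python standard library only, and the function names (`iter_warc_responses`, `find_videos`) are hypothetical. It assumes a gzip-compressed WARC (not ARC) whose response records carry a plain HTTP header block, and the size it reports is the raw body length as stored, without undoing any chunked or compressed transfer encoding. A dedicated WARC library such as warcio would be more robust for real use.

```python
import gzip
import sys


def iter_warc_responses(stream):
    """Yield (target_uri, content_type, body_length) for each WARC 'response' record.

    `stream` is a binary file object over the *decompressed* WARC data.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.startswith(b"WARC/"):
            continue  # blank separator lines between records

        # Read the WARC header block, which ends at the first blank line.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.partition(b":")
            headers[name.strip().lower()] = value.strip()

        # Content-Length gives the exact size of the record block that follows.
        length = int(headers.get(b"content-length", b"0"))
        block = stream.read(length)
        if headers.get(b"warc-type") != b"response":
            continue

        # For a response record the block is an HTTP message: headers, blank line, body.
        http_head, _, body = block.partition(b"\r\n\r\n")
        content_type = b""
        for header_line in http_head.split(b"\r\n")[1:]:  # skip the status line
            hname, _, hvalue = header_line.partition(b":")
            if hname.strip().lower() == b"content-type":
                content_type = hvalue.strip()

        uri = headers.get(b"warc-target-uri", b"").decode("utf-8", "replace")
        yield uri, content_type.decode("latin-1"), len(body)


def find_videos(stream):
    """Return the response records whose HTTP Content-Type starts with 'video/'."""
    return [rec for rec in iter_warc_responses(stream)
            if rec[1].lower().startswith("video/")]


if __name__ == "__main__":
    # Usage: python find_videos.py coursera_20160709070402.megawarc.warc.gz
    with gzip.open(sys.argv[1], "rb") as warc:
        for uri, content_type, size in find_videos(warc):
            print(size, content_type, uri)
```

An empty result would suggest the crawl captured the course pages but not the video files themselves. It would not prove it, since videos served with a generic content type would be missed by this filter.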