Coloradohusky closed 3 years ago
For playback to work, Twitter should be grabbed like this to get the old non-React layout while it still exists:
# Get the old site instead of the React site
twitter_ua="NOT Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
grab-site --ua "$twitter_ua" --1 [... -i URLS_FILE or some starting URLs ...]
No --delay should be needed either.
Hope that helps, let me know.
Works, thanks!
Saved a list of Twitter pages (e.g. https://twitter.com/foofighters/status/1329662049639571457) with grab-site:
grab-site --wpull-args "--monitor-disk" --wpull-args "--limit-rate 100000" --no-dupespotter sites.txt --delay 250-375 --1
Downloaded them all just fine... or so I thought. Viewing the pages with something like replayweb.page shows "Something went wrong, but don't fret - it's not your fault."
Did the page actually download and replayweb.page just can't show it, or was it never captured in the first place?
WARC below:
twitter.com-foofighters-status-1329662049639571457-2020-11-20-f3367dc6-00000.warc.gz
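One rough way to tell "not captured" apart from "captured but not replayable" is to look at the WARC directly instead of going through a replay tool. The sketch below assumes a gzipped WARC and a grep that supports -a (GNU and BSD grep both do). The helper names warc_has_url and warc_list_urls are made up for illustration and are not part of grab-site.

```shell
# Hypothetical helpers (not part of grab-site): inspect the
# WARC-Target-URI headers of a gzipped WARC file.

# Succeeds if any record header mentions the given URL.
# Note: this is a substring match, so a longer URL that contains
# the given one will also count as a match.
warc_has_url() {
    warc_file=$1
    url=$2
    # gzip -dc decompresses every member of a multi-member .warc.gz;
    # grep -a makes grep treat the binary payload as text.
    gzip -dc -- "$warc_file" \
        | grep -a -i '^WARC-Target-URI:' \
        | grep -F -- "$url" >/dev/null
}

# Prints the distinct WARC-Target-URI header lines in the file.
warc_list_urls() {
    gzip -dc -- "$1" \
        | grep -a -i '^WARC-Target-URI:' \
        | tr -d '\r' \
        | sort -u
}
```

For example, warc_has_url twitter.com-foofighters-status-1329662049639571457-2020-11-20-f3367dc6-00000.warc.gz https://twitter.com/foofighters/status/1329662049639571457 && echo "record present". A match only shows that some record for that URL exists in the file. It does not say whether the response was the full page or whether a replay tool can render it.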