danny0838 / firefox-scrapbook

ScrapBook X – a legacy Firefox add-on that captures web pages to local device for future retrieval, organization, annotation, and edit.
Mozilla Public License 2.0

Spider #288

Closed: Davidgithub1 closed this 4 years ago

Davidgithub1 commented 4 years ago

What's a good spider tool? None of the tools I've tried could download the websites that ScrapBook X was able to.

danny0838 commented 4 years ago

How would you define "a good spider tool"?

For the general case, HTTrack seems to be quite a good one.

Davidgithub1 commented 4 years ago

I've used HTTrack, but it fails to download many websites that ScrapBook X can.

danny0838 commented 4 years ago

Which websites? Please be more specific.

Davidgithub1 commented 4 years ago
  1. HTTrack is not easy to use. ScrapBook X is easy to use and takes 1 second to start downloading a website.
  2. HTTrack has trouble preserving the CSS and look of many websites I've tried to download. ScrapBook X has always been able to preserve the look / style of every website.

That's why I use ScrapBook X!

danny0838 commented 4 years ago

SBX does not support preserving the original file structure, which is what some users care about most. On the other hand, HTTrack does not support advanced DOM rewriting, such as not saving scripts, videos, etc. It really depends on your use case, so just choose the tool that works best for you.
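
To illustrate what "DOM rewriting" means here, below is a minimal sketch that drops `<script>` and `<video>` elements from a page before saving it. It uses only Python's standard `html.parser` and is not how ScrapBook X or HTTrack are implemented; the names `DROPPED_TAGS`, `StripElements`, and `strip_elements` are made up for this example.

```python
from html.parser import HTMLParser

# Hypothetical set of elements to drop; not taken from ScrapBook X.
DROPPED_TAGS = {"script", "video"}


class StripElements(HTMLParser):
    """Re-emit HTML while skipping dropped elements and their contents.

    Comments and the doctype are not re-emitted in this simplified sketch.
    """

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a skipped region.
        if tag not in DROPPED_TAGS and self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in DROPPED_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&#{name};")


def strip_elements(html):
    """Return html with every element listed in DROPPED_TAGS removed."""
    parser = StripElements()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    page = '<p>Hi</p><script>alert(1)</script><video><source src="a.mp4"></video>'
    print(strip_elements(page))
```

A real capture tool would work on the live DOM in the browser rather than on the raw markup, which is part of why the saved page can keep its look while leaving scripts and media out.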

Davidgithub1 commented 4 years ago

ScrapBook X works best for me. But I'm afraid websites will stop supporting old versions of Firefox. Can you create a version that uses Headless Chrome to save an entire website (follow links)?

danny0838 commented 4 years ago

Handling headless Chrome is too much work; we are unlikely to do that.

danny0838 commented 4 years ago

ScrapBook X already has this feature. As for this feature in WebScrapBook, we'll track it in its source repo.