ArchiveTeam / grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

--finished-warc-dir= not working for me #161

Closed BradCoffield closed 4 years ago

BradCoffield commented 4 years ago

I'm running grab-site from within a subfolder on my system. If I run it normally, it creates a new folder inside the subfolder I'm working in. When I create a folder in advance and then pass it to --finished-warc-dir=, I get an error saying that what I've specified isn't a directory.

Perhaps it's a syntax issue on my end, but I think I've tried every possible variation of --finished-warc-dir= (putting the directory name in quotes, not in quotes, adding forward slashes, etc.).

ivan commented 4 years ago

--finished-warc-dir= should be an absolute path to a directory that already exists. README needs an update to mention this, sorry. If an absolute path still does not work, can you provide your grab-site arguments and error output?

Using --finished-warc-dir= won't stop grab-site from creating a new directory for the crawl itself, though; it's just for finished .warc.gz and .cdx files.
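
For anyone hitting the same error, here is a minimal sketch of the check described above: turn whatever path you have into an absolute one and confirm the directory already exists before passing it to grab-site. The helper name `finished_warc_dir_arg` and the `finished-warcs` folder name are made up for illustration and are not part of grab-site; only the `--finished-warc-dir=` option itself comes from this thread.

```python
import os
import tempfile


def finished_warc_dir_arg(path):
    """Build a --finished-warc-dir= argument from a path.

    Hypothetical helper, not part of grab-site. It makes the path
    absolute and refuses to continue if the directory does not
    already exist.
    """
    absolute = os.path.abspath(os.path.expanduser(path))
    if not os.path.isdir(absolute):
        raise NotADirectoryError(
            f"{absolute} is not an existing directory; create it first"
        )
    return f"--finished-warc-dir={absolute}"


if __name__ == "__main__":
    # Demo with a throwaway directory so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as parent:
        target = os.path.join(parent, "finished-warcs")  # assumed name
        os.mkdir(target)
        print(finished_warc_dir_arg(target))
```

The printed argument can then be placed on the grab-site command line ahead of the URL you want to crawl.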

BradCoffield commented 4 years ago

That did it! Thank you :)