harvard-lil / js-wacz

JavaScript module and CLI tool for working with web archive data using the WACZ format specification.
MIT License

Add logDirectory option to copy logs to WACZ #119

Closed (tw4l closed 2 months ago)

tw4l commented 3 months ago

Fixes #118

Thanks for taking a look, this is one of the last things needed for Browsertrix Crawler!

tw4l commented 3 months ago

> * Is there a common pattern for these log files (extension, content, etc.)? I would be interested in making that feature less permissive, so that we don't have an "add whatever to WACZ" flag 😅.

The log files from Browsertrix Crawler are JSON-L files with `.log` as the extension - happy to add extra validation there. Perhaps we can limit it to `.log` or `.txt` extensions for now? I think if there was an issue with the correctness of the log files (not that I've seen that happen, but you never know) it'd still be better to include a file with a formatting issue rather than no logs.

> * Would you mind adding docs for this new option to `types.js` so the IDE / JSDoc knows about this new option?

Happy to!
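
For illustration, documenting such an option with JSDoc could look roughly like the sketch below. The typedef name `ExampleWACZOptions` and the `output` property are hypothetical placeholders, and `logDirectory` is taken from this PR's title; none of this is the actual content of `types.js`.

```javascript
/**
 * Hypothetical excerpt of an options typedef (all names are illustrative,
 * not copied from js-wacz's types.js).
 *
 * @typedef {Object} ExampleWACZOptions
 * @property {string} output - Path to the .wacz file to write (assumed name).
 * @property {?string} [logDirectory] - Optional path to a directory of
 *   crawler log files to copy into the WACZ.
 */

/** @type {ExampleWACZOptions} */
const exampleOptions = {
  output: 'archive.wacz',
  logDirectory: './crawl-logs'
};
```

With a typedef like this in place, an editor that understands JSDoc can offer completion and a description for the option wherever the options object is typed.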

matteocargnelutti commented 3 months ago

@tw4l

> The log files from Browsertrix Crawler are JSON-L files with `.log` as the extension - happy to add extra validation there. Perhaps we can limit it to `.log` or `.txt` extensions for now? I think if there was an issue with the correctness of the log files (not that I've seen that happen, but you never know) it'd still be better to include a file with a formatting issue rather than no logs.

That makes sense. Let's start by only accepting .log and .txt files. Thank you :)
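
A minimal sketch of the extension check discussed here, assuming hypothetical helpers `isAllowedLogFile` and `filterLogFiles` that operate on base file names. This is not the code merged in this PR, and the case-insensitive comparison is an assumption.

```javascript
// Extensions agreed on in this thread.
const ALLOWED_LOG_EXTENSIONS = ['.log', '.txt'];

/**
 * Returns true if a base file name (no directory part) ends in an allowed
 * extension. Hypothetical helper; case-insensitivity is an assumption.
 * @param {string} filename
 * @returns {boolean}
 */
function isAllowedLogFile(filename) {
  const dot = filename.lastIndexOf('.');
  // No dot at all, or a dotfile such as ".log" with no name before the dot.
  if (dot <= 0) return false;
  const ext = filename.slice(dot).toLowerCase();
  return ALLOWED_LOG_EXTENSIONS.includes(ext);
}

/**
 * Keeps only the file names that may be copied into the WACZ.
 * File contents are not inspected, so a malformed log is still included.
 * @param {string[]} filenames
 * @returns {string[]}
 */
function filterLogFiles(filenames) {
  return filenames.filter(isAllowedLogFile);
}
```

Checking only the extension matches the preference stated above: a log with a formatting issue is still included rather than dropped.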

tw4l commented 3 months ago

Rebased and updated, thanks!

matteocargnelutti commented 3 months ago

Hey there @tw4l! This is great, thanks! I pushed a commit to suggest an updated naming convention for all options that expect / accept a path to dir - what do you think?

Cheers,

tw4l commented 3 months ago

> Hey there @tw4l! This is great, thanks! I pushed a commit to suggest an updated naming convention for all options that expect / accept a path to dir - what do you think?

Hi @matteocargnelutti, makes sense to me! The consistency is nice :) Thanks for this!