gildas-lormeau / single-file-cli

CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
GNU Affero General Public License v3.0

No option to disable saved date header #27

Closed Yakabuff closed 1 year ago

Yakabuff commented 1 year ago

https://github.com/gildas-lormeau/SingleFile/issues/1058

Is there a CLI flag to disable the saved date header in the .html file, like the one implemented in the issue above? It causes snapshots to have different hashes despite having exactly the same content.

diff f0c76867d8e8e2e998e84f1d21af6fee62004f79dcc810f58e7a4a466061c145.html d8a1d1b260a32f2a4e0e0cdf0c5a73e77b944c11a9d6868bcaa6494fc7ce5a10.html
4c4
<  saved date: Thu Apr 06 2023 20:55:06 GMT-0400 (Eastern Daylight Time)
---
>  saved date: Thu Apr 06 2023 22:57:13 GMT-0400 (Eastern Daylight Time)

Yakabuff commented 1 year ago

Did a bit of investigating:

In single-file-cli-api.js:

async function capturePage(options) {
    try {
        let filename;
        const pageData = await backend.getPageData(options);
        if (options.includeInfobar) {
            await includeInfobarScript(pageData);
        }

In back-end/puppeteer.js:

exports.getPageData = async (options, page) => {
return await getPageData(context || browser, page, options);
async function getPageData(context, page, options) {
...
        return await page.evaluate(async options => {
            return await singlefile.getPageData(options);
        }, options);

In singlefile.js in single-file-core:

async function getPageData(options = {}, initOptions, doc = globalThis.document, win = globalThis) {
...
const processor = new SingleFile(options);
return await processor.getPageData();

In Processor#getPageData in single-file-core/single-file-core.js:

async getPageData() {
            const commentNode = this.doc.createComment("\n " + (this.options.useLegacyCommentHeader ? util.COMMENT_HEADER_LEGACY : util.COMMENT_HEADER) +
                " \n url: " + infobarURL +
                (this.options.removeSavedDate ? " " : " \n saved date: " + infobarSaveDate) +
                (infobarContent ? " \n info: " + infobarContent : "") + "\n");

At a cursory glance, it looks as though we can just add a removeSavedDate flag in args.js, with a default value in const DEFAULT_OPTIONS, without changing any internal logic, since the options object is simply passed through each of these calls. Is that right?
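
To illustrate why the flag alone should be enough, here is a minimal standalone sketch (not the actual single-file-core code) that reproduces the comment-header concatenation quoted above and hashes the result. COMMENT_HEADER, buildCommentHeader, sha256 and the example URL are placeholders assumed for this illustration; only the removeSavedDate branch is taken from the quoted snippet.

```javascript
const crypto = require("crypto");

// Placeholder for util.COMMENT_HEADER (assumed value, for illustration only).
const COMMENT_HEADER = "Page saved with SingleFile";

// Hypothetical helper mirroring the string concatenation quoted from
// Processor#getPageData; it is not part of single-file-core.
function buildCommentHeader(options, infobarURL, infobarSaveDate, infobarContent) {
    return "\n " + COMMENT_HEADER +
        " \n url: " + infobarURL +
        (options.removeSavedDate ? " " : " \n saved date: " + infobarSaveDate) +
        (infobarContent ? " \n info: " + infobarContent : "") + "\n";
}

// SHA-256 hex digest of a string.
function sha256(text) {
    return crypto.createHash("sha256").update(text).digest("hex");
}

const url = "https://example.com/"; // assumed example URL
const firstDate = "Thu Apr 06 2023 20:55:06 GMT-0400 (Eastern Daylight Time)";
const secondDate = "Thu Apr 06 2023 22:57:13 GMT-0400 (Eastern Daylight Time)";

// Two captures of the same page, with the saved date included.
const withDateA = buildCommentHeader({ removeSavedDate: false }, url, firstDate, "");
const withDateB = buildCommentHeader({ removeSavedDate: false }, url, secondDate, "");

// The same two captures with removeSavedDate set.
const noDateA = buildCommentHeader({ removeSavedDate: true }, url, firstDate, "");
const noDateB = buildCommentHeader({ removeSavedDate: true }, url, secondDate, "");

console.log(sha256(withDateA) === sha256(withDateB)); // false: the dates differ
console.log(sha256(noDateA) === sha256(noDateB));     // true: the date is omitted
```

In this sketch the option only has to reach the code that builds the comment, which matches the observation that the options object is passed along unchanged from the CLI to the processor.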

Yakabuff commented 1 year ago

@gildas-lormeau That seems to work. Adding a removeSavedDate option field in args.js removes the saved date: line from the HTML. Should I make a PR?