leonkt / zotero-memento

Zotero extension that combats link rot by archiving webpages and journal articles.
MIT License
277 stars 15 forks

Batch Archives #17

Open BradKML opened 2 years ago

BradKML commented 2 years ago

This is especially necessary when many webpages need to be archived at once.

BradKML commented 2 years ago

Problem A: ArchiveToday works on an honor system and expects a low number of calls per minute. https://github.com/leonkt/zotero-memento/blob/master/chrome/content/scripts/ArchivePusher.js#L32

Problem B: The selected items will have to be iterated through. https://github.com/leonkt/zotero-memento/blob/master/chrome/content/scripts/ArchivePusher.js#L30

Problem C: There should be a function that skips archiving when an item has already been archived.
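
For Problem C, a check along these lines might work. This is only a minimal sketch: it assumes previously archived items carry a marker tag, and the tag name `archived`, the helper `filterUnarchived`, and the stub factory `makeStubItem` are all hypothetical names made up for illustration. The only item methods it relies on are `getField` and `hasTag`, which Zotero items expose.

```javascript
// Assumption: an item that was archived earlier carries a marker tag.
// The tag name "archived" is hypothetical; use whatever marker the extension writes.
const ARCHIVED_TAG = "archived";

// Returns only the items that have a URL and do not carry the marker tag.
// Each item is expected to expose getField(name) and hasTag(name).
function filterUnarchived(items, tag = ARCHIVED_TAG) {
  return items.filter((item) => {
    const url = item.getField("url");
    return Boolean(url) && !item.hasTag(tag);
  });
}

// Stand-in objects so the sketch can run outside Zotero.
function makeStubItem(url, tags = []) {
  return {
    getField: (name) => (name === "url" ? url : ""),
    hasTag: (name) => tags.includes(name),
  };
}

const pending = filterUnarchived([
  makeStubItem("https://example.org/a"),               // kept
  makeStubItem("https://example.org/b", ["archived"]), // skipped: already tagged
  makeStubItem(""),                                    // skipped: no URL
]);
console.log(pending.map((item) => item.getField("url")));
```

Inside `sendReq`, the selected items could be passed through such a filter before the loop starts, so that pre-archived items never reach the archive sites.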

BradKML commented 2 years ago

Example idea; it still lacks a setTimeout-based delay between requests:

    sendReq : function() {
        // Takes the selected items and attempts to archive each of them.
        var item_set = Zotero.getActiveZoteroPane().getSelectedItems();
        // We attempt to archive it in all these sites; there are often a few that aren't successful.
        var archive_sites = ['archive.li', 'archive.vn','archive.fo', 'archive.md', 'archive.ph',
                             'archive.today','archive.is'];
        for (let item of item_set) {
            for (let site of archive_sites) {
                // This step is to retrieve the submitID.
                var submitIdReq = Zotero.IaPusher.createCORSRequest("GET", 
                                  "https://cors-anywhere.herokuapp.com/https://" + site + "/", false);
                Zotero.IaPusher.setRequestProperties(submitIdReq);
                submitIdReq.send();
                var subId = this.extractSubmitId(submitIdReq.responseText);
                // Push to the archive; takes a few minutes for changes to be reflected on the site.
                var hostUrl = "https://" + site + "/submit/";
                var req = Zotero.IaPusher.createCORSRequest("POST", hostUrl, false);
                Zotero.IaPusher.setRequestProperties(req);
                // ESSENTIAL: we submit data with POST, 
                // (1) url field must be a URL to the page to be archived.
                // (2) submitid field must be the submitID retrieved from earlier on.
                // (3) anyway is optional.
                var params = {
                    "url" : item.getField("url"),
                    "submitid" : subId,
                    "anyway" : 1
                };

                req.send(JSON.stringify(params));

                setTimeout(() => { alert(req.responseText); }, 1000);
            }
        }
    }
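
The missing delay for Problems A and B could be added roughly as follows. This is a sketch under assumptions, not the extension's actual code: `scheduleArchives` and the `archiveOne` callback are hypothetical names, and the 15-second default spacing is a placeholder rather than a documented ArchiveToday limit. Each URL gets its own `setTimeout`, offset by its position in the batch, so the requests go out one at a time instead of all at once.

```javascript
// Queue one archive call per URL, spaced delayMs apart, using setTimeout.
// archiveOne is a hypothetical callback that submits a single URL.
// timer defaults to setTimeout and can be swapped for a stub when testing.
function scheduleArchives(urls, archiveOne, delayMs = 15000, timer = setTimeout) {
  return urls.map((url, index) => {
    const startAfterMs = index * delayMs; // first URL starts immediately
    timer(() => archiveOne(url), startAfterMs);
    return { url, startAfterMs };
  });
}

// Usage with a stand-in callback and a short delay.
const plan = scheduleArchives(
  ["https://example.org/a", "https://example.org/b", "https://example.org/c"],
  (url) => console.log("archiving", url),
  50
);
console.log(plan);
```

In the extension, `archiveOne` would wrap the per-item body of the loop above, and the URL list would come from the selected items. Returning the plan makes it easy to show the user how long the whole batch is expected to take.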