Open AlttiRi opened 3 years ago
With [{hostname}] {YYYY}.{MM}.{DD}—{title}:
With just {title}:
Note: these are old screenshots, so the hostname here still includes "www".
I also use this approach to name files in my userscripts that download files,
in particular in my public userscript for Twitter. Here is a detailed description of the advantages of this file naming: https://github.com/AlttiRi/twitter-click-and-save#filename-format
Well, thanks for the idea and the detailed description. However, my extension is really small, and I don't think I want to add complex logic here.
But it's only a few lines of code in background.js with simple logic:
const {hostnameTrimmed, siteTitle, YYYY, MM, DD} = getFilenameParts(tab)
const filename = `[${hostnameTrimmed}] ${YYYY}.${MM}.${DD}—${siteTitle}.mht`
let blob = await toPromise(chrome.pageCapture.saveAsMHTML, { tabId: tab.id })
download(filename, await patchSubject(blob))
function getFilenameParts(tab) {
    const hostname = new URL(tab.url).hostname
    const hostnameTrimmed = hostname.startsWith("www.") ? hostname.slice(4) : hostname
    const date = new Date()
    const YYYY = date.getFullYear()
    const MM = (date.getMonth() + 1).toString().padStart(2, "0")
    const DD = date.getDate().toString().padStart(2, "0")
    const siteTitle = sanitize(tab.title)
    return {hostname, hostnameTrimmed, siteTitle, YYYY, MM, DD}
}
Diff (13 insertions, 1 deletion):
async function save(tab) {
+     const {hostnameTrimmed, siteTitle, YYYY, MM, DD} = getFilenameParts(tab)
+     const filename = `[${hostnameTrimmed}] ${YYYY}.${MM}.${DD}—${siteTitle}.mht`
-     const filename = `${sanitize(tab.title)}.mht`
      let blob = await toPromise(chrome.pageCapture.saveAsMHTML, { tabId: tab.id })
      download(filename, await patchSubject(blob))
+
+ function getFilenameParts(tab) {
+     const hostname = new URL(tab.url).hostname
+     const hostnameTrimmed = hostname.startsWith("www.") ? hostname.slice(4) : hostname
+     const date = new Date()
+     const YYYY = date.getFullYear()
+     const MM = (date.getMonth() + 1).toString().padStart(2, "0")
+     const DD = date.getDate().toString().padStart(2, "0")
+     const siteTitle = sanitize(tab.title)
+     return {hostname, hostnameTrimmed, siteTitle, YYYY, MM, DD}
+ }
function sanitize(filename) {
      return filename.replace(/[<>:"/\\|?*\x00-\x1F~]/g, '-')
}
IMO leave it as-is, and if someone wishes to add flexibility they can fork into a new extension that has their preferred filename format. As an exercise somebody could add options like a checkbox to give the user the CHOICE to use the new format.
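One way to offer that choice without hard-coding either format would be a user-editable template string. The following is only a sketch of the idea, not code from the extension: `renderTemplate` and the placeholder names in `parts` are hypothetical, and loading the template from an options page is left out.

```javascript
// Hypothetical helper: replaces {name} placeholders with values from `parts`.
// Unknown placeholders are left untouched so that typos stay visible.
function renderTemplate(template, parts) {
    return template.replace(/\{([\w-]+)\}/g, (match, key) =>
        Object.prototype.hasOwnProperty.call(parts, key) ? String(parts[key]) : match
    )
}

// Assumed sample values; in the extension they would come from the tab and the current date.
const parts = {hostname: "javascript.info", YYYY: 2021, MM: "01", DD: "21", title: "Generators"}

// The format proposed in this issue.
const structured = renderTemplate("[{hostname}] {YYYY}.{MM}.{DD}—{title}", parts)
// The current behaviour: title only.
const titleOnly = renderTemplate("{title}", parts)

console.log(structured)
console.log(titleOnly)
```

With such a helper, a checkbox or text field would only need to switch the template string; the rest of the save logic could stay as it is.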
The extension uses the site's title as the filename. I want to suggest a much better filename template that will improve file organisation.
TL;DR
The template should be as follows:
[{hostname-without-www}] {YYYY}.{MM}.{DD}—{title}
Example results with this name pattern:
"[developer.mozilla.org] 2021.01.21—Cross-Origin-Opener-Policy - HTTP - MDN.mht"
"[en.wikipedia.org] 2021.01.21—High Efficiency Image File Format - Wikipedia.mht"
"[javascript.info] 2021.01.21—Generators.mht"
"[reddit.com] 2021.09.28—WD Blue (New 2018+) Line Models Explained - SMR - Greens (v1) - DataHoarder.mht"
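For illustration, the template can be tried outside the browser with plain values instead of a `tab` object. This is a minimal sketch: `buildFilename` is a hypothetical helper, the URL and page title passed to it are assumed inputs, and the `sanitize` regex is the one shown in the diff in this thread.

```javascript
// Same character set as the sanitize() helper from the diff in this thread.
function sanitize(name) {
    return name.replace(/[<>:"\/\\|?*\x00-\x1F~]/g, "-")
}

// Hypothetical helper: builds "[host] YYYY.MM.DD—title.mht" from plain values.
function buildFilename(url, title, date = new Date()) {
    const hostname = new URL(url).hostname
    // Drop a leading "www." so that the same site always gets the same tag.
    const host = hostname.startsWith("www.") ? hostname.slice(4) : hostname
    const YYYY = date.getFullYear()
    const MM = String(date.getMonth() + 1).padStart(2, "0")
    const DD = String(date.getDate()).padStart(2, "0")
    return `[${host}] ${YYYY}.${MM}.${DD}—${sanitize(title)}.mht`
}

// Assumed inputs: the page title contains "|", which sanitize() replaces with "-".
const mdnName = buildFilename(
    "https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cross-Origin-Opener-Policy",
    "Cross-Origin-Opener-Policy - HTTP | MDN",
    new Date(2021, 0, 21) // 21 January 2021, local time
)
console.log(mdnName)
// → [developer.mozilla.org] 2021.01.21—Cross-Origin-Opener-Policy - HTTP - MDN.mht
```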
I have already written about this for another similar extension here and here, so I will just copy and paste the text:
For better file navigation (to easily find the desired file), files should be organized. This can be achieved if, when sorting by name (alphabetically), the files are both grouped and sorted. To do this, you need the correct (special) file name.
If the file name contains only the title, the files will be ordered arbitrarily and mixed in with other (non-mhtml) files.
Files will be organized if they are grouped by hostname and sorted by date. This can be achieved if the file name consists of the following parts: first the hostname, then the date, and at the end – the title.
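The effect can be checked with a plain alphabetical sort. In this small sketch some file names are taken from the examples in this issue, while the two marked as hypothetical are made up to give each site a second date:

```javascript
// File names in arbitrary order, as they might be listed before sorting.
const files = [
    "[javascript.info] 2021.01.21—Generators.mht",
    "[en.wikipedia.org] 2021.01.21—High Efficiency Image File Format - Wikipedia.mht",
    "[javascript.info] 2020.11.05—Promise.mht",           // hypothetical
    "[developer.mozilla.org] 2021.01.21—Cross-Origin-Opener-Policy - HTTP - MDN.mht",
    "[en.wikipedia.org] 2019.07.14—MHTML - Wikipedia.mht", // hypothetical
]

// Default sort compares strings character by character, like "sort by name".
const sortedFiles = [...files].sort()

for (const name of sortedFiles) {
    console.log(name)
}
// Files of the same site end up next to each other,
// and within one site the older captures come first.
```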
To keep mhtml files from being mixed in with other files, you should use a "prefix": the same first character for every file. Preferably it should be neither a letter nor a digit, so that the files do not end up among the others and get a higher priority when sorting.
You can just add, for example, `#` first, but I find it better to tag the site name with `[]`: `[hostname]`.
Next is the date. The only correct format is `yyyy` first, then `mm`, then `dd`. (The other orders may look misleading: "`10.12`: is it `mm.dd` or `dd.mm`?", or they are not suited to alphabetical sorting.) There are several options: `2021-01-22` or `2021.01.22`. The line with dots will be shorter, you can select the entire date with a double click, and IMHO it looks nicer.
And then the title. You can separate it with just a space ` `, or you can use "—" `—` (Alt + 0151), or dashes `-`.
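The point about sorting can be verified with a few sample dates. The sketch below (the dates are arbitrary, and `ymd` and `dmy` are hypothetical helper names) formats the same dates in both orders and compares the alphabetical order with the chronological one:

```javascript
// Arbitrary sample dates (month index is zero-based in JavaScript).
const dates = [new Date(2021, 11, 10), new Date(2020, 0, 22), new Date(2021, 0, 5)]

const pad = n => String(n).padStart(2, "0")
const ymd = d => `${d.getFullYear()}.${pad(d.getMonth() + 1)}.${pad(d.getDate())}`
const dmy = d => `${pad(d.getDate())}.${pad(d.getMonth() + 1)}.${d.getFullYear()}`

// Reference order: sort the Date objects by their timestamps.
const chronological = [...dates].sort((a, b) => a - b)

// Alphabetical order of the formatted strings.
const byYmd = dates.map(ymd).sort()
const byDmy = dates.map(dmy).sort()

console.log(byYmd.join(", ") === chronological.map(ymd).join(", ")) // true: same as chronological
console.log(byDmy.join(", ") === chronological.map(dmy).join(", ")) // false: day-first breaks the order
```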