webrecorder / specs

Specifications developed and maintained by the Webrecorder community.
https://specs.webrecorder.net

namaste tags #6

Open atomotic opened 4 years ago

atomotic commented 4 years ago

Besides the general metadata that could be contained in webarchive.yaml, I would suggest considering Namaste tags to expose some metadata directly via filenames, so that WACZ bundles could be easily explored with a file browser or a shell.

Taking OCFL and Namaste Go as examples, a WACZ directory could contain:

0=wacz_1.0
2=My_awesome_archive
3=20200601
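
To make the idea concrete, here is a minimal Python sketch of how a tool might read such tags back out of a directory listing. The function name `parse_namaste_tags` and the `NAMASTE_LABELS` table are hypothetical, and the labels assume the usual Namaste numbering (0 = type, 1 = who, 2 = what, 3 = when, 4 = where); nothing here is part of any WACZ tooling.

```python
import re

# Assumed Namaste numbering: 0=type, 1=who, 2=what, 3=when, 4=where.
NAMASTE_LABELS = {0: "type", 1: "who", 2: "what", 3: "when", 4: "where"}

# A tag filename is a single digit, an equals sign, then a non-empty value.
_TAG_RE = re.compile(r"^(\d)=(.+)$")


def parse_namaste_tags(filenames):
    """Map Namaste-style filenames to {label: value}; other names are ignored."""
    tags = {}
    for name in filenames:
        match = _TAG_RE.match(name)
        if match is None:
            continue  # not a tag file, e.g. webarchive.yaml
        number = int(match.group(1))
        label = NAMASTE_LABELS.get(number, f"tag{number}")
        tags[label] = match.group(2)
    return tags


# Hypothetical directory listing based on the example above.
listing = ["0=wacz_1.0", "2=My_awesome_archive", "3=20200601", "webarchive.yaml"]
print(parse_namaste_tags(listing))
# {'type': 'wacz_1.0', 'what': 'My_awesome_archive', 'when': '20200601'}
```

The values are kept exactly as they appear in the filenames (underscores included), since how to map them back to display text is an open question.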

ikreymer commented 4 years ago

I think this would only be useful if the intent is for users to interact with the raw zip file directly and/or manually extract the data. In that case, the tags would expose some metadata when running unzip -v.
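
As a rough illustration of the raw-zip scenario, the sketch below uses Python's standard `zipfile` module to list entry names, which is roughly what `unzip -v` would show, and picks out the ones that look like Namaste tags. The archive layout, including the `archive/data.warc` path and the helper names `build_demo_wacz` and `namaste_entries`, is an assumption made up for the demo, not the actual WACZ layout.

```python
import io
import zipfile


def build_demo_wacz():
    """Build a small in-memory zip with a hypothetical WACZ-like layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Empty tag files: the metadata lives in the filename itself.
        archive.writestr("0=wacz_1.0", "")
        archive.writestr("2=My_awesome_archive", "")
        archive.writestr("3=20200601", "")
        archive.writestr("webarchive.yaml", "title: My awesome archive\n")
        archive.writestr("archive/data.warc", b"")  # placeholder, not real WARC data
    buffer.seek(0)
    return buffer


def namaste_entries(zip_file):
    """Return top-level entry names of the form <digit>=<value>."""
    with zipfile.ZipFile(zip_file) as archive:
        return [
            name
            for name in archive.namelist()
            if "/" not in name
            and len(name) > 2
            and name[0].isdigit()
            and name[1] == "="
        ]


print(namaste_entries(build_demo_wacz()))
```

No file inside the archive needs to be decompressed or parsed for this; the zip directory listing alone carries the tags.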

But if users are primarily using wacz through a higher-level tool, e.g. wacz info, then it could display such data from the webarchive.yaml or another metadata structure. Right now I'm leaning towards the wacz info style of usage as the recommended approach, so I'm not sure this would be as useful. Of course, wacz info doesn't exist yet.

I think a higher-level tool will be needed anyway, since raw web data is not easily accessible: someone can't just list the WACZ or WARC files in a shell and know what web content is in them.