digipres / sentinel

The Sentinel watches various data sources and updates digipres.org
Apache License 2.0

Add (and stabilise?) list of Macintosh Type/Creator types #21

Open anjackson opened 8 months ago

anjackson commented 8 months ago

Tyler Thorsted pointed out there's a database of possible old Macintosh format signatures in the form of Type/Creator codes at: https://lacikam.co.il/tcdb/

Some contextual info is available at: https://github.com/dgelessus/mac_file_format_docs?tab=readme-ov-file#file-type-and-creator-codes-file-signatures

The TCDB data is available for download in Excel format, but some of the characters seem corrupted, and I'm not sure how far to trust the codes themselves (e.g. are these signatures binary but encoded somehow?). Very few records appear to have an associated file extension, so they can't be linked in directly, although the records could still be made searchable.

Given the state of the Excel file, it seems the data would need re-hosting in some form. Not clear how to do that at present.
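On the "binary but encoded somehow" question: classic Mac OS type and creator codes are four-byte values that are conventionally displayed as four characters in the Mac OS Roman character set. A rough sketch of one way to make the codes unambiguous is to store the raw bytes as hex alongside the displayed text. This assumes the TCDB strings are meant to be Mac OS Roman, which I have not verified; the function names `ostype_to_hex` and `hex_to_ostype` are made up for illustration.

```python
def ostype_to_hex(code: str, encoding: str = "mac_roman") -> str:
    """Return the hex form of a four-character type or creator code.

    Assumes the code is meant to be read as Mac OS Roman (an assumption
    about the data, not something the TCDB export confirms).
    """
    raw = code.encode(encoding)
    if len(raw) != 4:
        raise ValueError(f"expected 4 bytes, got {len(raw)}: {code!r}")
    return raw.hex()


def hex_to_ostype(hex_string: str, encoding: str = "mac_roman") -> str:
    """Turn eight hex digits back into the four-character display form."""
    raw = bytes.fromhex(hex_string)
    if len(raw) != 4:
        raise ValueError(f"expected 4 bytes, got {len(raw)}: {hex_string!r}")
    return raw.decode(encoding)


if __name__ == "__main__":
    for code in ["TEXT", "8BIM"]:
        print(code, ostype_to_hex(code))
```

A code that fails the four-byte check after encoding is a reasonable candidate for having been corrupted somewhere between the original source and the Excel export, so the same function could double as a crude sanity check.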

thorsted commented 8 months ago

Here is a smaller list of Type/Creator codes I have been keeping: Google Spreadsheet. It also links to Wikidata's two properties: Type / Creator.

anjackson commented 7 months ago

Noting also your converted version: https://github.com/thorsted/Born-Digital-Scripts/tree/main/TC%20Identification @thorsted

anjackson commented 1 month ago

The 'beta' version now integrates a copy of this dataset into a combined SQLite DB. Very early days! Importer is at https://github.com/digipres/sentinel/blob/2024-refresh/foreging/tcdb.py and a very early rough beta Format Index database will be part of the DigiPres Workbench 1.0.0 release next week.
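For anyone curious what "integrating into a SQLite DB" might look like in outline, here is a minimal sketch using Python's standard `sqlite3` module. It is not the actual `tcdb.py` importer: the table name `type_creator`, its columns, the helper names, and the sample rows are all hypothetical and only there to show how the records can be made searchable without a file extension to link on.

```python
import sqlite3

# Illustrative sample rows only; these are not taken from the TCDB export.
SAMPLE_ROWS = [
    ("TEXT", "ttxt", "Plain text document (example row)"),
    ("8BPS", "8BIM", "Image document (example row)"),
]


def build_index(rows):
    """Load (type, creator, description) rows into an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE type_creator ("
        " type_code TEXT NOT NULL,"
        " creator_code TEXT NOT NULL,"
        " description TEXT)"
    )
    conn.executemany("INSERT INTO type_creator VALUES (?, ?, ?)", rows)
    # Index the type code so that lookups stay fast as the table grows.
    conn.execute("CREATE INDEX idx_type_code ON type_creator (type_code)")
    conn.commit()
    return conn


def find_by_type(conn, type_code):
    """Return all rows whose type code matches exactly (case-sensitive)."""
    cursor = conn.execute(
        "SELECT type_code, creator_code, description"
        " FROM type_creator WHERE type_code = ?",
        (type_code,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = build_index(SAMPLE_ROWS)
    print(find_by_type(conn, "TEXT"))
```

The exact-match lookup is deliberate: type and creator codes are case-sensitive, and SQLite's default comparison for `=` on text respects that.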