harp-tech / protocol

Description of the Harp protocol.
https://harp-tech.org/protocol/BinaryProtocol-8bit.html
MIT License

Strip all large binary files from this repository #19

Closed glopesdev closed 6 months ago

glopesdev commented 6 months ago

Historically, the original protocol documents were uploaded to this repository in binary PDF format. This is unfortunate, since they are now preserved forever in the history, taking up several megabytes in what should conceptually be the smallest repository in all of harp-tech.

This proposal is to strip these legacy binary PDF files out of this repository, archive a copy of the old repo, and upload a new, cleaned-up version of the history. It might seem harsh, but if the project is about to take off, and if this repository is to be included everywhere as a subrepo, then we might as well do it now.

glopesdev commented 6 months ago

Running this Stack Overflow solution, we can see the top 10 worst offenders (sorted in ascending order, largest at the bottom):

b44589f1b621  244KiB Synchronization Clock 1.0 1.0 20200712.pdf
6a6ee54fc1f3  246KiB Device 1.2 1.8 20230314.pdf
a6526afd708f  293KiB synchronization clock - physical connectionsch.sch
4d6826929fdc  311KiB Synchronization Clock 1.0 1.0 20200712.docx
60a947ec86a4  459KiB Logo/HarpLogoSmall_Resumed.bmp
88383fb02772  602KiB Binary Protocol 1.0 1.1 20180223.pdf
af60f0bea746  636KiB Device 1.0 1.3 20190207.pdf
f645b10ff861  1.6MiB Logo/HarpLogoSmall.bmp
b9f6c1a461ae  4.1MiB Logo/Previous Versions/HarpLogo.bmp
77b693a2718b  4.2MiB Logo/Previous Versions/harp.bmp
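
For anyone who wants to reproduce a listing like the one above, here is a minimal Python sketch. It assumes the common approach of piping `git rev-list --objects --all` into `git cat-file --batch-check`; it is not the linked solution itself, the function names are hypothetical, and the size formatting only approximates the output shown.

```python
import subprocess


def list_history_objects(repo_path="."):
    """Return one '<type> <sha> <size> <path>' line per object reachable in history."""
    rev_list = subprocess.run(
        ["git", "-C", repo_path, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    return subprocess.run(
        ["git", "-C", repo_path, "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        input=rev_list, check=True, capture_output=True, text=True,
    ).stdout


def parse_blobs(batch_check_output):
    """Keep only blobs, as (size_in_bytes, sha, path) tuples."""
    blobs = []
    for line in batch_check_output.splitlines():
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue  # skip commits, trees and malformed lines
        path = parts[3] if len(parts) == 4 else ""
        blobs.append((int(parts[2]), parts[1], path))
    return blobs


def largest_blobs(blobs, count=10):
    """Return the `count` largest blobs in ascending order, largest last."""
    return sorted(blobs)[-count:]


def format_size(num_bytes):
    """Format a byte count with binary (IEC) units, e.g. 244KiB or 1.6MiB."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if value < 1024 or unit == "GiB":
            break
        value /= 1024
    if unit == "B":
        return f"{int(value)}B"
    return f"{value:.1f}{unit}" if value < 10 else f"{value:.0f}{unit}"


# Example usage inside a clone (assumes `git` is on PATH):
#   for size, sha, path in largest_blobs(parse_blobs(list_history_objects())):
#       print(sha[:12], format_size(size).rjust(7), path)
```

Because this walks every object reachable from any ref, files that were deleted long ago still show up, which is exactly what matters when judging clone size.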

Given this, I would strip the history to remove all .bmp files from the repository, as it makes little sense to keep them around anyway when we have the vector versions.

glopesdev commented 6 months ago

I was able to trim everything down from 5 MB to 200 kB by removing all BMP, DOCX and PDF files, which is about a 25x reduction in size. My proposal would be to duplicate the current repository into a private read-only harp-tech/protocol-archive repository, then move the filtered snapshot in place of the current repository.
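
The thread does not say which tool was used for the filtering. As one possible way to do it, the sketch below builds a `git filter-repo` invocation that drops every path matching the given extensions from the whole history. The use of `git filter-repo` is an assumption, and the helper name is hypothetical.

```python
def build_filter_command(extensions):
    """Build a `git filter-repo` command that removes all paths with the given extensions.

    Assumes git-filter-repo is installed; `--invert-paths` turns the
    `--path-glob` selections into a list of paths to drop rather than keep.
    """
    command = ["git", "filter-repo", "--invert-paths"]
    for ext in extensions:
        command += ["--path-glob", f"*.{ext.lstrip('.')}"]
    return command


# Example usage, to be run only in a fresh clone since history is rewritten:
#   import subprocess
#   subprocess.run(build_filter_command(["bmp", "docx", "pdf"]), check=True)
```

Since this rewrites every affected commit, all commit hashes change, which is why existing local copies would need to be re-cloned afterwards.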

All changes to the protocol are still listed explicitly in the markdown documents, since for now we keep an explicit section of revision changes at the end. I also updated the LICENSE statement for consistency with other core repositories.

If, or as soon as, this proposal is accepted, I can make the push. All internal references to issues, etc. should be preserved, but everyone would need to re-clone their local copy.

For reference, here is my rationale for removing each of these file types: