Naunter / BT_BlockLists

Transmission block list
The Unlicense
920 stars 46 forks

Add removal of duplicate lines to the build #8

Closed: scolby33 closed this 2 years ago

scolby33 commented 2 years ago

Use sort --unique to remove duplicate lines from the downloaded blocklists before compressing them. This saves about 24 MB uncompressed and 7 MB compressed.

$ du -h bt_blocklists uniq bt_blocklists.gz uniq.gz
 54M    bt_blocklists
 30M    uniq
 15M    bt_blocklists.gz
8.1M    uniq.gz
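
For illustration, the deduplication step could look roughly like the sketch below. It is not the repository's actual build script: the temporary directory, the sample entries, and the list1.txt/list2.txt names are all hypothetical, and LC_ALL=C is an added assumption to make the sort order independent of the locale.

```shell
#!/bin/sh
set -eu

# Hypothetical working directory and file names, used only for this sketch.
workdir=$(mktemp -d)
combined="$workdir/bt_blocklists"
deduped="$workdir/uniq"

# Made-up sample entries standing in for two downloaded blocklists.
# The second list repeats one line from the first.
printf '%s\n' \
  'example-a:10.0.0.1-10.0.0.255' \
  'example-b:192.0.2.0-192.0.2.255' > "$workdir/list1.txt"
printf '%s\n' \
  'example-b:192.0.2.0-192.0.2.255' \
  'example-c:198.51.100.0-198.51.100.255' > "$workdir/list2.txt"

# Concatenate the lists into one file, duplicates included.
cat "$workdir/list1.txt" "$workdir/list2.txt" > "$combined"

# Sort the combined file and keep one copy of each identical line.
# LC_ALL=C (an assumption of this sketch) forces plain byte-order sorting.
LC_ALL=C sort --unique "$combined" > "$deduped"

# Compress the deduplicated result.
gzip -c "$deduped" > "$deduped.gz"

# Report line counts before and after deduplication.
before=$(wc -l < "$combined" | tr -d ' ')
after=$(wc -l < "$deduped" | tr -d ' ')
echo "lines before: $before, after: $after"
```

Note that sort --unique only drops lines that are byte-for-byte identical, so the same IP range listed under two different labels would still appear twice. The short form sort -u is the POSIX spelling of the same option.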

Example run: https://github.com/scolby33/BT_BlockLists/runs/4661288837. It failed in the release step because I don't have access to this repo, but you can see from the update on master that it worked as expected.

Naunter commented 2 years ago

Aha! sort --unique! Why didn't I remember this one!

I was going to use awk + sed, which is a more complicated, less accurate, and dumber way....