nickbabcock / Pdoxcl2Sharp

A Paradox Interactive general file parser
MIT License

Compressed Saver #2

Closed nickbabcock closed 11 years ago

nickbabcock commented 11 years ago

Many characters in a typical file, such as newlines and spaces, are superfluous and can be eliminated without loss of data. A properly written parser couldn't tell the difference between a "regular" and a "compressed" file.
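To illustrate the claim, here is a minimal sketch using a naive tokenizer. The `tokens` function is a hypothetical helper written for this example, not the library's parser, and it assumes quoted strings contain no whitespace, braces, or equals signs. It shows that an indented sample and a squeezed version of the same data produce the same token stream.

```shell
# Naive tokenizer (illustration only): pad braces and '=' with spaces,
# then emit one token per line. Quoted strings containing whitespace,
# braces, or '=' are not handled.
tokens() {
    sed -E 's/[{}=]/ & /g' | tr -s '[:space:]' '\n' | sed '/^$/d'
}

# The same data, once with the usual indentation and once squeezed.
regular=$(printf 'army = {\n\tname = "First"\n\tsize = 3\n}\n' | tokens)
compressed=$(printf 'army ={name = "First" size = 3}' | tokens)

[ "$regular" = "$compressed" ] && echo "identical token streams"
```

A real save file would need a tokenizer that respects quoted strings, but the same comparison could serve as a sanity check on compressed output.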

As proof of concept, I took a 22MB file and reduced it to 17MB by running these shell commands:

sed -e 's/^[ \t]*//' inFile | \
tr '\r\n' ' ' | \
sed -r -e 's/ +/ /g' -e 's/= \{ ?/=\{/g' -e 's/ ?\} ?\}/\}\}/g' > outFile

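Explanation of commands:

- The first sed strips leading indentation from every line.
- tr replaces every carriage return and newline with a space, joining the file into a single line.
- The second sed collapses runs of spaces into one space, turns "= { " into "={", and turns two closing braces separated or preceded by a space (such as " } }") into "}}".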

Note that these may not be the only ways to compress a file. I haven't tested it, but it should also be possible to delete any spaces that follow an equals sign.
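The untested idea of dropping spaces after equals signs could be sketched as one more sed pass. This is an assumption-laden example, not part of the original pipeline: `strip_space_after_equals` is a hypothetical name, and the substitution assumes no quoted string in the file contains "= ", since it would be rewritten too.

```shell
# Hypothetical extra pass, not part of the original pipeline: delete the
# spaces that directly follow an equals sign.
# Assumption: no quoted string in the input contains "= ", because this
# naive substitution does not know about quotes.
strip_space_after_equals() {
    sed -E 's/= +/=/g'
}

# Small inline sample instead of a real save file.
printf 'name = "First" size = 3 army ={units = 2}\n' | strip_space_after_equals
```

On the sample line this yields `name ="First" size =3 army ={units =2}`. Whether the parser accepts such output for every file would still need to be verified.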

The compressed saver should have the same interface as the current saver, even though this means ignoring ValueWrite enums passed to the write functions.