Closed by LRN 1 year ago
Now, it's quite possible that the dictionary is poorly-formed
It is. Here's the header (retrieved with slob or aard2-web):
id: 63629b7e9f1940889df4b89c24382f0e
encoding: utf-8
compression:
blob count: 58522
ref count: 223578
The compression name is missing.
Hello. It's my dictionary, and it was intentionally stored w/o compression:
import slob
with slob.create(filename, compression="") as slb:
    ...
I think the slob library does allow it:
Empty value means bins are not compressed.
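To illustrate the "empty value" convention, here is a minimal sketch of how a reader could resolve the compression name found in the header. The `DECOMPRESSORS` table and the `decompress_bin` helper are hypothetical names made up for this example, not the slob API. The only point is that an empty name can map to an identity function instead of an error.

```python
import bz2
import zlib

# Hypothetical lookup table (not taken from slob): maps the header's
# compression name to a function that decompresses one storage bin.
DECOMPRESSORS = {
    "": lambda data: data,  # empty name: bins are stored uncompressed
    "zlib": zlib.decompress,
    "bz2": bz2.decompress,
}


def decompress_bin(compression_name, data):
    """Return the raw bin content for the given compression name."""
    try:
        decompress = DECOMPRESSORS[compression_name]
    except KeyError:
        raise ValueError(f"unsupported compression: {compression_name!r}")
    return decompress(data)
```

With a table like this, `decompress_bin("", data)` simply returns `data` unchanged, while an unknown name fails with an explicit error.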
Indeed. But the Java implementation doesn't support this :)
Using zlib should incur negligible overhead though; even if it doesn't ultimately decrease dictionary size, there's no reason not to use it. It seems to me that the use of TIFF is far more problematic. Dictionary content is typically rendered with a web browser (desktop, or embedded like WebView on Android), and no mainstream browser supports TIFF (other than perhaps Safari?).
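For a rough feel of the size overhead zlib adds when the content cannot be compressed, here is a small sketch using only the Python standard library. Random bytes stand in for already-compressed content such as image data. The `zlib_size_overhead` helper is a name invented for this example, and it measures size only, not CPU time.

```python
import os
import zlib


def zlib_size_overhead(payload):
    """Return how many bytes zlib adds to (or removes from) the payload."""
    compressed = zlib.compress(payload)
    # Round trip to confirm nothing is lost.
    assert zlib.decompress(compressed) == payload
    return len(compressed) - len(payload)


# Random bytes are effectively incompressible, similar to image data
# that has already been compressed.
payload = os.urandom(1_000_000)
overhead = zlib_size_overhead(payload)
print(f"original: {len(payload)} bytes")
print(f"overhead: {overhead} bytes ({overhead / len(payload):.4%})")
```

A negative result means the payload got smaller, which is what ordinary text content should produce.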
See https://github.com/latin-dict/Gesner1749/releases
Now, it's quite possible that the dictionary is poorly-formed, but I have no way to check that.