msiebuhr / charcod.es

Small webpage for finding the odd unicode char code.
https://charcodes.netlify.app/
ISC License

Progressive download/parsing/indexing of codepoints #16

Open msiebuhr opened 12 years ago

msiebuhr commented 12 years ago

The application stalls quite a lot on slow machines/devices, so perhaps we should split the index into smaller parts, possibly with various chunk sizes, so we can adapt to faster/slower machines and network connections. E.g., naming the files data-FROM%-TO%.json:

# 5% chunks
data-0-5.json
data-5-10.json
…

# 10% chunks
data-0-10.json
data-10-20.json
…

# 25% chunks
data-0-25.json
data-25-50.json
…

# 50% chunks
data-0-50.json
data-50-100.json

Then the client could start out downloading data-0-10.json and parsing it. If that takes too long, degrade to 5% chunks, and if it's fast, upgrade to 25% chunks.

We'd have to keep some more data lying around (about 2MB per chunk size) and, more difficult, figure out a dynamic download client.
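For concreteness, a minimal sketch of what that dynamic client might look like in browser-side TypeScript. The file names follow the scheme above, but the timing thresholds, the `onChunk` callback, and the boundary-alignment rule are assumptions of mine, not repo code:

```typescript
// Illustrative sketch only: adaptively walk the pre-generated chunk files.

const CHUNK_SIZES = [5, 10, 25, 50]; // available granularities, in percent

// Largest size <= desired whose chunk grid lines up with `from`, so we only
// ever request files that actually exist in the fixed sets above.
function alignedSize(from: number, desired: number): number {
  const fits = CHUNK_SIZES.filter((s) => s <= desired && from % s === 0);
  return fits.length ? Math.max(...fits) : CHUNK_SIZES[0];
}

async function loadIndex(onChunk: (codepoints: unknown[]) => void): Promise<void> {
  let idx = 1; // start with 10% chunks
  let from = 0;

  while (from < 100) {
    const size = alignedSize(from, CHUNK_SIZES[idx]);
    const to = from + size;

    const started = performance.now();
    const res = await fetch(`data-${from}-${to}.json`);
    onChunk(await res.json()); // hand the parsed codepoints to the indexer
    from = to;

    // Degrade on slow round-trips, upgrade on fast ones.
    const elapsed = performance.now() - started;
    if (elapsed > 2000 && idx > 0) idx--;
    else if (elapsed < 500 && idx < CHUNK_SIZES.length - 1) idx++;
  }
}
```

Snapping the size to a boundary that `from` divides evenly keeps every request on one of the fixed grids above, so switching granularity mid-download never asks for a file that wasn't pre-generated.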

Munter commented 12 years ago

An alternative could be offloading the heavy lifting to Web Workers to keep the interface responsive.
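Roughly, a sketch of that idea; the file names and message shapes here are illustrative, not taken from the app:

```typescript
// worker.ts -- fetch and parse the index off the main thread, so the UI
// never blocks on a multi-megabyte JSON.parse.
// (Assumes TypeScript's "webworker" lib for the worker globals.)
self.onmessage = async () => {
  const res = await fetch("data.json");
  const codepoints: unknown[] = await res.json();
  // ...any heavy indexing would also happen here, off the main thread...
  postMessage(codepoints);
};

// main.ts -- the page just spawns the worker and renders results as they come.
declare function renderIndex(codepoints: unknown[]): void; // stand-in for the app's UI update
const worker = new Worker("worker.js");
worker.onmessage = (e: MessageEvent<unknown[]>) => renderIndex(e.data);
worker.postMessage("start");
```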

msiebuhr commented 8 years ago

Another way could be to include some top percentage of the codes in the initial download.

Picking all these out from the main data-set weighs in at 11KB gzipped (66KB plain), which would still be quite a win.

jq '[.[] | select(.b == "ascii" or .b == "misc_symbols" or .b == "misc_pictographs")]' -c data.json | gzip | wc -c
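On the client side that could become a two-stage load: serve the small subset first, then swap in the full data set when it arrives. A sketch, with both file names made up for illustration:

```typescript
// Hypothetical two-stage load building on the jq experiment above.
async function loadCodepoints(onData: (codepoints: unknown[]) => void): Promise<void> {
  // Stage 1: the ~11KB gzipped subset makes the app usable almost immediately.
  const top = await fetch("data-top.json");
  onData(await top.json());

  // Stage 2: replace the subset with the complete index in the background.
  const full = await fetch("data.json");
  onData(await full.json());
}
```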
msiebuhr commented 8 years ago

BTW, misc_pictographs alone would weigh in at 7KB gzipped.

Considering the background image is 11KB compressed and the JS bundle is 45KB, I think we'd be OK with all of the proposed subsets.