cidgoh / DataHarmonizer

A standardized browser-based spreadsheet editor and validator that can be run offline and locally, and which includes templates for SARS-CoV-2 and Monkeypox sampling data. This project, created by the Centre for Infectious Disease Genomics and One Health (CIDGOH), at Simon Fraser University, is now an open-source collaboration with contributions from the National Microbiome Data Collaborative (NMDC), the LinkML development team, and others.
MIT License

Compress image assets #413

Closed kennethbruskiewicz closed 7 months ago

kennethbruskiewicz commented 8 months ago

The GIFs in web/images/ are all quite large, including one which is 1.9MB in size.

Compress these assets or find a way to serve them differently.

The goal is to address these asset size warnings from Webpack (the figures below were taken after GIF compression):

WARNING in asset size limit: The following asset(s) exceed the recommended size limit (244 KiB).
This can impact web performance.
Assets: 
  scripts/main.js (2.14 MiB)
  assets/b5c5404120acd7c59186.gif (505 KiB)
  assets/03c246ffefee8475259e.gif (397 KiB)
  assets/b46a01fe16bbe3d4f437.gif (382 KiB)
  assets/7871ebc22fa80e9ea6f9.gif (266 KiB)
  assets/4b7d9f4c8a07169f6945.gif (397 KiB)
  assets/4517b10aca7d7f197cc5.gif (339 KiB)
  assets/a6c3c9b4b8f1b9e896d8.gif (311 KiB)
  assets/7c4cee9fbf865203dfe2.gif (424 KiB)
  assets/48b3e2e9e173b0fbc3bb.gif (282 KiB)
  assets/e2474f4f46876c76c877.gif (1.13 MiB)
  assets/a02ae24cf5940ba931d2.gif (631 KiB)
  assets/d4b6e20257f17d031e6e.gif (1.07 MiB)
  scripts/246.js (266 KiB)
  scripts/468.js (904 KiB)
  scripts/823.js (365 KiB)
  scripts/900.js (284 KiB)
  templates/grdi/schema.yaml (598 KiB)
  templates/mpox/schema.yaml (274 KiB)
  templates/canada_covid19/SOP.pdf (415 KiB)
  templates/mpox/SOP_Mpox.pdf (592 KiB)
  templates/mpox/SOP_Mpox_international.pdf (581 KiB)
  templates/grdi/SOP.pdf (597 KiB)
  templates/pha4ge/SOP.pdf (463 KiB)
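
To see which source GIFs are over the 244 KiB limit quoted in the warning before running a build, something like the following could be used. This is only a sketch: the function name `list_oversized_gifs` and the variable `LIMIT_BYTES` are hypothetical, and it assumes a `find` that accepts the `-size +Nc` byte syntax.

```bash
# Hypothetical helper, not part of the repository.
# The warning quotes a limit of 244 KiB; 244 * 1024 = 249856 bytes.
LIMIT_BYTES=$(( 244 * 1024 ))

# Print every .gif under the given directory whose size exceeds the limit.
list_oversized_gifs() {
  local dir=$1
  find "$dir" -type f -name '*.gif' -size +"${LIMIT_BYTES}"c | sort
}
```

For example, `list_oversized_gifs web/images` would list the candidates in the directory mentioned above. On-disk size is assumed here to be close to what Webpack reports for image assets, which depends on how the build handles them.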
kennethbruskiewicz commented 8 months ago

Managed to get these compression rates so far:

# First three are anomalous because of running script twice, nonetheless ~95%
changeTemplate.gif: 516660 bytes -> changeTemplate.gif: 516660 bytes, Compression ratio: 100.00%
doubleClickHeaders.gif: 145397 bytes -> doubleClickHeaders.gif: 145397 bytes, Compression ratio: 100.00%
editCopyPasteDelete.gif: 111352 bytes -> editCopyPasteDelete.gif: 111352 bytes, Compression ratio: 100.00%
exportingFiles.gif: 423941 bytes -> exportingFiles.gif: 406620 bytes, Compression ratio: 95.00%
fillColumn.gif: 411698 bytes -> fillColumn.gif: 391523 bytes, Compression ratio: 95.00%
jumpToColumn.gif: 286235 bytes -> jumpToColumn.gif: 272871 bytes, Compression ratio: 95.00%
moreInfo.gif: 434220 bytes -> moreInfo.gif: 406548 bytes, Compression ratio: 93.00%
provenance.gif: 359473 bytes -> provenance.gif: 347445 bytes, Compression ratio: 96.00%
selectingVals.gif: 337848 bytes -> selectingVals.gif: 318008 bytes, Compression ratio: 94.00%
showRows.gif: 474443 bytes -> showRows.gif: 433911 bytes, Compression ratio: 91.00%
showSection.gif: 316474 bytes -> showSection.gif: 288989 bytes, Compression ratio: 91.00%
toggleRequiredCols.gif: 1375259 bytes -> toggleRequiredCols.gif: 1185560 bytes, Compression ratio: 86.00%
validatingCells.gif: 661042 bytes -> validatingCells.gif: 646620 bytes, Compression ratio: 97.00%
versionUpdate.gif: 1212160 bytes -> versionUpdate.gif: 1124046 bytes, Compression ratio: 92.00%

using this script

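# Recompresses every GIF in the current directory in place and reports the size change.
# Needs bash 4+ (associative arrays), GNU stat, ffmpeg, gifsicle and bc.
# Note: originals are overwritten, so a second run measures already-compressed files.
declare -A original_sizes compressed_sizes

for f in *.gif; do
  base="${f%.gif}"
  original_sizes["$f"]=$(stat -c%s "$f")
  # Build a 256-colour palette from the animation, sampled at 10 fps.
  ffmpeg -i "$f" -vf "fps=10,palettegen=max_colors=256" -y "temp_palette_${base}.png"
  # Re-encode at 10 fps using that palette (-y so a leftover temp file is overwritten).
  ffmpeg -i "$f" -i "temp_palette_${base}.png" -lavfi "fps=10 [x]; [x][1:v] paletteuse" -y "temp_${base}.gif"
  # Optimise the re-encoded GIF and write it over the original.
  gifsicle --optimize=3 --colors 256 --dither "temp_${base}.gif" -o "$f"
  compressed_sizes["$f"]=$(stat -c%s "$f")
  rm "temp_palette_${base}.png" "temp_${base}.gif"
done

echo "Compression rates:"
for f in *.gif; do
  original_size=${original_sizes["$f"]}
  compressed_size=${compressed_sizes["$f"]}
  # Multiply before dividing so bc does not truncate the ratio to two digits first.
  compression_ratio=$(echo "scale=2; $compressed_size*100/$original_size" | bc)
  echo "$f: $original_size bytes -> $f: $compressed_size bytes, Compression ratio: $compression_ratio%"
done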
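
The reported ratios all end in .00, which is consistent with bc truncating the division to two decimal places before the multiplication by 100 (for example 0.95 * 100). A minimal sketch for recomputing the percentage from the byte counts with integer arithmetic follows; the function name `ratio_pct` is hypothetical and not part of the original script.

```bash
# Hypothetical helper: print compressed size as a percentage of the original,
# rounded to two decimal places, using only bash integer arithmetic.
ratio_pct() {
  local original=$1 compressed=$2
  # Work in hundredths of a percent; adding original/2 rounds to nearest.
  local hundredths=$(( (compressed * 10000 + original / 2) / original ))
  printf '%d.%02d\n' $(( hundredths / 100 )) $(( hundredths % 100 ))
}
```

With the byte counts listed above, `ratio_pct 423941 406620` prints 95.91 for exportingFiles.gif and `ratio_pct 1375259 1185560` prints 86.21 for toggleRequiredCols.gif, so the savings range from roughly 4% to 14% of the original size.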
kennethbruskiewicz commented 8 months ago

Branch https://github.com/cidgoh/DataHarmonizer/tree/413-compress-image-assets

kennethbruskiewicz commented 8 months ago

Images in the <root>/images/ directory are left unaffected for reference, as they are not added to the build.

ddooley commented 8 months ago

I think because the images are animations, one isn't really getting much benefit from compression there. The thing is, we have this content packaged in off-line files via the help menu. So I'd say the more fundamental decision is whether to keep the off-line help packaged that way, if one is concerned about the size of that help info. It only gets updated sporadically (no official policy on that). One could simply consider it application overhead.

ddooley commented 8 months ago

Is there a way to just have webpack ignore the size of this, possibly storing in a separate "help info" component file?

kennethbruskiewicz commented 7 months ago

> Is there a way to just have webpack ignore the size of this, possibly storing in a separate "help info" component file?

I know what you mean, and I'll look into it. You might be right that Webpack should focus on code assets rather than image assets. Some of this is being reported because the images are brought in through module imports.