quran / quran_android

a quran reading application for android
http://android.quran.com
GNU General Public License v3.0
2.01k stars 890 forks source link

Optimize translation databases #701

Closed ahmedre closed 7 years ago

ahmedre commented 8 years ago

via an email from the brothers at greentech, may Allah reward them greatly - i've copied/pasted parts of their email here:

I have seen that your sqlite databases can be optimised more in order to reduce their download sizes:

  • execute: insert into verses(verses) values('optimize') - This reindexes and optimizes the fts3 table
  • compact the database - This removes spaces wasted by deleted records Using this method I was able to reduce about 0.5 Mb to 3mb in each of the databases
ahmedre commented 8 years ago

A feature of this data structure is that at any time the database may contain not one index b-tree, but several different b-trees that are incrementally merged as rows are inserted, updated and deleted. This technique improves performance when writing to an FTS table, but causes some overhead for full-text queries that use the index.

  • we can use vacuum to clean the database.

as the brother mentioned in his email, we can use sqlitebrowser for this as well, but will try command line first since we can just write one script to do the work automatically.

ahmedre commented 7 years ago

masha'Allah, this showed some great wins - overall, ~56mb of savings:

Translation Original End
quran.al.ahmeti.db 2.7M 2.2M
quran.al.db 3.4M 2.2M
quran.am.sadiq.db 3.1M 2.6M
quran.ar.baghawy.db.zip 5.4M 4.4M
quran.ar.ibnkathir.db.zip 14M 7.3M
quran.ar.qurtubi.db.zip 13M 11M
quran.ar.sa3dy.db.zip 4.5M 3.7M
quran.ar.tabari.db.zip 22M 17M
quran.ar.tanweer.db.zip 19M 16M
quran.ar.waseet.db.zip 13M 9.6M
quran.az.musayev.db 2.6M 2.0M
quran.bn.mkhan.db 4.1M 3.7M
quran.bs.korkut.db 2.3M 1.8M
quran.de.bubenheim.db 2.6M 2.1M
quran.dv.db.zip 1.3M 1.1M
quran.en.haleem.db 2.4M 1.8M
quran.en.khanhilali.db 3.0M 2.5M
quran.en.muftitaqi.db 3.1M 2.1M
quran.en.pickthall.db 2.4M 1.9M
quran.ensi.db 2.4M 1.9M
quran.en.transliteration.db 1.8M 1.5M
quran.en.yusufali.db 2.6M 2.1M
quran.es.db 2.7M 1.8M
quran.es.garcia.db 2.6M 2.0M
quran.farsi.db 4.6M 2.8M
quran.fr.db 3.2M 2.1M
quran.ha.db 3.2M 2.1M
quran.hi.db.zip 1.4M 1.1M
quran.id.db 3.7M 2.3M
quran.it.db 2.9M 1.9M
quran.ku.asan.db 5.4M 4.7M
quran.ml.abdulhameed.db 6.0M 5.3M
quran.ml.karakunnu.db 5.7M 4.9M
quran.ms.db 4.8M 2.9M
quran.muyassar.db 5.3M 5.0M
quran.nl.keyzer.db 2.7M 2.2M
quran.pt.db 2.9M 1.9M
quran.ru.abuadel.db.zip 1.9M 1.6M
quran.ru.kuliev.db 4.4M 2.7M
quran.ru.muntahab.db 6.4M 5.6M
quran.so.abduh.db 2.2M 1.8M
quran.sv.bernstrom.db 2.8M 2.2M
quran.sw.barwani.db 2.2M 1.8M
quran.ta.jantrust.db 6.1M 5.5M
quran.th.db 6.6M 5.6M
quran.tr.db 2.4M 2.0M
quran.tr.diyanet.db 3.4M 2.2M
quran.ug.saleh.db 4.3M 3.7M
quran.urdu.junagarhi.db 3.4M 2.8M
quran.urdu.maududi.db.zip 1.3M 969K
quran.uz.db 7.6M 4.4M
quran.zh.db 2.6M 1.8M
Total 241M 185M

for future reference, the script to do this:

#! /bin/zsh

TABLE=verses
for j in **/*.db; do
  sqlite3 $j "insert into $TABLE($TABLE) values('optimize'); vacuum;"
done

for i in **/*.zip; do
  mkdir tmp
  cd tmp
  unzip ../$i
  for j in **/*.db; do
    sqlite3 $j "insert into $TABLE($TABLE) values('optimize'); vacuum;"
  done
  zip -r $i:t .
  cp $i:t ../$i
  cd ..
  rm -rf tmp
done
ahmedre commented 7 years ago

i only ran this script for translations (since the first part only applies to fts3 databases) - the vacuum part applies to anything though, so will update after running just the vacuum portion for the rest of the databases.

ahmedre commented 7 years ago

a total savings of 4mb on ayahinfo databases, and 1mb on audio databases. also, forgot to mention the savings of 0.1mb on each of the arabic and naskh arabic databases.