Closed aelmahmoudy closed 9 years ago
I was wrong, for some reason _idx1 was not created for merge.db. After creating it (and running VACUUM), its size increased to 732 MB. It seems merge.db isn't created properly by gen_lacells script, as I get those errors:
Error: near line 12: unrecognized token: "#" Error: near line 16: unrecognized token: "#" Error: near line 36: unrecognized token: "#"
Lines 111 through 136 of gen_lacells should be deleted. Or the first "#" on each of those lines should be removed.
With those lines removed then there will be duplicate records where OpenCellID and Mozilla Location Services both have data on the same cell tower. WIth the "#" removed then the duplicate records will be removed.
For my gsm-backend, deleting those lines should result in the same on-phone database a the backend will create for itself using the tools in the settings menu. If you are using this database with the microG project's original backend you should remove the duplicates.
I only put a "#" in front of each line to keep the lines around to make it easier for me to cut and paste things when I was testing. Unfortunately you can't just re-run the script in whole if there is a bug as the OpenCellID project limits downloads to one per day. :(
You are correct, the samples field is missing if you do the merge.
I will do two things:
On the mention of uG's original backend. cells.db that was used by the old backend (ie. before UnifiedNlp) was about 65 MB although it was for the whole world, why is your db so large even after removing duplicates ?
You can use sqlite3 on your laptop/desktop computer to examine the database files to see their structure and how many records are in each file. For example, the lacells.db that is on my phone at present:
$ adb pull /sdcard/.nogapps/lacells.db 3912 KB/s (101584896 bytes in 25.357s) $ sqlite3 lacells.db SQLite version 3.7.7 2011-06-25 16:35:41 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> .tables cells sqlite> .schema cells CREATE TABLE cells(mcc INTEGER, mnc INTEGER, lac INTEGER, cid INTEGER, longitude REAL, latitude REAL, accuracy REAL, samples INTEGER, altitude REAL); CREATE INDEX _idx1 ON cells (mcc, mnc, lac, cid); CREATE INDEX _idx2 ON cells (lac, cid); sqlite> select count(*) from cells; 1166872 sqlite> .quit $
So there are 1.16 million records in my current database just for the United States. Downloading the North America cell database from the announcement on XDA developer's forum at http://forum.xda-developers.com/showthread.php?t=1715375 gives me:
$ sqlite3 cells-northa.db SQLite version 3.7.7 2011-06-25 16:35:41 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> .tables cells sqlite> .schema cells CREATE TABLE cells(lat REAL, lon REAL, mcc INTEGER, mnc INTEGER, lac INTEGER, cellid INTEGER); sqlite> select count(*) from cells; 736083 sqlite> .quit $
So all of North America including Canada and maybe Mexico has less than 3/4 million towers in their old database with fewer fields and no indices.
Pushed new scripts with fixes for issues (other than raw size of database) identified in this thread.
I ran gen_lacells it produced merge.db with a size of 550 MB (I didn't filter out any MCC/MNC) Then I ran the commented sqlite code in gen_lacells: .headers on .mode csv .output merged.csv SELECT * FROM cells ORDER BY mcc, mnc, lac, cid; .quit
then ran: ./merge_duplicate_towers > towers.csv
the resulting towers.csv (415 MB) was smaller than merge.csv (484 MB)
then I ran the rest of commented sqlite commands: sqlite3 lacells.db <<! CREATE TABLE cells(mcc INTEGER, mnc INTEGER, lac INTEGER, cid INTEGER, longitude REAL, latitude REAL, altitude REAL, accuracy REAL); .mode csv .import towers.csv cells CREATE INDEX _idx1 ON cells (mcc, mnc, lac, cid); CREATE INDEX _idx2 ON cells (lac, cid); VACUUM; .quit !
Yet the resulting lacells.db was 691 MB ! Although I notice that this database doesn't have the 'samples' column !