There are three scripts in this repo: `semantic_location_parser.py`, `records_location_parser.py`, and `full_location_history_parser.py`. Each script scans through the Takeout directory and parses the Google location data. The resulting CSV files can be used in ArcGIS for spatial analysis and visualization. If you want to see your data visually fast, convert the CSV to a macro-enabled `.xlsm` file and use Excel's mapping tool.
Export your Google Takeout data as a `.zip` file and unzip it into a new directory. Then run `python semantic_location_parser.py`; the results are written to `semantic_location_history.csv`.

Google Takeout Location History data consists of two types of data: raw location history data and semantic location history data.
The expected directory structure looks like this:

```
E:.
|   location_history_parser.py
|
\---Takeout
    |   archive_browser.html
    |
    \---Location History
        |   Records.json
        |   Settings.json
        |   Tombstones.csv
        |
        \---Semantic Location History
            +---2022
            |       2022_DECEMBER.json
            |       2022_NOVEMBER.json
            |
            \---2023
                    2023_FEBRUARY.json
                    2023_JANUARY.json
```
The `semantic_location_parser.py` script parses Google Takeout Location History data into a CSV file. The script extracts the `timestamp`, `address`, `placeId`, `name`, `latitudeE7`, and `longitudeE7` values from the semantic location history data and stores them in a CSV file.

The semantic location history data is found in the `Semantic Location History` folder and consists of more high-level, processed information compared to the raw location history data found in the `Records.json` file. This data is partitioned by year into subfolders and by month into JSON files. Inside each semantic JSON file is a single flat `timelineObjects` array.
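The traversal the semantic parser performs can be sketched roughly as follows. This is an illustration, not the repo's actual code; the `placeVisit`, `location`, and `duration` key names follow Google's Takeout schema, and the fields in your own export may differ:

```python
import csv
import json
from pathlib import Path

FIELDS = ["timestamp", "address", "placeId", "name", "latitudeE7", "longitudeE7"]

# Walk the year subfolders and monthly JSON files under the Takeout tree.
rows = []
for json_file in Path("Takeout/Location History/Semantic Location History").rglob("*.json"):
    with open(json_file, encoding="utf-8") as f:
        data = json.load(f)
    for obj in data.get("timelineObjects", []):
        visit = obj.get("placeVisit")  # skip activitySegment entries
        if not visit:
            continue
        loc = visit.get("location", {})
        rows.append({
            "timestamp": visit.get("duration", {}).get("startTimestamp"),
            "address": loc.get("address"),
            "placeId": loc.get("placeId"),
            "name": loc.get("name"),
            "latitudeE7": loc.get("latitudeE7"),
            "longitudeE7": loc.get("longitudeE7"),
        })

with open("semantic_location_history.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

If the `Semantic Location History` folder is missing or empty, the sketch simply writes a header-only CSV.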
The `records_location_parser.py` script pulls the less refined location data that Google collects. This data can be used to highlight which roads you traveled along rather than just which locations you visited.
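Extracting the raw points from `Records.json` amounts to iterating its `locations` array. A minimal sketch, assuming the standard Takeout key names (`locations`, `latitudeE7`, `longitudeE7`, `timestamp`); the `extract_points` helper is hypothetical, not part of the repo:

```python
import json

def extract_points(records):
    """Pull (lat, lon, timestamp) tuples from a parsed Records.json dict.

    Key names follow Google's Takeout schema; adjust them if your
    export differs. E7 values are degrees scaled by 10^7.
    """
    return [
        (loc["latitudeE7"] / 1e7, loc["longitudeE7"] / 1e7, loc.get("timestamp"))
        for loc in records.get("locations", [])
    ]

# Synthetic example record rather than a real export:
sample = json.loads("""{"locations": [{"latitudeE7": 407127753,
    "longitudeE7": -740059728, "timestamp": "2023-01-15T12:00:00.000Z"}]}""")
print(extract_points(sample))
```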
The `full_location_history_parser.py` script parses the same data as `semantic_location_parser.py`, but it also parses the `Records.json` file in the same script. The two data sets are appended together on the `timestamp`, `lat`, and `lon` columns. This is not the best way to store the data, and its use is not advised.
This data set will have the following columns:

- `epoch_time`: Useful for APIs
- `timestamp`: Sample start time from the device
- `date_str`: The timestamp converted to a date
- `lat`: Latitudinal coordinate. The values need to be divided by 10^7 to be in the expected range.
- `lon`: Longitudinal coordinate. The values need to be divided by 10^7 to be in the expected range.
- `address`: Best-guess address by Google
- `placeid`: Google place id
- `name`: Name of the location, if it exists
- `alt`: Altitude
- `activity_type`: Most confident activity type
- `platformtype`: Platform used to access Google services
- `file_path`: Which folder the data came from
- `sys_time`: Comes from the `timestamp` key in `Records.json`; unique.

The scripts only extract specific values from the raw location history data and the semantic location history data and store them in their respective CSV files. If you need to extract additional information, you may need to modify the scripts accordingly.
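The E7 scaling mentioned for the `lat` and `lon` columns is a single division. A minimal illustration (the `e7_to_degrees` helper is hypothetical, not part of the repo's scripts):

```python
# latitudeE7/longitudeE7 store degrees scaled by 10^7;
# divide by 1e7 to recover decimal degrees for GIS tools.
def e7_to_degrees(value_e7):
    return value_e7 / 1e7

print(e7_to_degrees(407127753))   # roughly 40.71, a latitude in degrees
print(e7_to_degrees(-740059728))  # roughly -74.01, a longitude in degrees
```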