ryangriggs / GoogleTimelineMapper

Map and Browse Google Timeline Semantic Location History data
Apache License 2.0

support the new json schema #8

Open dmd opened 5 months ago

dmd commented 5 months ago

In the new on-device-only Location History, Google has changed the JSON schema.

ryangriggs commented 5 months ago

Does the app fail to work with the new schema?

dmd commented 5 months ago

Yes, it refuses to even load it.

"Invalid data: must contain 'timelineObjects' array."

dmd commented 5 months ago

(I'm a little surprised that you wrote this just last week, given Google Timeline is being sunsetted literally as we speak in favor of the new on-device version - were you unaware?)

ryangriggs commented 5 months ago

Wrote the app a few months ago, another user pitched in last week to add features.

How did you obtain your location history data? Did you use Takeout or another method?

dmd commented 5 months ago

No, Takeout is no longer supported for Location History. You have to export it from the app on your phone now.

https://support.google.com/maps/answer/14169818?co=GENIE.Platform%3DDesktop&oco=1

https://www.androidauthority.com/google-maps-killing-timeline-web-access-3449017/

https://gizmodo.com/google-maps-timeline-app-browser-1851520501


dmd commented 5 months ago

The new schema appears to be:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Generated schema for Root",
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "endTime": {
        "type": "string"
      },
      "startTime": {
        "type": "string"
      },
      "visit": {
        "type": "object",
        "properties": {
          "hierarchyLevel": {
            "type": "string"
          },
          "topCandidate": {
            "type": "object",
            "properties": {
              "probability": {
                "type": "string"
              },
              "semanticType": {
                "type": "string"
              },
              "placeID": {
                "type": "string"
              },
              "placeLocation": {
                "type": "string"
              }
            },
            "required": [
              "probability",
              "semanticType",
              "placeID",
              "placeLocation"
            ]
          },
          "probability": {
            "type": "string"
          }
        },
        "required": [
          "hierarchyLevel",
          "topCandidate",
          "probability"
        ]
      },
      "activity": {
        "type": "object",
        "properties": {
          "probability": {
            "type": "string"
          },
          "end": {
            "type": "string"
          },
          "topCandidate": {
            "type": "object",
            "properties": {
              "type": {
                "type": "string"
              },
              "probability": {
                "type": "string"
              }
            },
            "required": [
              "type",
              "probability"
            ]
          },
          "distanceMeters": {
            "type": "string"
          },
          "start": {
            "type": "string"
          }
        },
        "required": [
          "end",
          "topCandidate",
          "distanceMeters",
          "start"
        ]
      },
      "timelinePath": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "point": {
              "type": "string"
            },
            "durationMinutesOffsetFromStartTime": {
              "type": "string"
            }
          },
          "required": [
            "point",
            "durationMinutesOffsetFromStartTime"
          ]
        }
      },
      "timelineMemory": {
        "type": "object",
        "properties": {
          "destinations": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "identifier": {
                  "type": "string"
                }
              },
              "required": [
                "identifier"
              ]
            }
          },
          "distanceFromOriginKms": {
            "type": "string"
          }
        },
        "required": [
          "distanceFromOriginKms"
        ]
      }
    },
    "required": [
      "endTime",
      "startTime"
    ]
  }
}

which is to say, total garbage - it's an array of lots of different kinds of objects.
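A quick way to see what an export actually holds is to tally which payload key each top-level entry carries. This is a minimal sketch, assuming the file is a top-level array whose entries match the schema above; `tally_kinds` and the `sample` data are illustrative, not part of the app.

```python
from collections import Counter

# Payload keys taken from the schema above; every entry also has startTime/endTime.
KINDS = ("visit", "activity", "timelinePath", "timelineMemory")

def tally_kinds(entries):
    """Count how many top-level entries carry each known payload key."""
    counts = Counter()
    for entry in entries:
        found = [kind for kind in KINDS if kind in entry]
        # Entries with none of the known keys are counted as "unknown".
        counts.update(found or ["unknown"])
    return counts

# Made-up entries, shaped like the schema, purely for illustration.
sample = [
    {"startTime": "2024-07-02T01:00:00Z", "endTime": "2024-07-02T02:00:00Z", "visit": {}},
    {"startTime": "2024-07-02T02:00:00Z", "endTime": "2024-07-02T03:00:00Z", "timelinePath": []},
    {"startTime": "2024-07-02T03:00:00Z", "endTime": "2024-07-02T04:00:00Z", "timelinePath": []},
    {"startTime": "2024-07-02T04:00:00Z", "endTime": "2024-07-02T05:00:00Z"},
]

print(tally_kinds(sample))
```

For a real export, load the file with `json.load` and pass the resulting list to `tally_kinds`.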

ryangriggs commented 5 months ago

Odd, I don't have that feature on my device yet... it still takes me directly to Takeout to download the location history. I guess they haven't rolled out the update to my account yet.

Do you have any links describing the schema of the new export format? A cursory search yielded nothing useful. I don't have much time today to look into this.

dmd commented 5 months ago

Also, just emailed you.

dmd commented 5 months ago

I dug into this a bit more. It's not as bad as all that. There are four types of objects - visit, activity, timelineMemory, and timelinePath. As best as I can tell given my own data - and I have continuous data since 2010, so it should be pretty representative - the first three of those can be ignored if all you're interested in is a list of places you've been.

I wrote this:

import json
import csv
import argparse
from datetime import datetime, timedelta

def parse_iso8601(timestamp):
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))

def extract_lat_lon(location):
    # e.g. "geo:44.038332,-121.337457" -> ["44.038332", "-121.337457"]
    return location.split(":")[1].split(",")

def main(input_file, output_file):
    # The new export is a single top-level JSON array.
    with open(input_file, "r") as f:
        data = json.load(f)

    with open(output_file, "w", newline="") as csvfile:
        fieldnames = ["timestamp", "latitude", "longitude"]
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()

        for item in data:
            # Only timelinePath entries are used; visit, activity and
            # timelineMemory entries are skipped.
            if "timelinePath" in item:
                path_points = item["timelinePath"]
                start_time = parse_iso8601(item["startTime"])
                for point in path_points:
                    # Each point stores its offset from startTime in minutes, as a string.
                    offset = timedelta(
                        minutes=int(point["durationMinutesOffsetFromStartTime"])
                    )
                    timestamp = (start_time + offset).isoformat()
                    lat, lon = extract_lat_lon(point["point"])
                    writer.writerow(
                        {"timestamp": timestamp, "latitude": lat, "longitude": lon}
                    )

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Convert JSON to CSV.")
    parser.add_argument("input_file", help="Input JSON file")
    parser.add_argument("output_file", help="Output CSV file")
    args = parser.parse_args()

    main(args.input_file, args.output_file)

which appears to correctly parse the data into something that e.g. kepler.gl can read. Basically, for each object in the top level JSON array, process it only if it is a timelinePath object (it contains that key). Get the startTime, then for each item in the timelinePath array, take its latitude and longitude, and pair it with the startTime + the durationMinutesOffsetFromStartTime.

I tried doing it that way vs. also processing visits and activities and ended up with almost identical results, so probably this is the way to go. (Note that if you do go the way of also processing visits and activities, you must ignore visits that have a semanticType of "Searched Address".)

timelineMemory can always be ignored.

dmd commented 5 months ago

I should also say that, unfortunately, the exported data from the app is incomplete. It's pretty good but it's only about 20% of the data points. For example:

image

versus

image

ryangriggs commented 5 months ago

Hi Dr. Drucker, thanks for this info. Are you interested in adding a PR to support the new location data format? Not sure what you mean by exported data is incomplete. Do you mean my app doesn't export all the locations that you import from the file?

Thanks for your input on this issue.


dmd commented 5 months ago

Yeah, gimme a day and I'll PR.

No, that's a complaint about Google, not you. The export from the iOS Google Maps app doesn't contain the same amount of detail as the old Google Takeout data.

dmd commented 5 months ago

Actually, I don't know how you'd want this to work. Your app wants arrived, departed, duration, lat, lon. Using the timelinePath objects (which appear to contain the bulk of the data) we only really get a timestamp, lat, and lon.
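If arrived/departed/duration rows are needed, one option is to take them from the visit objects rather than from timelinePath. The following is only a sketch: it assumes each visit entry carries `startTime`, `endTime` and a `topCandidate.placeLocation` of the form `geo:lat,lon`, as in the schema and samples in this thread. The function name `visit_rows`, the output keys, and the timestamps in `sample` are illustrative. It also skips "Searched Address" visits, as suggested earlier in the thread.

```python
from datetime import datetime

def parse_iso8601(timestamp):
    # Normalize a trailing "Z" for Python versions before 3.11.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))

def visit_rows(entries):
    """Turn visit entries into arrived/departed/duration/lat/lon rows."""
    rows = []
    for entry in entries:
        visit = entry.get("visit")
        if visit is None:
            continue  # activity, timelinePath and timelineMemory entries
        candidate = visit.get("topCandidate", {})
        if candidate.get("semanticType") == "Searched Address":
            continue  # not a place that was actually visited
        location = candidate.get("placeLocation", "")
        if not location.startswith("geo:"):
            continue  # unexpected location format; skip rather than guess
        lat, lon = location[len("geo:"):].split(",")
        arrived = parse_iso8601(entry["startTime"])
        departed = parse_iso8601(entry["endTime"])
        rows.append({
            "arrived": arrived.isoformat(),
            "departed": departed.isoformat(),
            "duration_minutes": (departed - arrived).total_seconds() / 60,
            "lat": float(lat),
            "lon": float(lon),
        })
    return rows

# Illustrative entries; the timestamps are made up for the example.
sample = [
    {
        "startTime": "2024-07-02T01:11:46Z",
        "endTime": "2024-07-02T02:14:28Z",
        "visit": {
            "probability": "0.786423",
            "topCandidate": {
                "semanticType": "Unknown",
                "placeID": "ChIJVdvEAMPHuFQRFXez7RC5rVI",
                "placeLocation": "geo:44.038332,-121.337457",
            },
        },
    },
    {
        "startTime": "2024-07-02T03:00:00Z",
        "endTime": "2024-07-02T03:05:00Z",
        "visit": {
            "topCandidate": {
                "semanticType": "Searched Address",
                "placeLocation": "geo:44.0,-121.0",
            },
        },
    },
    {"startTime": "2024-07-02T04:00:00Z", "endTime": "2024-07-02T05:00:00Z", "timelinePath": []},
]

for row in visit_rows(sample):
    print(row)
```

This yields far fewer rows than timelinePath does, so it complements the point list rather than replacing it.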

MueJosh commented 5 months ago

> Actually, I don't know how you'd want this to work. Your app wants arrived, departed, duration, lat, lon. Using the timelinePath objects (which appear to contain the bulk of the data) we only really get a timestamp, lat, and lon.

I can have a look at it in a few days, for now I'm working on a different feature (import GPX and kml and some other stuff).

assc1967 commented 4 months ago

Hi all, I was trying to use your script for a project but, as other people have already found, the JSON downloaded from the device seems to differ from the schema your script expects. Could someone give me a hint about how to find the problem and get the script working again? :) Thank you

MueJosh commented 4 months ago

> Hi all,
>
> I was trying to use your script for a project but, as other people have already found, the JSON downloaded from the device seems to differ from the schema your script expects.
>
> Could someone give me a hint about how to find the problem and get the script working again? :)
>
> Thank you

I downloaded my google timeline again and it still looks exactly the same as before...

As long as I don't have any files to test the programme, I can't change anything in the code. If someone can give me a source as an example (real data, not just the schema), I can try to adapt the code accordingly.


ve3 commented 4 months ago

I can export from Google Takeout. My timeline no longer works in the browser on Google's website; it works in the mobile app only. When I exported the timeline from Takeout and imported the files through this repo's index file, none of them worked. The error is the same as in the OP:

Timeline Edits.json => Invalid data: must contain 'timelineObjects' array.
Tombstones.csv => Invalid data. The data must be in JSON format.
Encrypted Backups.txt => Invalid data. The data must be in JSON format.
Settings.json => Invalid data: must contain 'timelineObjects' array.
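Three JSON shapes come up in this thread: the legacy Takeout object with a `timelineObjects` array, the new on-device export (a top-level array), and the Takeout "Timeline Edits" object with a `timelineEdits` array. A loader could tell them apart before parsing instead of rejecting everything that lacks `timelineObjects`. The sketch below is one possible check; the function name and the returned labels are made up for illustration.

```python
def detect_format(data):
    """Guess which export shape a parsed JSON document has.

    The labels are arbitrary names used only in this sketch.
    """
    if isinstance(data, dict):
        if isinstance(data.get("timelineObjects"), list):
            return "legacy-takeout"
        if isinstance(data.get("timelineEdits"), list):
            return "timeline-edits"
        return "unknown"
    if isinstance(data, list):
        # The on-device export is a bare array of entries.
        return "on-device"
    return "unknown"

print(detect_format({"timelineObjects": []}))
print(detect_format({"timelineEdits": []}))
print(detect_format([{"startTime": "2024-07-02T01:00:00Z"}]))
print(detect_format({"settings": {}}))
```

A file such as Settings.json would fall through to "unknown", which allows a clearer error message than the current one.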

ryangriggs commented 4 months ago

We need some sample data in order to modify the parser. Can anyone supply a sample export? (no private data please!)

ve3 commented 4 months ago

> We need some sample data in order to modify the parser. Can anyone supply a sample export? (no private data please!)

{
  "timelineEdits": [{
    "deviceId": "987654321",
    "rawSignal": {
      "signal": {
        "activityRecord": {
          "detectedActivities": [{
            "activityType": "STILL",
            "probability": 1.0
          }],
          "timestamp": "2024-07-06T06:53:29.710Z"
        }
      }
    }
  }, {
    "deviceId": "987654321",
    "rawSignal": {
      "signal": {
        "position": {
          "point": {
            "latE7": 137628680,
            "lngE7": 1006454630
          },
          "accuracyMm": 14063,
          "altitudeMeters": -23.0,
          "source": "WIFI",
          "timestamp": "2024-07-06T05:23:01.340Z",
          "speedMetersPerSecond": 0.0
        }
      },
      "additionalTimestamp": "2024-07-06T05:23:03.532Z"
    }
  }, {
    "deviceId": "987654321",
    "rawSignal": {
      "signal": {
        "wifiScan": {
          "deliveryTime": "2024-07-06T05:23:01.340Z",
          "devices": [{
            "mac": "123456789012345",
            "rawRssi": -43
          }, {
            "mac": "123456789012345",
            "rawRssi": -56
          }, {
            "mac": "123456789012345",
            "rawRssi": -62
          }, {
            "mac": "123456789012345",
            "rawRssi": -65
          }, {
            "mac": "123456789012345",
            "rawRssi": -72
          }, {
            "mac": "123456789012345",
            "rawRssi": -88
          }, {
            "mac": "123456789012345",
            "rawRssi": -92
          }]
        }
      },
      "additionalTimestamp": "2024-07-06T05:23:01.340Z"
    }
  }
  , ....and a lot more
  ]
}

lat is 13.nnn, long is 100.nnn. The wifiScan.devices.mac values are all fake.
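The `latE7` / `lngE7` values are degrees multiplied by 10^7, which matches the 13.nnn / 100.nnn figures above. A rough sketch for pulling positions out of a `timelineEdits` document follows. It assumes every position signal looks like the one in the sample; the function names are illustrative.

```python
def e7_to_degrees(value):
    """Convert an E7 integer coordinate to decimal degrees."""
    return value / 1e7

def edit_positions(data):
    """Collect (timestamp, lat, lon) tuples from timelineEdits position signals."""
    points = []
    for edit in data.get("timelineEdits", []):
        signal = edit.get("rawSignal", {}).get("signal", {})
        position = signal.get("position")
        if position is None:
            continue  # activityRecord, wifiScan and other signal types
        point = position["point"]
        points.append((
            position["timestamp"],
            e7_to_degrees(point["latE7"]),
            e7_to_degrees(point["lngE7"]),
        ))
    return points

# Trimmed copy of the sample above.
sample = {
    "timelineEdits": [
        {"deviceId": "987654321", "rawSignal": {"signal": {"activityRecord": {
            "detectedActivities": [{"activityType": "STILL", "probability": 1.0}],
            "timestamp": "2024-07-06T06:53:29.710Z"}}}},
        {"deviceId": "987654321", "rawSignal": {"signal": {"position": {
            "point": {"latE7": 137628680, "lngE7": 1006454630},
            "timestamp": "2024-07-06T05:23:01.340Z"}}}},
    ]
}

print(edit_positions(sample))
```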

Sorry, but the file is too big and I can't see what the rest of it contains.

ve3 commented 4 months ago

Update: I see that the new Google Timeline uses a new way to export, and here is the instruction.

However, I'm currently stuck on their "Export failed" error 1, 2.

nshores commented 3 months ago

Hi, I'm working on a new, similar app and may just do a PR for this one. I also need a larger set of sample data that uses the new JSON schema; can somebody post a larger sample? Specifically, I'm looking for an example of the visit object. I have 10+ years of data and don't feel like pulling the trigger on my account yet to migrate.

EDIT: Never mind, I was given the data by another user. It's quite unfortunate to see how much data is missing in this new format. For example, the visit object now contains only the placeID (which you can query through the Google Places API to get all the extended info, at a tiny cost).

    "visit" : {
      "probability" : "0.786423",
      "topCandidate" : {
        "probability" : "0.000000",
        "semanticType" : "Unknown",
        "placeID" : "ChIJVdvEAMPHuFQRFXez7RC5rVI",
        "placeLocation" : "geo:44.038332,-121.337457"
      },
      "hierarchyLevel" : "0",
      "isTimelessVisit" : "false"
    }

Previously, it contained a wealth of information:

    "placeVisit": {
      "location": {
        "latitudeE7": 385734472,
        "longitudeE7": -1214795518,
        "placeId": "ChIJwWBVNMPQmoAREfJ3a-Yk9UE",
        "address": "1217 21st Street, Sacramento, CA 95811, USA",
        "name": "Kupros Craft House",
        "semanticType": "TYPE_UNKNOWN",
        "sourceInfo": {
          "deviceTag": -168220190
        },
        "locationConfidence": 93.986595,
        "calibratedProbability": 93.986595
      },
      "duration": {
        "startTimestamp": "2024-07-02T01:11:46Z",
        "endTimestamp": "2024-07-02T02:14:28Z"
      },
      "placeConfidence": "HIGH_CONFIDENCE",
      "visitConfidence": 86,

It's very frustrating because the address and name properties were very useful for what I am building (a searchable place index with categories, etc.). Now I have to do an API call to get that information and cache it locally.
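The "look up once, cache locally" idea could be sketched roughly as follows. This is not tied to any particular client: `fetch` stands in for whatever function calls the Places API for a given placeID, and the class name, the JSON cache file, and the fake fetcher are all assumptions made for illustration.

```python
import json
import os
import tempfile
from pathlib import Path

class PlaceCache:
    """Cache place details on disk, keyed by placeID."""

    def __init__(self, path, fetch):
        self.path = Path(path)
        self.fetch = fetch  # callable: place_id -> dict of details (hypothetical)
        if self.path.exists():
            self.entries = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.entries = {}

    def lookup(self, place_id):
        # Only call the (paid) API for IDs that have not been seen before.
        if place_id not in self.entries:
            self.entries[place_id] = self.fetch(place_id)
            self.path.write_text(json.dumps(self.entries, indent=2), encoding="utf-8")
        return self.entries[place_id]

# Stand-in fetcher that records its calls; a real one would query the API.
calls = []

def fake_fetch(place_id):
    calls.append(place_id)
    return {"name": "Example Place", "address": "1 Example Street"}

cache_path = os.path.join(tempfile.mkdtemp(), "places.json")
cache = PlaceCache(cache_path, fake_fetch)
cache.lookup("ChIJVdvEAMPHuFQRFXez7RC5rVI")
cache.lookup("ChIJVdvEAMPHuFQRFXez7RC5rVI")  # served from the cache
print(len(calls))
```

Because the cache is written to disk after every new lookup, a later run that opens the same file does not repeat earlier API calls.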

@dmd

dmd commented 3 months ago

@nshores sent to your email

nshores commented 3 months ago

Thank you!

Rahmanawan99 commented 2 months ago

Has anyone found a way to get old timeline data back yet, or a solution to "Export failed"?

shugo-chara commented 2 months ago

#8 @ryangriggs I'm also having trouble converting data exported from the new timeline version to other formats. Can you please tell me whether this project will support converting the new version of the JSON (Timeline Edits)?

dpesu commented 1 month ago

@ryangriggs

Do you have any updates about how to use your web app with the new json schema?

Thank you in advance

ryangriggs commented 1 month ago

@dpesu Apologies, but unfortunately I'm busy on other projects and don't have time to modify this one right now to support the new format. Is anyone else interested in participating?

Hamlet87 commented 6 days ago

Any luck on this?

ve3 commented 6 days ago

I've created my project about browsing Google Maps timeline on my PC here. I hope you like it.

Hamlet87 commented 5 days ago

> I've created my project about browsing Google Maps timeline on my PC here. I hope you like it.

Thanks so much, but I have a new-schema JSON file that I need to convert into GPX or KML, and I don't think your project does that.
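For the GPX case, the timelinePath approach described earlier in the thread can be pointed at a GPX 1.1 track instead of a CSV file. The following is a minimal sketch, not a tested converter. It assumes the new-schema file is a top-level array, that points look like `geo:lat,lon`, and that `startTime` carries a UTC offset. The function name and the `creator` string are made up.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

GPX_NS = "http://www.topografix.com/GPX/1/1"

def timeline_to_gpx(entries):
    """Build a GPX 1.1 document with one track segment per timelinePath entry."""
    gpx = ET.Element("gpx", {"version": "1.1", "creator": "timeline-sketch", "xmlns": GPX_NS})
    trk = ET.SubElement(gpx, "trk")
    for entry in entries:
        if "timelinePath" not in entry:
            continue  # visit, activity and timelineMemory entries are skipped
        start = datetime.fromisoformat(entry["startTime"].replace("Z", "+00:00"))
        seg = ET.SubElement(trk, "trkseg")
        for point in entry["timelinePath"]:
            lat, lon = point["point"].split(":")[1].split(",")
            offset = timedelta(minutes=int(point["durationMinutesOffsetFromStartTime"]))
            # GPX times are conventionally UTC, so convert before writing.
            when = (start + offset).astimezone(timezone.utc)
            trkpt = ET.SubElement(seg, "trkpt", {"lat": lat, "lon": lon})
            ET.SubElement(trkpt, "time").text = when.isoformat()
    return ET.tostring(gpx, encoding="unicode")

# Made-up entry in the new schema, for illustration only.
sample = [
    {
        "startTime": "2024-07-02T01:00:00.000-07:00",
        "endTime": "2024-07-02T03:00:00.000-07:00",
        "timelinePath": [
            {"point": "geo:44.038332,-121.337457", "durationMinutesOffsetFromStartTime": "0"},
            {"point": "geo:44.040000,-121.340000", "durationMinutesOffsetFromStartTime": "30"},
        ],
    },
    {"startTime": "2024-07-02T03:00:00.000-07:00", "endTime": "2024-07-02T04:00:00.000-07:00", "visit": {}},
]

print(timeline_to_gpx(sample))
```

To write a file, load the export with `json.load` and save the returned string with a `.gpx` extension. KML would need a different element layout and is not covered here.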