Open n8henrie opened 3 years ago
This PR allows my import to succeed.
It looks like some events don't have an `id`, but do have `HKExternalUUID` (which gets turned into `metadata_HKExternalUUID`), so I use this as a fallback.
If a record has neither of these, I changed it to just print the record (for debugging) and `return`.
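The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: `pick_workout_id` and `workout_to_row` are hypothetical names, and the only keys taken from the thread are `id` and `metadata_HKExternalUUID`.

```python
def pick_workout_id(record):
    """Return a primary-key value for a workout record, or None if it has none."""
    # Prefer the record's own id; otherwise fall back to the flattened
    # HKExternalUUID metadata key mentioned in the thread.
    for key in ("id", "metadata_HKExternalUUID"):
        value = record.get(key)
        if value:
            return value
    return None


def workout_to_row(record):
    """Hypothetical helper: attach an id, or print the record and skip it."""
    pk = pick_workout_id(record)
    if pk is None:
        print(record)  # for debugging: shows which records get skipped
        return None
    return {**record, "id": pk}
```

A record with neither key is printed and skipped rather than raising, which is why the import can run to the end of the file.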
For some odd reason this ran fine at first, and now (after removing the generated db and trying again) I'm getting a different error (duplicate column name).
Looks like it may have run when I had two successive runs without remembering to delete the db in between. Will try to refactor.
Ok, new PR works. I'm not `git` enough so I just force-pushed over the old one.
I still end up with a lot of activities that are missing an `id` and are therefore skipped (since this is used as the primary key). For example:
{'workoutActivityType': 'HKWorkoutActivityTypeRunning', 'duration': '35.31666666666667', 'durationUnit': 'min', 'totalDistance': '4.010870267636999', 'totalDistanceUnit': 'mi', 'totalEnergyBurned': '660.3516235351562', 'totalEnergyBurnedUnit': 'Cal', 'sourceName': 'Strava', 'sourceVersion': '22810', 'creationDate': '2020-07-16 13:38:26 -0700', 'startDate': '2020-07-16 06:38:26 -0700', 'endDate': '2020-07-16 07:13:45 -0700'}
I also end up with some unhappy characters (in the skipped events), such as `'sourceName': 'Nathan’s Apple\xa0Watch'`.
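For what it's worth, `\xa0` is how Python's `repr` displays a non-breaking space (U+00A0), so the value may well be intact and only look odd when a dict is printed. A small sketch, assuming one wanted to fold such characters into plain spaces before storing them; `normalize_spaces` is a hypothetical helper, not part of healthkit-to-sqlite:

```python
import unicodedata

source_name = "Nathan’s Apple\xa0Watch"


def normalize_spaces(text):
    """Apply NFKC normalization, which maps U+00A0 to an ordinary space."""
    return unicodedata.normalize("NFKC", text)


# Printing the string shows the name normally; printing its repr
# (as happens when a whole dict is printed) shows the \xa0 escape.
print(source_name)
print(repr(source_name))
print(repr(normalize_spaces(source_name)))
```

The curly apostrophe is left untouched by NFKC; only the non-breaking space changes.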
But it's successfully making it through the file, and the resulting db opens in datasette, so I'd call that progress.
I have this issue, too, with a fresh export. None of my `Workout` entries in `export.xml` have an `id` key, though the sample `export.xml` in the tests folder doesn’t either, so I don’t think this is the culprit. Indeed, it seems @simonw is using the `hash_id` option from `sqlite_utils`, which creates a column (`id`, in this case) based on a hash of the row’s contents.
When I run the script, a `workouts` table is created, with one entry: my first workout. No `workout_points` table is created, as I’d expect from `utils.py`. I then get essentially the same error as noted in this thread:
Traceback (most recent call last):
  File "/Users/lchski/.pyenv/versions/3.10.3/bin/healthkit-to-sqlite", line 8, in <module>
    sys.exit(cli())
  File "/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/healthkit_to_sqlite/cli.py", line 57, in cli
    convert_xml_to_sqlite(fp, db, progress_callback=bar.update, zipfile=zf)
  File "/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/healthkit_to_sqlite/utils.py", line 34, in convert_xml_to_sqlite
    workout_to_db(el, db, zipfile)
  File "/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/healthkit_to_sqlite/utils.py", line 57, in workout_to_db
    pk = db["workouts"].insert(record, alter=True, hash_id="id").last_pk
  File "/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/sqlite_utils/db.py", line 2822, in insert
    return self.insert_all(
  File "/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/sqlite_utils/db.py", line 2950, in insert_all
    self.insert_chunk(
  File "/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/sqlite_utils/db.py", line 2715, in insert_chunk
    result = self.db.execute(query, params)
  File "/Users/lchski/.pyenv/versions/3.10.3/lib/python3.10/site-packages/sqlite_utils/db.py", line 458, in execute
    return self.conn.execute(sql, parameters)
sqlite3.IntegrityError: UNIQUE constraint failed: workouts.id
Are there maybe duplicate workouts in the data, which would cause multiple rows to share the same `id`? It’s strange, though, that no `workout_points` table is created at all. Export created from iOS 15.3.1.
Update: this appears to be caused by running the command twice without clearing the DB in between. The second run tries to insert a Workout that already exists, causing a collision on the (auto-generated) `id` column. I had a different error with a clean DB, likely due to the workout points format; will make a new issue for that.
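The double-run collision can be reproduced with nothing but the standard `sqlite3` module. This is a sketch under assumptions: the two-column `workouts` schema, the `import_workouts` helper, and the `INSERT OR IGNORE` workaround are illustrative and are not what healthkit-to-sqlite itself does.

```python
import sqlite3


def import_workouts(conn, rows, ignore_existing=False):
    """Insert (id, sourceName) rows; optionally skip ids that already exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS workouts (id TEXT PRIMARY KEY, sourceName TEXT)"
    )
    verb = "INSERT OR IGNORE" if ignore_existing else "INSERT"
    with conn:  # commit on success, roll back on error
        conn.executemany(
            f"{verb} INTO workouts (id, sourceName) VALUES (?, ?)", rows
        )


conn = sqlite3.connect(":memory:")
rows = [("abc123", "Strava")]

import_workouts(conn, rows)  # first run succeeds
try:
    import_workouts(conn, rows)  # second run against the same DB collides
    second_run_error = None
except sqlite3.IntegrityError as exc:
    second_run_error = str(exc)

# Re-running with ignore_existing=True leaves the existing row in place.
import_workouts(conn, rows, ignore_existing=True)
row_count = conn.execute("SELECT COUNT(*) FROM workouts").fetchone()[0]
```

Deleting the generated database between runs, as noted above, avoids the collision without any code change.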
@simonw any chance to get this fixed soon?
I'm getting an error on my initial attempt to import data: