Closed fabiosangregorio closed 1 year ago
Hey @fabiosangregorio !
It looks a lot like #4, which was caused by malformed XML. I still have no clue why this is happening, and I've never been able to reproduce it on my side, despite a lot of exports.
As a first step, could you load the file in an XML parser to verify its validity? I'll keep the issue open, keep me posted 😄 !
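For reference, here is a minimal sketch of one way to run that check with Python's standard library. The function name check_well_formed is only illustrative and is not part of the ingester; it tests well-formedness only and reports where the parser gave up.

```python
import xml.etree.ElementTree as ET


def check_well_formed(path):
    """Return (True, None) if the file parses, else (False, (message, line, column)).

    Hypothetical helper for illustration; it is not part of the ingester.
    """
    try:
        # iterparse streams the file, so a large export is not loaded at once.
        for _event, elem in ET.iterparse(path, events=("end",)):
            elem.clear()  # drop parsed elements to keep memory use low
    except ET.ParseError as err:
        line, column = err.position
        return False, (str(err), line, column)
    return True, None


if __name__ == "__main__":
    import sys

    ok, info = check_well_formed(sys.argv[1])
    print("well-formed" if ok else f"parse error: {info}")
```

Running it against export.xml should at least tell whether the problem sits in the DOCTYPE header or further down in the records.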
Yep, looks like it!
Here are the culprit lines according to the screenshot:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE HealthData [
<!-- HealthKit Export Version: 12 -->
<!ELEMENT HealthData (ExportDate,Me,(Record|Correlation|Workout|ActivitySummary|ClinicalRecord|VisionPrescription)*)>
<!ATTLIST HealthData
locale CDATA #REQUIRED
>
<!ELEMENT ExportDate EMPTY>
<!ATTLIST ExportDate
value CDATA #REQUIRED
>
<!ELEMENT Me EMPTY>
<!ATTLIST Me
HKCharacteristicTypeIdentifierDateOfBirth CDATA #REQUIRED
HKCharacteristicTypeIdentifierBiologicalSex CDATA #REQUIRED
HKCharacteristicTypeIdentifierBloodType CDATA #REQUIRED
HKCharacteristicTypeIdentifierFitzpatrickSkinType CDATA #REQUIRED
>
<!ELEMENT Record ((MetadataEntry|HeartRateVariabilityMetadataList)*)>
<!ATTLIST Record
type CDATA #REQUIRED
unit CDATA #IMPLIED
value CDATA #IMPLIED
sourceName CDATA #REQUIRED
sourceVersion CDATA #IMPLIED
device CDATA #IMPLIED
creationDate CDATA #IMPLIED
startDate CDATA #REQUIRED
endDate CDATA #REQUIRED
>
<!-- Note: Any Records that appear as children of a correlation also appear as top-level records in this document. -->
<!ELEMENT Correlation ((MetadataEntry|Record)*)>
<!ATTLIST Correlation
type CDATA #REQUIRED
sourceName CDATA #REQUIRED
sourceVersion CDATA #IMPLIED
device CDATA #IMPLIED
creationDate CDATA #IMPLIED
startDate CDATA #REQUIRED
endDate CDATA #REQUIRED
>
<!ELEMENT Workout ((MetadataEntry|WorkoutEvent|WorkoutRoute)*)>
<!ATTLIST Workout
workoutActivityType CDATA #REQUIRED
duration CDATA #IMPLIED
durationUnit CDATA #IMPLIED
totalDistance CDATA #IMPLIED
totalDistanceUnit CDATA #IMPLIED
totalEnergyBurned CDATA #IMPLIED
totalEnergyBurnedUnit CDATA #IMPLIED
sourceName CDATA #REQUIRED
sourceVersion CDATA #IMPLIED
device CDATA #IMPLIED
creationDate CDATA #IMPLIED
startDate CDATA #REQUIRED
endDate CDATA #REQUIRED
>
<!ELEMENT WorkoutActivity EMPTY>
<!ATTLIST WorkoutActivity
uuid CDATA #REQUIRED
startDate CDATA #REQUIRED
endDate CDATA #IMPLIED
duration CDATA #IMPLIED
durationUnit CDATA #IMPLIED
>
<!ELEMENT WorkoutEvent EMPTY>
<!ATTLIST WorkoutEvent
type CDATA #REQUIRED
date CDATA #REQUIRED
duration CDATA #IMPLIED
durationUnit CDATA #IMPLIED
>
<!ELEMENT WorkoutStatistics EMPTY>
<!ATTLIST WorkoutStatistics
type CDATA #REQUIRED
startDate CDATA #REQUIRED
endDate CDATA #REQUIRED
average CDATA #IMPLIED
minimum CDATA #IMPLIED
maximum CDATA #IMPLIED
sum CDATA #IMPLIED
>
<!ELEMENT WorkoutRoute ((MetadataEntry|FileReference)*)>
<!ATTLIST WorkoutRoute
sourceName CDATA #REQUIRED
sourceVersion CDATA #IMPLIED
device CDATA #IMPLIED
creationDate CDATA #IMPLIED
startDate CDATA #REQUIRED
endDate CDATA #REQUIRED
>
<!ELEMENT FileReference EMPTY>
<!ATTLIST FileReference
path CDATA #REQUIRED
>
<!ELEMENT ActivitySummary EMPTY>
<!ATTLIST ActivitySummary
dateComponents CDATA #IMPLIED
activeEnergyBurned CDATA #IMPLIED
activeEnergyBurnedGoal CDATA #IMPLIED
activeEnergyBurnedUnit CDATA #IMPLIED
appleMoveTime CDATA #IMPLIED
appleMoveTimeGoal CDATA #IMPLIED
appleExerciseTime CDATA #IMPLIED
appleExerciseTimeGoal CDATA #IMPLIED
appleStandHours CDATA #IMPLIED
appleStandHoursGoal CDATA #IMPLIED
>
<!ELEMENT MetadataEntry EMPTY>
<!ATTLIST MetadataEntry
key CDATA #REQUIRED
value CDATA #REQUIRED
>
<!-- Note: Heart Rate Variability records captured by Apple Watch may include an associated list of instantaneous beats-per-minute readings. -->
<!ELEMENT HeartRateVariabilityMetadataList (InstantaneousBeatsPerMinute*)>
<!ELEMENT InstantaneousBeatsPerMinute EMPTY>
<!ATTLIST InstantaneousBeatsPerMinute
bpm CDATA #REQUIRED
time CDATA #REQUIRED
>
<!ELEMENT ClinicalRecord EMPTY>
<!ATTLIST ClinicalRecord
type CDATA #REQUIRED
identifier CDATA #REQUIRED
sourceName CDATA #REQUIRED
sourceURL CDATA #REQUIRED
fhirVersion CDATA #REQUIRED
receivedDate CDATA #REQUIRED
resourceFilePath CDATA #REQUIRED
>
<!ELEMENT Audiogram EMPTY>
<!ATTLIST Audiogram
type CDATA #REQUIRED
sourceName CDATA #REQUIRED
sourceVersion CDATA #IMPLIED
device CDATA #IMPLIED
creationDate CDATA #IMPLIED
startDate CDATA #REQUIRED
endDate CDATA #REQUIRED
>
<!ELEMENT SensitivityPoint EMPTY>
<!ATTLIST SensitivityPoint
frequencyValue CDATA #REQUIRED
frequencyUnit CDATA #REQUIRED
leftEarValue CDATA #IMPLIED
leftEarUnit CDATA #IMPLIED
rightEarValue CDATA #IMPLIED
rightEarUnit CDATA #IMPLIED
>
<!ELEMENT VisionPrescription EMPTY>
<!ATTLIST VisionPrescription
type CDATA #REQUIRED
dateIssued CDATA #REQUIRED
expirationDate CDATA #REQUIRED
brand CDATA #IMPLIED
<!ELEMENT RightEye EMPTY>
<!ATTLIST RightEye
sphere CDATA #IMPLIED
sphereUnit CDATA #IMPLIED
cylinder CDATA #IMPLIED
cylinderUnit CDATA #IMPLIED
axis CDATA #IMPLIED
axisUnit CDATA #IMPLIED
add CDATA #IMPLIED
addUnit CDATA #IMPLIED
vertex CDATA #IMPLIED
vertexUnit CDATA #IMPLIED
prismAmount CDATA #IMPLIED
prismAmountUnit CDATA #IMPLIED
prismAngle CDATA #IMPLIED
prismAngleUnit CDATA #IMPLIED
farPD CDATA #IMPLIED
farPDUnit CDATA #IMPLIED
nearPD CDATA #IMPLIED
nearPDUnit CDATA #IMPLIED
baseCurve CDATA #IMPLIED
baseCurveUnit CDATA #IMPLIED
diameter CDATA #IMPLIED
diameterUnit CDATA #IMPLIED
>
<!ELEMENT LeftEye EMPTY>
<!ATTLIST LeftEye
sphere CDATA #IMPLIED
sphereUnit CDATA #IMPLIED
cylinder CDATA #IMPLIED
cylinderUnit CDATA #IMPLIED
axis CDATA #IMPLIED
axisUnit CDATA #IMPLIED
add CDATA #IMPLIED
addUnit CDATA #IMPLIED
vertex CDATA #IMPLIED
vertexUnit CDATA #IMPLIED
prismAmount CDATA #IMPLIED
prismAmountUnit CDATA #IMPLIED
prismAngle CDATA #IMPLIED
prismAngleUnit CDATA #IMPLIED
farPD CDATA #IMPLIED
farPDUnit CDATA #IMPLIED
nearPD CDATA #IMPLIED
nearPDUnit CDATA #IMPLIED
baseCurve CDATA #IMPLIED
baseCurveUnit CDATA #IMPLIED
diameter CDATA #IMPLIED
diameterUnit CDATA #IMPLIED
>
device CDATA #IMPLIED
<!ELEMENT MetadataEntry EMPTY>
<!ATTLIST MetadataEntry
key CDATA #IMPLIED
value CDATA #IMPLIED
>
>
]>
The last few lines look weird:
device CDATA #IMPLIED
<!ELEMENT MetadataEntry EMPTY>
<!ATTLIST MetadataEntry
key CDATA #IMPLIED
value CDATA #IMPLIED
>
>
]>
The only explanation is that the XML exporting process on the iPhone is borked and sometimes doesn't produce a valid XML file. Unfortunately there is nothing I can do to prevent it.
I'll check if the xml module allows reading invalid files, because from the snippet you sent, the error is located before the health records. If the module can skip this and start reading the next section directly, it might work around this issue.
I'll keep the issue open until I have more info on the matter. Thx a lot for reporting 👍 !
Hey @fabiosangregorio !
I released v0.0.5, which should handle any malformed XML like yours. I've done some tests emulating your file, but could you test it on your side too before closing the issue?
Just do a quick docker-compose pull to grab the latest image (or change the tag to :v0.0.5) before launching the ingester.
Thx for your help !
Hi @k0rventen! Now the ingester doesn't crash but I get the following output:
Opening Route 2021-10-19 4:43pm
apple-health-grafana-ingester-1 | Opening Route 2022-11-18 8:07am
apple-health-grafana-ingester-1 | Opening Route 2021-04-20 7:14pm
apple-health-grafana-ingester-1 | Export file is /export/apple_health_export/export.xml
apple-health-grafana-ingester-1 | Total number of records: 0
apple-health-grafana-ingester-1 | All done! You can now check grafana.
(and Grafana shows no data).
I also tried:
😢
I reproduced the same behaviour..
Depending on how malformed the XML is, sometimes lxml is able to reconstruct and parse it properly, and sometimes it can't.
I guess the last resort is to discard the first section altogether and start reading from the HealthData section onwards. I'll see what I can do to resolve this problem once and for all, and will update this issue.
Looks like it's a known issue, I'll try to follow the steps there and see if it fixes the XML. If it does, maybe you could add a preprocessing step to the XML before using it in the ingester 👀
I played around this morning, and by discarding the whole first section the XML appears valid and lxml can parse it.
I've added a (not very clean but hey it might work) step doing that after unzipping the file.
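To illustrate the idea, here is a rough sketch of what such a preprocessing step could look like. It is not necessarily what the ingester does: the function name strip_doctype is made up for this example, and it assumes the header starts on a line beginning with <!DOCTYPE and ends on a line ending with ]>, as in the snippet posted above.

```python
def strip_doctype(src_path, dst_path):
    """Copy src to dst, dropping every line from '<!DOCTYPE' through the closing ']>'.

    Hypothetical helper for illustration. Assumes the DOCTYPE block ends on a
    line ending with ']>'; if that line is missing, the rest of the file is dropped.
    Returns the number of lines removed.
    """
    in_doctype = False
    removed = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            stripped = line.strip()
            if not in_doctype and stripped.startswith("<!DOCTYPE"):
                in_doctype = True
            if in_doctype:
                removed += 1
                if stripped.endswith("]>"):
                    in_doctype = False  # header is over, resume copying
                continue
            dst.write(line)
    return removed
```

Working line by line keeps memory use flat even for a very large export, and the XML declaration and everything from the HealthData element onwards are copied untouched.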
If you could test with your original export.xml that would be great!
To grab the image with the fix, change the ingester image in the docker-compose.yml file to:
ingester:
  image: k0rventen/apple-health-grafana-ingester:rolling
Then docker-compose pull to make sure you have the latest one before launching the ingester.
Please test it out and report back how it went, I'm hoping this will work 🤞
Amazing, it works! 😍 Thanks a lot 🙏🏻
Awesome! I'll clean up a bit and make a proper release.
Thx for your help !
Hi! First of all, thanks for the project, that's amazing! ❤️
I'm running into a crash on the ingester when importing my health data. Please find the logs below:
Would you be able to take a look please?