CTHRU / Hitrava

Convert your Huawei Health sport activities and import them in Strava.

Different data retrieved using HiTrava Web and Python version #63

Closed viktorgab closed 3 years ago

viktorgab commented 3 years ago

Hello!

First of all, thank you and congratulations on this amazing tool.

Regarding the issue, I was comparing the activity uploaded from the TCX files and I noticed that when the same activity comes from either the HiTrava Web or the Python tool, it can have different info. Please check the pictures attached.

[Attached screenshots: "From Python" and "From Web"]

Any idea why this happens? Shouldn't it be exactly the same as the data source is the same?

Best regards, Victor.

CTHRU commented 3 years ago

Thanks for your kind words.

It's possible that you see minor differences. I'll try to explain below.

  1. The script version was originally written for the so-called 'HiTrack' data files, which were created on the phone when you looked at an activity in the Huawei Health app. A HiTrack file contains the raw data from your activity, but not the general info (e.g. the exact start/stop time of the activity) that is available in the extended JSON data. The script version still supports this 'legacy' mode for users with rooted phones. To support it, however, the script version contains some code specific to this legacy mode, whereas the web version doesn't.
  2. For the above reason, if you convert using the --file or --tar options, you will probably see a difference compared with the --json or --zip options.
  3. The web version shares the conversion logic with the script version, but it has been rewritten from the ground up in another programming language, since I specifically wanted the conversion to happen on your own computer without the need to send your activity data to the server. The web version does not support the legacy mode, so I was able to leave out any code that relates to it. For example, the raw HiTrack data may contain records before or after the start/stop times from the JSON data, and the web version never processes these.
  4. I would have to look at it again to know for sure, but there might be a slight difference in how data is processed when GPS connection is lost. As far as I remember, it was easier to detect it in the way the web version processes the data (without the legacy code). Since the code needs to rely on speed data (roughly occurring every 5 seconds with a maximum accuracy of 0.1 dm/s), it might also cause some slight differences.
  5. Unfortunately, I haven't released an update of the web version for a while, so it lacks some of the latest corrections based on user feedback on the Python version. The thing is, I was/am developing a version with direct upload to Strava (functional in development and usable for testing, but not yet ready for end users), but due to personal reasons I am unable to finish it right now.
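
To illustrate points 1 to 4, here is a minimal Python sketch of why the same raw records can give different totals depending on whether they are trimmed to the start/stop window. It is not Hitrava's actual code: the record shape, the function names and the sample values are all assumptions made up for this example.

```python
from typing import List, Tuple

# Hypothetical record shape: (timestamp in seconds, speed in dm/s).
# This is NOT the real HiTrack layout, only an illustration.
SpeedRecord = Tuple[int, int]


def trim_to_window(records: List[SpeedRecord], start: int, stop: int) -> List[SpeedRecord]:
    """Keep only the records whose timestamp lies within [start, stop]."""
    return [r for r in records if start <= r[0] <= stop]


def distance_from_speed(records: List[SpeedRecord]) -> float:
    """Estimate distance in metres by holding each speed until the next sample."""
    total_dm = 0
    for (t0, v0), (t1, _v1) in zip(records, records[1:]):
        total_dm += v0 * (t1 - t0)
    return total_dm / 10.0  # 10 dm = 1 m


# Made-up samples roughly 5 seconds apart; the first and last fall
# outside the (assumed) start/stop times taken from the JSON data.
raw = [(95, 20), (100, 30), (105, 30), (110, 30), (115, 25)]
start, stop = 100, 110

untrimmed = distance_from_speed(raw)
trimmed = distance_from_speed(trim_to_window(raw, start, stop))
print(untrimmed, trimmed)
```

With these made-up samples the two totals differ noticeably, simply because one path also counts the records outside the start/stop window.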

I hope this helps. Thanks for reporting this. I'll leave it open so I can have another look at it, but I will first try to get that new web version released.

yihong0618 commented 3 years ago

> I would have to look at it again to know for sure, but there might be a slight difference in how data is processed when GPS connection is lost. As far as I remember, it was easier to detect it in the way the web version processes the data (without the legacy code). Since the code needs to rely on speed data (roughly occurring every 5 seconds with a maximum accuracy of 0.1 dm/s), it might also cause some slight differences.

I have done a lot of work with running data over the past year (running_page), and I think this part is the most likely cause.

By the way, if you upload to Strava, the data will differ from other software because Strava calculates the height itself.

viktorgab commented 3 years ago

@CTHRU thank you for the fast reply and thorough explanation. So if I understood correctly, the Python version is currently the most accurate, right?

@yihong0618 thank you for the insights as well. I don't understand what you mean in your second paragraph, as in both cases the data was extracted from Huawei Health by requesting it from Huawei, converted using either the HiTrava Web or the Python version, and finally uploaded to Strava.

CTHRU commented 3 years ago

To give you a heads-up: I checked whether I could pass the global Huawei numbers into the converted file, but it won't be that straightforward. Whenever a pause or GPS loss occurs during an activity (which happens more often than expected), the intermediate track/distance/time has to be used in separate parts of the converted data. If I just 'fixed' the last part by adding a final distance/time record, it would probably result in messy data at the end of the activity. I think for now I have to leave it as is. Maybe a better alternative will come along the way.
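
As a rough illustration of the 'messy data' problem, the Python sketch below joins two segments separated by a pause into one running distance and then appends a final record carrying a device-reported total. The segment layout, the function name and all numbers are assumptions invented for this example and do not reflect Hitrava's internals.

```python
from typing import List, Tuple

# Hypothetical point shape: (timestamp in seconds, distance in metres
# since the start of the segment). Not the real converted-file layout.
Point = Tuple[int, float]


def cumulative_distances(segments: List[List[Point]]) -> List[Point]:
    """Flatten per-segment distances into one running total."""
    points: List[Point] = []
    offset = 0.0
    for segment in segments:
        for t, d in segment:
            points.append((t, offset + d))
        if segment:
            offset += segment[-1][1]  # carry the segment total forward
    return points


# Two made-up segments with a pause (or GPS loss) between t=10 and t=60.
segments = [
    [(0, 0.0), (5, 15.0), (10, 30.0)],
    [(60, 0.0), (65, 14.0), (70, 29.0)],
]
track = cumulative_distances(segments)

# Assumed device-reported total that is larger than the summed segments.
device_total = 75.0
patched = track + [(71, device_total)]

# Speed implied by the appended record, in m/s.
(t_prev, d_prev), (t_last, d_last) = patched[-2], patched[-1]
implied_speed = (d_last - d_prev) / (t_last - t_prev)
print(track[-1], implied_speed)
```

In this made-up case the appended record implies an implausible burst of speed in the last second, which is the kind of artefact a simple 'final record' fix could introduce.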

CTHRU commented 3 years ago

I might have found a 'fix' to get the distances in sync. If you want, you can try out the script in the 5.1.0 development branch.

Your feedback on whether this would be a viable solution is welcome.