Closed by martin-ueding 5 months ago
I got a bit of log output which contains the first 100 bytes of the file. They are the following:
>>> b = b'\x00\x05\x16\x07\x00\x02\x00\x00Mac OS X \x00\x02\x00\x00\x00\t\x00\x00\x002\x00\x00\x0e\xb0\x00\x00\x00\x02\x00\x00\x0e\xe2\x00\x00\x01\x1e\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00ATTR\xff\xff\xef\x17\x00\x00\x0e\xe2\x00\x00\x00\x98'
We can then take a look and try to detect the character encoding:
>>> import chardet
>>> chardet.detect(b)
{'encoding': 'Windows-1252', 'confidence': 0.73, 'language': ''}
That means that this could be the Windows-1252 encoding, although a confidence of 0.73 is not conclusive. The user said that the files got converted many times, so there may even be irrecoverable encoding errors that have garbled the data.
As these are GPX files, the data of interest will be in the ASCII range and should therefore be fine with almost any ASCII-compatible encoding. So perhaps that will work out even if we don't pick exactly the right code page.
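To illustrate that reasoning, here is a minimal sketch of a lenient decoder. It is not the code from the project; the function name `decode_gpx_bytes` and the choice of Windows-1252 as the fallback are assumptions made for this example. It tries UTF-8 first and otherwise falls back to a single-byte code page, replacing undecodable bytes, so that the ASCII parts come through unchanged.

```python
def decode_gpx_bytes(data: bytes) -> str:
    """Decode GPX bytes leniently (hypothetical helper for illustration).

    UTF-8 is tried first. If that fails, Windows-1252 is used with
    errors="replace", so bytes that are undefined in that code page
    become U+FFFD instead of raising an exception.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("cp1252", errors="replace")


# A made-up snippet with one non-UTF-8 byte (0xE9) inside a name.
sample = b'<trkpt lat="50.7" lon="7.1"><name>Caf\xe9</name></trkpt>'
decoded = decode_gpx_bytes(sample)
```

The coordinates in the ASCII range are unaffected by the fallback; only the non-ASCII character in the name depends on guessing the right code page.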
Version 0.17.4 contains some experimental code for that.
I've let the program emit the first 1000 bytes into the log. And there we find the string com.apple.quarantine. So we have some Apple-specific feature active here. The interesting thing is that the file name is Activities/._route_2023-01-17_5.05pm.gpx, so it seems to be some hidden file. I'm not sure what this means exactly. Is there a file Activities/route_2023-01-17_5.05pm.gpx which can be read just fine? Or is that one broken as well? I've asked the user to test a bit more.
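For what it's worth, the bytes in the log start with 00 05 16 07, which as far as I know is the magic number of the AppleDouble format that macOS uses for its "._" companion files. If that is right, the file is metadata and not GPX at all. A minimal sketch of such a check follows; the names `APPLEDOUBLE_MAGIC` and `looks_like_appledouble` are made up for this example and are not part of the project.

```python
# Magic number that AppleDouble files are assumed to start with.
APPLEDOUBLE_MAGIC = bytes.fromhex("00051607")


def looks_like_appledouble(head: bytes) -> bool:
    """Return True if the leading bytes match the AppleDouble magic number."""
    return head.startswith(APPLEDOUBLE_MAGIC)


# The beginning of the bytes from the log output above.
log_head = b"\x00\x05\x16\x07\x00\x02\x00\x00Mac OS X "
is_sidecar = looks_like_appledouble(log_head)
```

A check like this looks at the content instead of the file name, so it would also catch such a file if it had been renamed.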
As the quarantine files start with a period, we can just skip those. That should make it more robust.
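A minimal sketch of that filter, assuming the activity files are available as a list of path strings. The helper names `is_hidden_file` and `visible_activity_paths` are hypothetical and not taken from the project.

```python
from pathlib import PurePosixPath


def is_hidden_file(path: str) -> bool:
    """Return True if the last path component starts with a period."""
    return PurePosixPath(path).name.startswith(".")


def visible_activity_paths(paths: list[str]) -> list[str]:
    """Keep only the paths whose file name does not start with a period."""
    return [p for p in paths if not is_hidden_file(p)]


paths = [
    "Activities/._route_2023-01-17_5.05pm.gpx",
    "Activities/route_2023-01-17_5.05pm.gpx",
]
kept = visible_activity_paths(paths)
```

Only the file name is checked, not the directories above it, so a hidden parent directory would not cause files to be skipped.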
A macOS user has trouble opening GPX files. They have sent me the file and I can open it on Linux. There is something weird going on. This is an example traceback:
In order to diagnose this further, I've added a bit more logging.