jongough / ocpn_draw_pi

OpenCPN general drawing plug in

Data points missing #488

Closed sngsc6 closed 2 years ago

sngsc6 commented 2 years ago

jongough file.zip

The older file loads correctly; the newer one does not.

For example, the object 'Willem Loressluis' (and lots of others) is included in the newer file but does not show at all, while in the older file it is not a problem.

Hope you can find the reason.

Regards.

jongough commented 2 years ago

What encoding are you using? Currently OD is defaulting to UTF-8 but your files seem to contain characters that are not in this set.
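A quick way to test the encoding question is to try decoding the raw bytes as UTF-8 and see where it fails. The following is only a minimal sketch: the function name `first_invalid_utf8_offset` and the sample string are made up for illustration and are not part of OD or OCPN.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None if data is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# Hypothetical sample text with accented characters, encoded two ways.
sample = "brug één"
as_ansi = sample.encode("cp1252")  # assumption: 'ANSI' means Windows-1252
as_utf8 = sample.encode("utf-8")

ansi_offset = first_invalid_utf8_offset(as_ansi)
utf8_offset = first_invalid_utf8_offset(as_utf8)
```

For a real file, the same function could be called on the result of `open(path, "rb").read()`; a non-`None` result suggests the file was not saved as UTF-8.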

sngsc6 commented 2 years ago

I’ve got no clue; the data comes from the Dutch government and was imported directly into MS Access, where it is processed to get it into the right shape.

I dumped the data (500 of the 875 records in total) into a text file, attached here. I hope you can work out the encoding from it. Encoding test.txt

sngsc6 commented 2 years ago

Sorry, I pushed the wrong button, but I have reopened the issue.

jongough commented 2 years ago

In the file '01-Bedieningstijden - 2021-04-27.gpx' there is at least one duplicate GUID 'operatingtimes-0035198-428f-8c20-c0e103b90000' (lines 5034 & 5069). The GUID is supposed to be unique and will cause issues if it is not.

jongough commented 2 years ago

I have found the following duplicate GUIDs in the 2021-04-27 gpx file:

<opencpn:guid>operatingtimes-0012444-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-0024895-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-0028223-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-0031663-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-0035198-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-0036360-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-0049032-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-0050750-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-0050750-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-0050750-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-12785011-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-7069838-428f-8c20-c0e103b90000</opencpn:guid>

jongough commented 2 years ago

I have done a test by making the GUIDs unique on the duplicated items, and the system appears to work OK.
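For anyone wanting to repeat the duplicate check on their own file, something along these lines may help. It is a rough sketch, not how OD itself reads the file: it simply scans the text for `<opencpn:guid>` elements with a regular expression, and the helper name `duplicate_guids` and the small sample fragment are invented for illustration.

```python
import re
from collections import Counter

# Matches the text inside each <opencpn:guid>...</opencpn:guid> element.
GUID_RE = re.compile(r"<opencpn:guid>(.*?)</opencpn:guid>")


def duplicate_guids(gpx_text: str) -> dict:
    """Return {guid: count} for every GUID that occurs more than once."""
    counts = Counter(GUID_RE.findall(gpx_text))
    return {guid: n for guid, n in counts.items() if n > 1}


# Illustrative fragment only, not the real gpx file.
sample_fragment = """
<opencpn:guid>operatingtimes-0035198-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-0036360-428f-8c20-c0e103b90000</opencpn:guid>
<opencpn:guid>operatingtimes-0035198-428f-8c20-c0e103b90000</opencpn:guid>
"""

found = duplicate_guids(sample_fragment)
```

An empty result means every GUID in the scanned text is unique.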

sngsc6 commented 2 years ago

Hello jongough,

Thanks for evaluating the files; I will correct the duplicate GUIDs.

In the meantime I kept investigating as well, and you pointed me towards the UTF-8 encoding issue. Opening the file in Notepad, saving it as a UTF-8 encoded file, and reloading it into OCPN-Draw solved the problem. The original file had 'ANSI' encoding.

Could it be that your default 'file-save-encoding' is UTF-8, and that the updated file therefore worked correctly?

The reason I ask is that taking out the duplicate GUIDs in the Access database and then creating the gpx file again did not solve the issue; converting that file to UTF-8, however, did.

I'll have to look for a way to write the data from Access to file directly in UTF-8 encoding instead of ANSI.
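Until the export itself writes UTF-8, the Notepad step could probably be scripted. The sketch below assumes that 'ANSI' here means Windows-1252 (`cp1252`), which is typical for a Western European Windows setup but is not confirmed for these files; the function name `reencode_to_utf8` and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def reencode_to_utf8(src, dst, source_encoding="cp1252"):
    """Rewrite the file at src as UTF-8 at dst, assuming src is in source_encoding."""
    raw = Path(src).read_bytes()
    # Strict decoding: raises UnicodeDecodeError on bytes that source_encoding does not define.
    text = raw.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


# Small self-check with a made-up name, not data from the real file.
with tempfile.TemporaryDirectory() as tmp:
    ansi_file = Path(tmp) / "ansi.gpx"
    utf8_file = Path(tmp) / "utf8.gpx"
    ansi_file.write_bytes("<name>Zoë</name>".encode("cp1252"))
    reencode_to_utf8(ansi_file, utf8_file)
    converted = utf8_file.read_bytes()
```

Working on bytes keeps the line endings exactly as they were. If the file's XML header names a different encoding, that declaration would still need updating by hand.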

Thanks for helping me out.

Kind regards

jongough commented 2 years ago

The default encoding of XML files for OD is the same as for OCPN: UTF-8. This should handle all the national characters but, as you say, I think yours was something else. However, without the duplicate GUIDs it does seem to work OK; you would just have to check that the spelling is correct and understandable, as some characters 'may' have been changed.

Even if the encoding is put in the XML header, it will not help if the character set is not UTF-8, as the encoding parameter is only informational. I am afraid that your program will need to create the file in UTF-8 (I think this is a fairly standard encoding and MS products should handle this), or the process is going to have issues. http://www.differencebetween.net/technology/protocols-formats/difference-between-ansi-and-utf-8/

sngsc6 commented 2 years ago

Hi jongough,

I managed to change the output format to UTF-8 encoding and took out the duplicate GUIDs. It now seems that the older file (2021-04-27) had the same problem as well, because it was also ANSI-encoded; the bridges that were not shown were just not (yet) in the area we are sailing.

Thanks again for helping me out.