datasets / un-locode

United Nations Codes for Trade and Transport Locations (UN/LOCODE) and Country Codes
https://datahub.io/core/un-locode

[add][l] Adding updated scripts #29

Open gradedSystem opened 4 days ago

gradedSystem commented 4 days ago

Changes made:

cc @anuveyatsu

sabas commented 3 days ago

@gradedSystem @anuveyatsu this change rewrites the process I built instead of improving it, so I am currently against it. The mdb processing is needed because, as the Readme explains, the csv has encoding problems. See this row from the pull request:

AZ,QAB,Q?b?l?,Rayon

cc @cristan
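For illustration, the `?` characters in the row above are what a lossy re-encoding typically produces. The sketch below assumes the intended name is "Qəbələ" and uses Windows-1252 as a stand-in for whatever legacy step the upstream csv goes through; neither is confirmed in this thread, and `to_legacy` is a hypothetical helper name.

```python
def to_legacy(name, codec="cp1252"):
    """Round-trip a name through a legacy single-byte codec.

    Characters the codec cannot represent are replaced with '?',
    which mimics the damage seen in the csv row.
    Assumption: cp1252 is only a stand-in for the real lossy step.
    """
    return name.encode(codec, errors="replace").decode(codec)


# Assumption: "Qəbələ" is the intended spelling for AZ / QAB.
original = "Qəbələ"
damaged = to_legacy(original)

# A UTF-8 round trip keeps every character intact.
intact = original.encode("utf-8").decode("utf-8")

print(damaged)
print(intact)
```

Once a character has been replaced with `?`, the original letter cannot be recovered from the csv alone, which is why a source that keeps the full characters is needed.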

gradedSystem commented 3 days ago

@sabas is it only for subdivision-codes.csv and code-list.csv?
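One rough way to check which files are affected is to scan each csv for fields containing `?`. This is only a sketch: `suspicious_rows` is a hypothetical helper, the second sample row is made up for illustration, and a literal `?` can of course be legitimate data, so the result is a list of candidates to inspect rather than proof of encoding damage.

```python
import csv
import io


def suspicious_rows(csv_text, marker="?"):
    """Return the rows in which any field contains the marker character."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row for row in reader if any(marker in field for field in row)]


# First row is the one quoted in this thread; second row is a made-up placeholder.
sample = "AZ,QAB,Q?b?l?,Rayon\nXX,AAA,Example,Region\n"

for row in suspicious_rows(sample):
    print(row)
```

To run it against a real file, read the file with an explicit encoding (for example `open(path, encoding="utf-8", newline="")`) and pass its contents to the helper.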

gradedSystem commented 3 days ago

@sabas Oh, now I see what you mean. So it is better to use the .mdb file to retrieve the information, right?

sabas commented 3 days ago

@gradedSystem when I did the first release I only processed the mdb, as I wanted to solve the encoding of accented letters. Last year it was reported that the csv version had an additional column, so I added processing of that one as well. Perhaps there's a version of mdbtools for Python to use?

I wish to fix it upstream, but it's still a long way to go :)
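As a possible starting point, the mdbtools command-line programs can be driven from Python with `subprocess`. The sketch below assumes `mdb-tables` and `mdb-export` are installed and on `PATH`, and that their output is UTF-8; the function names are hypothetical and this is not the script used in this repository.

```python
import csv
import io
import subprocess


def parse_table_list(stdout):
    """Split `mdb-tables -1` output (one table name per line) into a list."""
    return [line.strip() for line in stdout.splitlines() if line.strip()]


def parse_export(stdout):
    """Parse the csv text printed by `mdb-export` into a list of dicts."""
    return list(csv.DictReader(io.StringIO(stdout)))


def list_tables(mdb_path):
    """List the tables in an .mdb file. Requires mdbtools to be installed."""
    result = subprocess.run(
        ["mdb-tables", "-1", mdb_path],
        capture_output=True, encoding="utf-8", check=True,
    )
    return parse_table_list(result.stdout)


def export_table(mdb_path, table):
    """Export one table as a list of row dicts. Requires mdbtools."""
    result = subprocess.run(
        ["mdb-export", mdb_path, table],
        capture_output=True, encoding="utf-8", check=True,
    )
    return parse_export(result.stdout)
```

Usage would look like `for name in list_tables(path): rows = export_table(path, name)`, where `path` points at the downloaded .mdb file. Keeping the parsing separate from the `subprocess` calls makes it possible to test the parsing without mdbtools installed.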

gradedSystem commented 3 days ago

@sabas the current .mdb from https://unece.org/trade/cefact/UNLOCODE-Download does not contain these files:

mv: mdb_CountryCodes_out.csv: No such file or directory
mv: mdb_FunctionClassifiers_out.csv: No such file or directory
mv: mdb_StatusIndicators_out.csv: No such file or directory

It only contains these files:

[Screenshot 2024-10-07 at 15 02 41]

gradedSystem commented 3 days ago

> @gradedSystem when I did the first release I only processed the mdb, as I wanted to solve the encoding of accented letters. Last year it was reported that the csv version had an additional column, so I added processing of that one as well. Perhaps there's a version of mdbtools for Python to use?
>
> I wish to fix it upstream, but it's still a long way to go :)

Regarding this, there is mdb-parser, which I think can be used pretty easily: https://pypi.org/project/mdb-parser/

gradedSystem commented 3 days ago

> @sabas the current .mdb from https://unece.org/trade/cefact/UNLOCODE-Download does not contain these files:
>
> mv: mdb_CountryCodes_out.csv: No such file or directory
> mv: mdb_FunctionClassifiers_out.csv: No such file or directory
> mv: mdb_StatusIndicators_out.csv: No such file or directory
>
> It only contains these files: [Screenshot 2024-10-07 at 15 02 41]

Ignore this one. I have realized that when you open the Code-list, there are 5 files present as tables.

anuveyatsu commented 2 days ago

Hi @sabas, thanks for reviewing this PR! @gradedSystem, have you updated the scripts as necessary? I just want to get this resolved and approved by @sabas.