Closed PureNatural closed 1 year ago
I have modified the countries of two existing companies.
1) jina_ai (China -> Germany): the company website shows that its headquarters is located in Berlin.
2) openresty (China -> United States): Yichun Zhang is the president and CEO of OpenResty Inc., and I found that OpenResty Inc. is located in the United States.
Great work. The founders of JINA AI and OpenResty Inc. are both Chinese, but the companies are indeed registered abroad; that was my mistake.
And since we adjust country names in this commit, I think we should decide which standard we use to name the countries and how to set the file names.
I suggest we follow ISO 3166-1 which is a standard for country codes and names.
In this standard, there is no "South Korea" but "Korea, Republic of" to represent South Korea, so we may use korea_republic_of.yml as the file name. And for all the names field in the YAML files, we should use exactly the short English name of each country. If so, we can also add this to the documentation so developers can understand how this works.
WDYT?
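As a rough illustration of the naming proposal above, the sketch below derives a file name from a country's short English name. The helper `file_name_from_short_name` is hypothetical and not part of the repository; it only assumes that file names are lowercase, underscore-separated, and end in `.yml`, as in the korea_republic_of.yml example.

```python
import re


def file_name_from_short_name(short_name: str) -> str:
    """Turn a short English country name into a label file name.

    Hypothetical helper: lowercases the name, collapses every run of
    non-alphanumeric characters into one underscore, and appends ".yml".
    """
    slug = re.sub(r"[^a-z0-9]+", "_", short_name.lower()).strip("_")
    return f"{slug}.yml"


print(file_name_from_short_name("Korea, Republic of"))
```

With this rule, punctuation such as the comma in "Korea, Republic of" never reaches the file system, which keeps file names predictable.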
I agree with @frank-zsy. Using ISO 3166-1 alpha-2 codes as file names seems to be a good choice.
The International Standard for country codes and codes for their subdivisions
The purpose of ISO 3166 is to define internationally recognized codes of letters and/or numbers that we can use when we refer to countries and their subdivisions. However, it does not define the names of countries – this information comes from United Nations sources (Terminology Bulletin Country Names and the Country and Region Codes for Statistical Use maintained by the United Nations Statistics Divisions).
Using codes saves time and avoids errors as instead of using a country’s name (which will change depending on the language being used), we can use a combination of letters and/or numbers that are understood all over the world. ...
The above content is excerpted from https://www.iso.org/iso-3166-country-codes.html.
Since the names of countries (or other types of regional division) may cause unknown issues, using letter codes as file names can eliminate many additional responsibilities. Maintaining name fields inside files (or even a versioned ISO 3166 mapping table) is easier than maintaining file names, and we can focus more on labelling than on naming and regional disputes.
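To make the mapping-table idea concrete, here is a minimal sketch, assuming alpha-2 codes are used as file names and the display name lives in a table. The table `ISO_3166_1_ALPHA2` and both helpers are hypothetical, and the names in the table are illustrative; they should be checked against the official ISO 3166-1 list before use.

```python
# Hypothetical excerpt of an alpha-2 -> name table. The names are
# illustrative and must be verified against the official ISO 3166-1 list.
ISO_3166_1_ALPHA2 = {
    "CN": "China",
    "DE": "Germany",
    "KR": "Korea, Republic of",
    "US": "United States",
}


def label_file_name(alpha2: str) -> str:
    """Return the label file name for a known alpha-2 code, e.g. "kr.yml"."""
    code = alpha2.strip().upper()
    if code not in ISO_3166_1_ALPHA2:
        raise KeyError(f"unknown alpha-2 code: {alpha2!r}")
    return f"{code.lower()}.yml"


def display_name(alpha2: str) -> str:
    """Look up the name to store in the YAML name field."""
    return ISO_3166_1_ALPHA2[alpha2.strip().upper()]


print(label_file_name("KR"), display_name("KR"))
```

Under this layout a renamed country only changes one table entry, while the file name stays stable.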
In addition, the Country of Origin fields on dbdb.io (e.g. https://www.dbdb.io/db/mariadb) also use ISO 3166-1 alpha-2 codes. I collected the features here. Alpha-2 codes make it convenient to label and filter multi-labelled records.
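The filtering point can be sketched as follows, assuming each record carries a list of alpha-2 codes. The record layout, the `countries` key, and the `example_multi` entry are hypothetical; only the jina_ai and openresty assignments come from this PR.

```python
# Hypothetical records; "countries" holds ISO 3166-1 alpha-2 codes.
records = [
    {"name": "jina_ai", "countries": ["DE"]},
    {"name": "openresty", "countries": ["US"]},
    {"name": "example_multi", "countries": ["CN", "US"]},  # made-up entry
]


def filter_by_country(items, alpha2):
    """Return the names of all records labelled with the given alpha-2 code."""
    code = alpha2.strip().upper()
    return [item["name"] for item in items if code in item["countries"]]


print(filter_by_country(records, "US"))
```

Because a record can hold several codes, a simple membership test covers multi-labelled entries without any name matching.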
I agree with @birdflyi. We should focus more on labeling than naming and disputes of regions.
And for all the names fields in the YAML files, we should use exactly the short English name of each country.
If someone wants to know the name of a country, they can get it in the YAML file.
I think the result is quite good now, is this PR ready to merge? @PureNatural
Yes, it can be merged.
/approve
However, this PR may require other changes, for example to the cron task, which depends on the region label data.
There are 137 companies in this dir.
I have added country information to these companies by querying Wikipedia or the company website.
This data may support the big screen. @zhicheng-ning
I have changed "America" to "United_States", because most map APIs, such as Bing Maps, name the country the United States. I think this will facilitate the map visualization work. The first letter of each word in a country name also needs to be capitalized.
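A minimal sketch of this normalization, assuming country values are underscore-separated with each word capitalized. The `ALIASES` table and `normalize_country` helper are hypothetical and not taken from the repository.

```python
# Hypothetical alias table: maps informal names to the agreed label.
ALIASES = {"america": "United_States"}


def normalize_country(raw: str) -> str:
    """Normalize a country value: apply aliases, then capitalize each word."""
    words = raw.strip().replace("_", " ").split()
    key = "_".join(word.lower() for word in words)
    if key in ALIASES:
        return ALIASES[key]
    # str.capitalize upper-cases the first letter and lower-cases the rest.
    return "_".join(word.capitalize() for word in words)


print(normalize_country("America"))
```

Running existing values through one function like this keeps the labels consistent with what map APIs expect.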