EticaAI / lexicographi-sine-finibus

Lexicographī sine fīnibus
The Unlicense

Organization strategy to deal with numeric namespace of dictionaries which are handled at administrative level #39

Open fititnt opened 2 years ago

fititnt commented 2 years ago

Context

Currently, we compile dictionaries that are mostly intended for international use. The only country/territory-level ones come from one of the early versions, https://github.com/EticaAI/multilingual-lexicography/issues/2, to which we did not apply any additional data export beyond basic HXL (and, in any case, they are based on an outdated version, mostly intended for testing how to organize over 100 countries).

Potential approaches (need more testing)

One potential approach: for every sublevel after the entrypoint [1603], we always reserve a number (likely based on UN M49) for dictionaries related to the administration of the upper level.

However, if we take this approach, we need to think about what to do if a region within an upper administrative region decides to publish dictionaries. Even if we do not publish such dictionaries ourselves (at least not at global level), we may want to at least reserve a second namespace intended to require two intermediate codes (the first being the UN M49 entrypoint, and the second being UN P49, but without the country prefix). So every country would require at least two numbers always reserved at the second level under [1603].
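
To make the first idea concrete, here is a minimal sketch of composing a reserved code from the entrypoint and a UN M49 number. The function name and the `:` separator are assumptions made only for illustration; the issue does not fix a notation. Brazil's UN M49 code is 076.

```python
import re


def country_namespace(entrypoint: int, m49: str) -> str:
    """Compose a hypothetical code for the slot reserved for one country.

    The ':' separator is an assumption for this sketch, not a project rule.
    """
    # UN M49 codes are written as exactly three ASCII digits, e.g. "076".
    if not re.fullmatch(r"[0-9]{3}", m49):
        raise ValueError(f"expected a three-digit UN M49 code, got {m49!r}")
    # Drop leading zeros so the level reads as a plain number.
    return f"{entrypoint}:{int(m49)}"


print(country_namespace(1603, "076"))  # 1603:76
```

Whether leading zeros should be kept is an open design choice; dropping them keeps every level a plain integer, but the code no longer matches the M49 code character for character.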

Use case

Notes:

Brazil

Let's use Brazil as a use case. One potential approach (which

Under this logic



fititnt commented 2 years ago

Update: testing giving an entire namespace prefix side by side instead of creating very deep semantics. Using [1679] (instead of the [1603] prefix for this

TSV dump of potential candidates with already P attributes on Wikidata (obviously incomplete)

| ID | label | description | aliases | Data type | Count |
| --- | --- | --- | --- | --- | --- |
| P1585 | Brazilian municipality code | identifier for municipalities in Brazil | IBGE code | ExternalId | 5,570 |
| P3216 | ClassInd rating | rating of an audiovisual work, video game or RPG in the Brazilian Advisory Rating System (ClassInd) | BARS | WikibaseItem | 2,063 |
| P4060 | Brazilian Olympic Committee athlete ID | identifier for a Brazilian athlete at the Brazilian Olympic Committee (Portuguese: Comitê Olímpico do Brasil) website | BOC athlete ID, COB athlete ID, Comitê Olímpico do Brasil athlete ID | ExternalId | 757 |
| P4251 | TSE number | number assigned by the Brazilian Superior Electoral Court to registered political parties | | ExternalId | 36 |
| P4344 | QEdu ID | identifier for a Brazilian educational institution, in the QEdu database | | ExternalId | 46 |
| P4351 | Cravo Albin artist ID | identifier for an artist or group, in the Cravo Albin Dictionary of Brazilian Popular Music | Cravo Albin, Cravo Albin ID | ExternalId | 104 |
| P4360 | Monumentos de São Paulo ID | identifier for a monument in São Paulo city, Brazil, on the Monumentos de São Paulo website | | ExternalId | 481 |
| P4372 | iPatrimônio ID | identifer for a Brazilian historical Heritage asset, in the Infopatrimônio database | Infopatrimônio ID | ExternalId | 5,102 |
| P4393 | Anvisa drug ID | identifier provided by Anvisa, a regulatory body of the Brazilian government responsible for the regulation and approval of pharmaceutical drugs, sanitary standards and regulation of the food industry | | ExternalId | 0 |
| P4399 | Itaú Cultural ID | unique identifier for a entity in the Itaú Cultural Encyclopedia website | Enciclopédia Itaú Cultural de Arte e Cultura Brasileiras ID, Enciclopédia Itaú Cultural ID, itaucultural ID | ExternalId | 5,850 |
| P4400 | Memória Globo ID | identifier for pages on the history of Brazilian TV network Rede Globo, researched by a team of journalists, historians and anthropologists | | ExternalId | 100 |
| P4401 | Museus Brazil ID | identifier for information on Brazilian museums from museus.cultura.gov.br (Museusbr) | | ExternalId | 2,077 |
| P4597 | FAPESP institution ID | identifier for institutions funded by the Brazilian research education and innovation foundation, FAPESP | | ExternalId | 47 |
| P4598 | FAPESP researcher ID | identifier for researchers funded by the Brazilian research education and innovation foundation, FAPESP | | ExternalId | 649 |
| P4619 | National Library of Brazil ID | identifier for an element in the database of the National Library of Brazil | BLBNB ID, NLB ID | ExternalId | 12,465 |
| P4660 | CPDOC ID | identifier for a bibliographic record in the Center for Research and Documentation of Contemporary History of Brazil (CPDOC) | Centro de Pesquisa e Documentação de História Contemporânea do Brasil ID | ExternalId | 546 |
| P4664 | Wiki Aves bird ID | identifier for a bird species on Wiki Aves is a Brazilian catalogue website | | ExternalId | 2,266 |
| P4721 | MuBE Virtual ID | identifier for a sculpture in Brazil, in the MuBE database | | ExternalId | 116 |
| P5148 | e-MEC entry | entry for a Brazilian institution of higher learning in the Ministry of Education | eMec | ExternalId | 125 |
| P5525 | CONDEPHAAT ID | Conselho de Defesa do Patrimônio Histórico identifier for monuments in São Paulo, Brazil | Council for the Defense of Historical, Archaeological, Artistic and Tourist Heritage ID | ExternalId | 1,586 |
| P5527 | Academia Brasileira de Letras ID | identifier for a member on the Academia Brasileira de Letras website | | ExternalId | 295 |
| P5528 | Belgian Heritage in Brazil ID | identifier for notorious individuals, companies and artworks associated to the Belgian heritage in Brazil | | ExternalId | 114 |
| P5549 | INEPAC ID | ID for cultural heritage in Rio de Janeiro, Brazil | | ExternalId | 530 |
| P5892 | UOL Eleições ID | identifier for elections in Brazil containing voting data for each position per State | UOL Eleições identifier | ExternalId | 480 |
| P6004 | Brasiliana Iconográfica ID | identifier for an artwork, in the "Brasiliana Infográfica" database | | ExternalId | 2,570 |
| P6204 | CNPJ | identification number issued to Brazilian companies by the Secretariat of the Federal Revenue of Brazil | | ExternalId | 131 |
| P6468 | ISA ID | identifier for Brazilian indigenous populations from Instituto Socioambiental | Instituto Socioambiental ID, ISA ID | ExternalId | 13 |
| P6555 | Brazilian Electoral Unit ID | unique identifier of an brazilian electoral unit, defined by the Brazilian Superior Electoral Court | | ExternalId | 5,570 |
| P6630 | SNISB ID | unique identifier of a Brazilian dam, defined by the Brazilian National Information System on Dams Safety (SNISB) | | ExternalId | 22 |
| P6671 | French public service directory ID | identifier for French public services | French public service directory service-public.fr ID, service-public.fr ID | ExternalId | 35,297 |
| P6672 | Placar UOL Eleições ID | identifier to results of Brazilian municipal and state elections in the Placar UOL Eleições database | | ExternalId | 347 |
| P6673 | Memórias da Ditadura ID | identifier for people who were killed or went missing during the Brazilian military dictatorship (1964-1985) | | ExternalId | 417 |
| P6674 | Desaparecidos Políticos ID | identifier for people who were killed or went missing during the Brazilian military dictatorship (1964-1985) | | ExternalId | 379 |
| P6690 | CNV-SP ID | identifier for people who were killed or went missing during the Brazilian military dictatorship (1964-1985) in the São Paulo State compiled by the São Paulo State Truth Commission | Comissão da Verdade do Estado de São Paulo ID | ExternalId | 168 |
| P6692 | CEMDP ID | identifier for people who were killed or went missing during the Brazilian military dictatorship (1964-1985) | | ExternalId | 344 |
| P6937 | SNBP ID | identifier of the Brazilian Sistema Nacional de Bibliotecas Públicas | | ExternalId | 23 |
| P7266 | Guia dos Quadrinhos comic ID (Brazilian) | identifier for a Brazilian comic book or graphic novel | | ExternalId | 12 |
| P7267 | Guia dos Quadrinhos publishing house ID (Brazilian) | identifier for a Brazilian comic book publishing house | | ExternalId | 16 |
| P7268 | Guia dos Quadrinhos character ID | identifier for a comic book character | | ExternalId | 24 |
| P7269 | Guia dos Quadrinhos comic ID | identifier for a non-Brazilian comic book or graphic novel that was published in Brazil | | ExternalId | 8 |
| P7270 | Guia dos Quadrinhos publishing house ID | identifier for a non-Brazilian comic book publishing house that has its comics published in Brazil | | ExternalId | 11 |
| P7480 | Brazilian federal deputy ID | identifier for a member of the Chamber of Deputies of Brazil | | ExternalId | 3,898 |
| P7946 | Museu de Memes ID | identifier for the Memes Museum of the Brazilian Fluminense Federal University | | ExternalId | 2 |
| P8114 | Wikiparques ID | identifier for a Brazilian protected area on Wikiparques | | ExternalId | 6 |
| P8514 | TOPCMB ID | unique identifier for a entity Tesauro de Objetos do Patrimônio Cultural nos Museus Brasileiros website | | ExternalId | 3,405 |
| P8812 | IMMuB artist ID | identifier for an artist in the Brazilian Musical Memory Institute database | Instituto Memória Musical Brasileira artist ID | ExternalId | 17 |
| P8813 | IMMuB album ID | identifier for an album in the Brazilian Musical Memory Institute database | Instituto Memória Musical Brasileira album ID | ExternalId | 42 |
| P8853 | Musica Brasilis composer ID | identifier for a composer on the Musica Brasilis website | Musica Brasilis | ExternalId | 305 |
| P9064 | Povos Indígenas no Brasil ID | identifier for an indigenous group in the reference work Povos Indígenas no Brasil | Encyclopedia of Indigenous Peoples in Brazil ID | ExternalId | 2 |
| P9119 | LexML Brazil ID | identifier for laws in the LexML system | | ExternalId | 27,718 |
| P9116 | Musica Brasilis score ID | unique identifier for a score in the Musica Brasilis website | | ExternalId | 113 |
| P9354 | Porcelana Brasil ID | identifier for a faience or porcelain manufacturer on the Porcelana Brasil website | Porcelana Brasil identifier, PorcelanaBrasil ID, PorcelanaBrasil identifier | ExternalId | 3 |
| P9421 | IFVPF ID | identifier for a Brazilian coffee farm in the Inventory of the Vale do Paraíba Fluminense Fazendas | IFCVPF ID, Inventário das Fazendas de Café do Vale do Paraíba Fluminense ID, Inventário das Fazendas do Vale do Paraíba Fluminense ID, Inventory of the Vale do Paraíba Fluminense Fazendas ID | ExternalId | 4 |
| P9451 | Dicionário Histórico-Biográfico Brasileiro ID | identifier for a person in the Dicionário Histórico-Biográfico Brasileiro | | ExternalId | 0 |
| P10701 | Reflora ID | identifier for a taxon on the Reflora Flora e Funga do Brasil website | Reflora identifier, Flora and Fungi of Brazil ID, Flora and Fungi of Brazil identifier | ExternalId | 23 |
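
A dump like this can be narrowed down to usable candidates programmatically. The sketch below assumes the dump is real tab-separated text with the header shown above. The filtering rule (keep only `ExternalId` properties with at least one use, sorted by count) and the function name are assumptions for illustration, not a decision taken in this issue. The sample rows are copied from the dump.

```python
import csv
import io

# Three rows copied from the dump; a real run would read the whole file.
SAMPLE = (
    "ID\tlabel\tdescription\taliases\tData type\tCount\n"
    "P1585\tBrazilian municipality code\t"
    "identifier for municipalities in Brazil\tIBGE code\tExternalId\t5,570\n"
    "P3216\tClassInd rating\t"
    "rating of an audiovisual work, video game or RPG in the Brazilian "
    "Advisory Rating System (ClassInd)\tBARS\tWikibaseItem\t2,063\n"
    "P9451\tDicionário Histórico-Biográfico Brasileiro ID\t"
    "identifier for a person in the Dicionário Histórico-Biográfico "
    "Brasileiro\t\tExternalId\t0\n"
)


def candidates(tsv_text, min_count=1):
    """Return (ID, label, count) for ExternalId rows used at least min_count times."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    rows = []
    for row in reader:
        # Counts use a thousands separator in the dump, e.g. "5,570".
        count = int(row["Count"].replace(",", ""))
        if row["Data type"] == "ExternalId" and count >= min_count:
            rows.append((row["ID"], row["label"], count))
    # Most-used properties first.
    return sorted(rows, key=lambda r: r[2], reverse=True)


for pid, label, count in candidates(SAMPLE):
    print(pid, label, count)
```

With the three sample rows, only P1585 passes the filter: P3216 is not an `ExternalId`, and P9451 has a count of 0.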

Edit: added links