joaoftrodrigues / atpdb-cleanup-and-conversion-to-relational

Validate Cities #2

Open joaoftrodrigues opened 1 year ago

joaoftrodrigues commented 1 year ago

Description

Check that city names are spelled correctly, and correct the ones that are not.

Process

  1. Get cities' dataset
  2. Compare original values with database values (needs split of locations)
  3. Assign the correct values to a new field 'city'; flag wrong values for correction
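
Steps 2 and 3 could be sketched roughly as below. This is only an illustration: `splitLocation` and `classifyCity` are hypothetical helpers, the "City, Country" layout of the location string is an assumption, and the reference set stands in for the imported cities dataset.

```javascript
// Hypothetical helper: split "City, Country" on the last comma.
// The "City, Country" layout is an assumption about the location field.
function splitLocation(location) {
  const idx = location.lastIndexOf(",");
  if (idx === -1) {
    return { city: location.trim(), country: null };
  }
  return {
    city: location.slice(0, idx).trim(),
    country: location.slice(idx + 1).trim(),
  };
}

// Hypothetical helper: keep the city if it is in the reference set,
// otherwise leave it empty and flag the record for manual review.
function classifyCity(location, knownCities) {
  const { city, country } = splitLocation(location);
  const valid = knownCities.has(city);
  return { city: valid ? city : null, country, needsReview: !valid };
}

// Tiny stand-in for the cities dataset.
const knownCities = new Set(["Skopje", "Novi Sad"]);
console.log(classifyCity("Skopje, North Macedonia", knownCities));
console.log(classifyCity("Nova Sad, Serbia", knownCities));
```

Values flagged with `needsReview` are the ones that end up being corrected by hand in the comments of this issue.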

Dataset of cities

So far, the following datasets have been found:

Data import

mongoimport --db atp --collection cities --drop --file D:\MCD\1\BDDA\TP\data\cities.csv --type=csv --headerline

joaoftrodrigues commented 1 year ago

Macedonia F3

During country validation, both Skopje and Skopja were found for North Macedonia; the real name is Skopje. Although the names are close, the tournament was checked to verify that the real location was Skopje. It was, so the city was changed to it.

Query

db.atpplayers.updateMany({Tournament: "Macedonia F3"}, {$set: {City: "Skopje"}})

joaoftrodrigues commented 1 year ago

Macedonia F2

An inconsistency was found in this tournament: the city did not correspond to the country.

A search found the real city, Skopje, on this site: https://www.itftennis.com/en/tournament/macedonia-f2-futures/mkd/2017/m-fu-mkd-02a-2017/draws-and-results/

Query

db.atpplayers.updateMany({Tournament: "Macedonia F2"}, {$set: {City: "Skopje"}})

joaoftrodrigues commented 1 year ago

Fix Tasmania

Tasmania must be stored as a state.

Query

db.atpplayers.updateMany({Country: "Tasmania"}, {$set:{State: "Tasmania"}})

joaoftrodrigues commented 1 year ago

Fix Alberta

db.atpplayers.updateMany({Country: "Alberta"}, {$set: {State: "Alberta"}})

joaoftrodrigues commented 1 year ago

Fix Curaçao

Curaçao becomes the State value.

db.atpplayers.updateMany({Country: "Curacao"}, {$set: {State: "Curaçao"}})

joaoftrodrigues commented 1 year ago

Fix Dutch Antil / Dutch Antils States

They were marked as countries, so they were moved to the state field.

db.atpplayers.updateMany({Country: {$in: ["Dutch Anti", "Dutch Antil"]}},{$set: {State: "Curaçao"}})

joaoftrodrigues commented 1 year ago

Fix Ramat, Hasharon Location

The location has the value "Ramat, Hasharon". Ramat Hasharon is a city in Israel.

Fixing the problem on city field:

db.atpplayers.updateMany({Country: "Hasharon"}, {$set: {City: "Ramat Hasharon"}})

joaoftrodrigues commented 1 year ago

Fix Yanggu County

Shown as "Yang Gu"; the real name is "Yanggu County".

db.atpplayers.updateMany({City: "Yang Gu"}, {$set: {City: "Yanggu County"}})

joaoftrodrigues commented 1 year ago

Fix Martinique

Martinique is an island, so it is stored as a state and removed from the city field.

db.atpplayers.updateMany({City:"Martinique"}, {$set: {State: "Martinique", City: null}})

joaoftrodrigues commented 1 year ago

Fix Nouméa

db.atpplayers.updateMany({City:"Noumea"},{$set: {City: "Nouméa"}})

Also, for New Caledonia:

db.atpplayers.updateMany({Country:"New Caledoni"},{$set: {State: "New Caledonia"}})

joaoftrodrigues commented 1 year ago

Fix US States as Country

OK - Oklahoma

OK is the state of Oklahoma. Open question: should the state be stored as "OK" or as "Oklahoma"?

db.atpplayers.updateMany({Country:"OK"}, {$set: {State: "OK"}})

Texas

db.atpplayers.updateMany({Country:"Texas"}, {$set: {State: "Texas"}})

joaoftrodrigues commented 1 year ago

Ontaria -> State Ontario

db.atpplayers.updateMany({Country:"Ontaria"}, {$set: {State: "Ontario"}})

joaoftrodrigues commented 1 year ago

Reunion Island to State

db.atpplayers.updateMany({Country: "Reunion Island"}, {$set: {State: "Réunion", City: null}})

joaoftrodrigues commented 1 year ago

Nova Sad -> Novi Sad

db.atpplayers.updateMany({City: "Nova Sad"}, {$set: {City: "Novi Sad"}})

joaoftrodrigues commented 1 year ago

City names in the country field (swapped)

's-Hertogenbosch

db.atpplayers.updateMany({Country: "'s-Hertogenbosch"}, {$set: {City: "'s-Hertogenbosch"}})

Abidjan

db.atpplayers.updateMany({Country: "Abidjan"}, {$set: {City: "Abidjan"}})

Angleur - Liege

db.atpplayers.updateMany({Country: "Angleur - Liege"}, {$set: {City: "Liège"}})

San Salvador

db.atpplayers.updateMany({Country: {$in: ["Salvador", "San Salvador"]}}, {$set: {City: "San Salvador"}})

Santiago

db.atpplayers.updateMany({Country: "Santiago"}, {$set: {City: "Santiago"}})

joaoftrodrigues commented 1 year ago

State name in country field

Bahia

db.atpplayers.updateMany({Country: "Bahia"}, {$set: {State: "Bahia", City: null}})

Lara

db.atpplayers.updateMany({City: "Lara"}, {$set: {State: "Lara", City: null}})

Sardinia

db.atpplayers.updateMany({Country: "Sardinia"}, {$set: {State: "Sardinia"}})

Victoria

db.atpplayers.updateMany({Country: "Victoria"}, {$set: {State: "Victoria"}})

joaoftrodrigues commented 1 year ago

Portorož

Change from Portoroz to Portorož?

joaoftrodrigues commented 1 year ago

Tai-Chung -> Taichung

db.atpplayers.updateMany({City: "Tai-Chung"}, {$set: {City: "Taichung"}})

joaoftrodrigues commented 1 year ago

Port-of-Spain -> Port of Spain

db.atpplayers.updateMany({City: "Port-of-Spain"}, {$set: {City: "Port of Spain"}})

joaoftrodrigues commented 1 year ago

Uriage -> Saint-Martin-d'Uriage

db.atpplayers.updateMany({City: "Uriage"},{$set: {City: "Saint-Martin-d'Uriage"}})
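
Simple renames like this one and the two before it could also be kept in a single table and sent in one call. A minimal sketch, assuming mongosh's `bulkWrite` is acceptable here; `buildRenameOps` is a hypothetical helper, and the rename pairs are copied from the queries in this issue.

```javascript
// Rename pairs copied from the individual updateMany queries.
const cityRenames = {
  "Tai-Chung": "Taichung",
  "Port-of-Spain": "Port of Spain",
  "Uriage": "Saint-Martin-d'Uriage",
};

// Hypothetical helper: turn each "wrong -> right" pair into one
// updateMany operation in the shape that bulkWrite expects.
function buildRenameOps(renames) {
  return Object.entries(renames).map(([from, to]) => ({
    updateMany: {
      filter: { City: from },
      update: { $set: { City: to } },
    },
  }));
}

const renameOps = buildRenameOps(cityRenames);
console.log(JSON.stringify(renameOps, null, 2));

// In mongosh, where `db` is defined, the operations would be sent with:
// db.atpplayers.bulkWrite(renameOps)
```

Keeping the pairs in one object makes the list of corrections easier to review than a series of separate queries.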

joaoftrodrigues commented 1 year ago

City with country name

Botwana

db.atpplayers.updateMany({City: "Botwana"}, {$set: {City: null}})

Brazi

db.atpplayers.updateMany({City: "Brazi"}, {$set: {City: null}})

Domincan Republic

db.atpplayers.updateMany({City: "Domincan Republic"}, {$set: {City: null}})

Mexica

db.atpplayers.updateMany({City: "Mexica"}, {$set: {City: null}})

Phillipines

db.atpplayers.updateMany({City: "Phillipines"}, {$set: {City: null}})

United States Of America

db.atpplayers.updateMany({City: "United States Of America"}, {$set: {City: null}})

Venezeuela

db.atpplayers.updateMany({City: "Venezeuela"}, {$set: {City: null}})
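
Since all seven queries above apply the same change, they could be collapsed into one update with `$in`. A sketch under that assumption; `buildClearCityUpdate` is a hypothetical helper, and the values are copied from the queries above.

```javascript
// Country names (as misspelled in the data) that were found in the City field.
const countryNamesInCityField = [
  "Botwana",
  "Brazi",
  "Domincan Republic",
  "Mexica",
  "Phillipines",
  "United States Of America",
  "Venezeuela",
];

// Hypothetical helper: one filter matching any of the values,
// and one update that clears the City field.
function buildClearCityUpdate(values) {
  return {
    filter: { City: { $in: values } },
    update: { $set: { City: null } },
  };
}

const clearCity = buildClearCityUpdate(countryNamesInCityField);
console.log(JSON.stringify(clearCity));

// In mongosh, where `db` is defined, this would be a single statement:
// db.atpplayers.updateMany(clearCity.filter, clearCity.update)
```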

joaoftrodrigues commented 1 year ago

City to State

Devon

db.atpplayers.updateMany({City: "Devon"}, {$set: {State: "Devon", City: null}})

Ciudad de Habana

db.atpplayers.updateMany({City: "Ciudad de Habana"}, {$set: {State: "Ciudad de Habana", City: null}})

joaoftrodrigues commented 1 year ago

Correct spelling

Eindhoven

db.atpplayers.updateMany({City:"Elndhoven"}, {$set: {City: "Eindhoven"}})

Florianópolis

db.atpplayers.updateMany({City:"Florianapolis"}, {$set: {City: "Florianópolis"}})

Neuchâtel

db.atpplayers.updateMany({City:"Neuchatel"}, {$set: {City: "Neuchâtel"}})

Portorož

db.atpplayers.updateMany({City:"Portoroz"}, {$set: {City: "Portorož"}})

Pörtschach am Wörthersee

db.atpplayers.updateMany({City: "Portschach"}, {$set: {City: "Pörtschach am Wörthersee"}})

Prešov <- Presov

Esch-sur-Alzette

db.atpplayers.updateMany({City: "Esch/Alzette"}, {$set: {City: "Esch-sur-Alzette"}})
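
Several of these corrections differ from the stored value only by accents (Noumea, Neuchatel, Portoroz, Presov). Such candidates could be found automatically by comparing names with diacritics stripped. A minimal sketch; `stripDiacritics` and `suggestCity` are hypothetical helpers, and the reference list stands in for the cities dataset.

```javascript
// Hypothetical helper: remove accents for comparison purposes only.
function stripDiacritics(text) {
  // NFD splits "é" into "e" plus a combining mark; the regex drops the marks.
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Hypothetical helper: return the reference spelling whose accent-free,
// lower-cased form equals the raw value, or null if there is none.
function suggestCity(rawCity, referenceCities) {
  const key = stripDiacritics(rawCity).toLowerCase();
  const match = referenceCities.find(
    (name) => stripDiacritics(name).toLowerCase() === key
  );
  return match ?? null;
}

// Tiny stand-in for the cities dataset.
const referenceCities = ["Nouméa", "Neuchâtel", "Portorož", "Prešov", "Eindhoven"];
console.log(suggestCity("Noumea", referenceCities));    // Nouméa
console.log(suggestCity("Presov", referenceCities));    // Prešov
console.log(suggestCity("Elndhoven", referenceCities)); // null
```

This only catches accent differences. Real misspellings such as "Elndhoven" or "Florianapolis" still need a manual check or a fuzzy comparison.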