theodi / orpi-corpus

Mistypes in the station names in the ORR data #7

Open giacecco opened 10 years ago

giacecco commented 10 years ago

Unexpectedly, some station names in the ORR data are wrong: e.g. Handborough is in reality Hanborough, and Heysham Harbour is actually called Heysham Port. Should we change these manually if there aren't too many? Full list to follow...

giacecco commented 10 years ago

Lookup fails in about 5% of cases: 128 out of the 2,526 stations listed in the ORR report.

.../orpi-corpus$ foreman run node main.js --out data/corpus-new.csv 
..............................................................................................................
Lat/lon resolution failed for Stewartby...............................................
Lat/lon resolution failed for Bedford St.Johns
Lat/lon resolution failed for Bedford Midland......................................................................................................................................................
Lat/lon resolution failed for Mossley Hill...........................
Lat/lon resolution failed for Bank Hall
Lat/lon resolution failed for Bootle Oriel Road....
Lat/lon resolution failed for Liverpool Lime Street.........
Lat/lon resolution failed for Old Roan................................
Lat/lon resolution failed for St.Helens Junction.........
Lat/lon resolution failed for Wigan North Western.............
Lat/lon resolution failed for Wigan Wallgate.....
Lat/lon resolution failed for Hope (Clwyd).........................................
Lat/lon resolution failed for Hall I' Th' Wood.....
Lat/lon resolution failed for Hag Fold.
Lat/lon resolution failed for Bolton
Lat/lon resolution failed for Brinnington..
Lat/lon resolution failed for Moses Gate..........
Lat/lon resolution failed for Littleborough................
Lat/lon resolution failed for Heysham Harbour.........................
Lat/lon resolution failed for Hazel Grove.
Lat/lon resolution failed for Davenport
Lat/lon resolution failed for Stockport.
Lat/lon resolution failed for Woodsmoor..
Lat/lon resolution failed for Bredbury.
Lat/lon resolution failed for Belle Vue
Lat/lon resolution failed for Flowery Field
Lat/lon resolution failed for Newton For Hyde
Lat/lon resolution failed for Ashton-Under-Lyne.....
Lat/lon resolution failed for Altrincham........
Lat/lon resolution failed for Hyde Central....
Lat/lon resolution failed for Hale....
Lat/lon resolution failed for Reddish South
Lat/lon resolution failed for Heald Green
Lat/lon resolution failed for Heaton Chapel
Lat/lon resolution failed for Levenshulme
Lat/lon resolution failed for Mauldeth Road.
Lat/lon resolution failed for Cheadle Hulme.
Lat/lon resolution failed for Rose Hill (Marple)...........
Lat/lon resolution failed for Rochdale.
Lat/lon resolution failed for Humphrey Park
Lat/lon resolution failed for Urmston.
Lat/lon resolution failed for Navigation Road
Lat/lon resolution failed for Ashburys
Lat/lon resolution failed for Bramhall
Lat/lon resolution failed for Burnage.
Lat/lon resolution failed for Chassen Road.
Lat/lon resolution failed for East Didsbury..
Lat/lon resolution failed for Gatley..
Lat/lon resolution failed for Ardwick
Lat/lon resolution failed for Manchester Airport
Lat/lon resolution failed for Gorton
Lat/lon resolution failed for Deansgate
Lat/lon resolution failed for Manchester Oxford Road
Lat/lon resolution failed for Manchester Piccadilly
Lat/lon resolution failed for Trafford Park
Lat/lon resolution failed for Manchester Victoria
Lat/lon resolution failed for Middlewood
Lat/lon resolution failed for Moston.
Lat/lon resolution failed for Ryder Brow
Lat/lon resolution failed for Reddish North.....
Lat/lon resolution failed for Handborough......................................................................
Lat/lon resolution failed for Clifton Down......
Lat/lon resolution failed for Lawrence Hill.
Lat/lon resolution failed for Bristol Temple Meads....
Lat/lon resolution failed for Parson Street.
Lat/lon resolution failed for Stapleton Road
Lat/lon resolution failed for St.Andrew's Road
Lat/lon resolution failed for Sea Mills................................
Lat/lon resolution failed for Bodmin Parkway
Lat/lon resolution failed for Bugle
Lat/lon resolution failed for Camborne
Lat/lon resolution failed for Falmouth Docks.
Lat/lon resolution failed for Lostwithiel
Lat/lon resolution failed for Luxulyan
Lat/lon resolution failed for Newquay
Lat/lon resolution failed for Par
Lat/lon resolution failed for Penryn.
Lat/lon resolution failed for Perranwell
Lat/lon resolution failed for Redruth
Lat/lon resolution failed for Quintrel Downs
Lat/lon resolution failed for Roche
Lat/lon resolution failed for St.Austell
Lat/lon resolution failed for St.Columb Road...
Lat/lon resolution failed for Truro.....
Lat/lon resolution failed for Calstock
Lat/lon resolution failed for Gunnislake
Lat/lon resolution failed for Causeland..
Lat/lon resolution failed for St.Keyne
Lat/lon resolution failed for Liskeard
Lat/lon resolution failed for Looe
Lat/lon resolution failed for Menheniot..
Lat/lon resolution failed for Sandplace
Lat/lon resolution failed for St.Germans
Lat/lon resolution failed for Saltash................................
Lat/lon resolution failed for Ynyswen.......
Lat/lon resolution failed for Pontyclun.
Lat/lon resolution failed for Llwynypia...
Lat/lon resolution failed for Mountain Ash.
Lat/lon resolution failed for Fernhill..
Lat/lon resolution failed for Ystrad Rhondda...
Lat/lon resolution failed for Taffs Well
Lat/lon resolution failed for Tonypandy
Lat/lon resolution failed for Trefforest Estate
Lat/lon resolution failed for Trefforest
Lat/lon resolution failed for Trehafod
Lat/lon resolution failed for Treherbert
Lat/lon resolution failed for Treorchy.
Lat/lon resolution failed for Whitchurch (South Glamorgan)
Lat/lon resolution failed for Ton Pentre..
Lat/lon resolution failed for Dinas (Mid Glamorgan)
Lat/lon resolution failed for Pontypridd.........
Lat/lon resolution failed for Cwmbach
Lat/lon resolution failed for Aberdare
Lat/lon resolution failed for Penrhiwceiber.....................
Lat/lon resolution failed for Johnston (Dyfed)........................................................................
Lat/lon resolution failed for Dovey Junction...................................................................................................................................................................................................................................
Lat/lon resolution failed for Shepherd's Well.......................................................................................................................................
Lat/lon resolution failed for Sutton (Surrey).........................................................................................................
Lat/lon resolution failed for Ryde Pier Head....................................................................................................................................................
Lat/lon resolution failed for Portsmouth Arms...........................................................................................................................................................................
Lat/lon resolution failed for Whitwell (Derbys)....................................................................................................................................................................................................................................................
Lat/lon resolution failed for Tees-Side Airport............................................................................................................
Lat/lon resolution failed for Bentley (S. Yorks).................................................................
Lat/lon resolution failed for Wavertree Technology Park....................................................
Lat/lon resolution failed for Falls Of Cruachan.....................
Lat/lon resolution failed for Rannoch For Kinloch Rannoch...............................................................................................
Lat/lon resolution failed for Prestwick Internat'nl Airport..................................................................................................................................................
Lat/lon resolution failed for Llanharan........................................
Completed. Success rate 95%.

giacecco commented 10 years ago

Added a debug feature to the script, e.g. you can do:

.../orpi-corpus$ foreman run node main.js --nominatim "Stewartby railway station"
err is null
latlon is {"lat":"52.0694881","lon":"-0.5204876"}

... and that shows that the failing stations actually resolve to a good lat/lon! Hence this must be a software bug or a problem in how we handle Nominatim's throttling. What is really odd is that it is always the same stations that fail.

G.
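
If throttling is the cause, one possible mitigation is to send the lookups one at a time, pause between them, and retry a failed lookup after a longer pause. The sketch below only illustrates that idea: `resolveAll`, `sleep`, and the `lookup` callback are hypothetical names, not taken from main.js, and `lookup` stands in for whatever function actually queries Nominatim. The default delay of one second is also an assumption and should be checked against the usage policy of the Nominatim server in use.

```javascript
// Hypothetical helper: none of these names come from main.js.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Resolve station names one at a time, pausing between requests and
// retrying failures with a growing delay, so that a throttled request
// is not mistaken for a station that cannot be geocoded.
// `lookup` is an assumed async function: name -> {lat, lon} or null.
async function resolveAll(names, lookup, { delayMs = 1000, retries = 3 } = {}) {
  const resolved = {};
  const failed = [];
  for (const name of names) {
    let latlon = null;
    for (let attempt = 0; attempt <= retries && !latlon; attempt++) {
      // Wait before every request; wait longer on each retry.
      await sleep(delayMs * (attempt + 1));
      try {
        latlon = await lookup(name);
      } catch (err) {
        latlon = null; // treat an error like an empty result and retry
      }
    }
    if (latlon) resolved[name] = latlon;
    else failed.push(name);
  }
  return { resolved, failed };
}
```

A station that is still in `failed` after all the retries is more likely to have a genuinely wrong name than to be a casualty of throttling.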

giacecco commented 10 years ago

Current output:

.../orpi-corpus$ foreman run node main.js --out data/corpus-new.csv 
...................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Lat/lon resolution failed for Heysham Harbour.......................................................................................................................
Lat/lon resolution failed for Handborough......................................................................................................................................
Lat/lon resolution failed for Quintrel Downs.........................................................................................
Lat/lon resolution failed for Whitchurch (South Glamorgan)......................................
Lat/lon resolution failed for Johnston (Dyfed)...............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Lat/lon resolution failed for Whitwell (Derbys).................................................................................................................................................................................................................................................................................................................................................................
Lat/lon resolution failed for Bentley (S. Yorks)............................................................................................................................................
Lat/lon resolution failed for Rannoch For Kinloch Rannoch...............................................................................................
Lat/lon resolution failed for Prestwick Internat'nl Airport...........................................................................................................................................................................................
Completed. Success rate 100%.
Data written to data/corpus-new.csv.

statzhero commented 10 years ago

When / how do we input the missing stations manually?

giacecco commented 10 years ago

I have started a file with the fixes at https://github.com/theodi/orpi-corpus/blob/master/data/ORR-station-names-corrections-INCOMPLETE.csv and you may add to it when you find issues. I would add to this file only the minimum information needed for the fixes. I'll write the code to apply the fixes sometime soon.
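
As a rough idea of what applying the fixes could look like, here is a minimal sketch. It assumes the corrections file has a header row followed by one "ORR name,corrected name" pair per line; the real layout of ORR-station-names-corrections-INCOMPLETE.csv may differ. The column names and the functions `parseCorrections` and `correctName` are made up for illustration. The two sample rows are the examples mentioned at the top of this issue.

```javascript
// Assumed layout: a header row, then "ORR name,corrected name" per line.
// The real corrections file may be laid out differently.
function parseCorrections(csvText) {
  const corrections = new Map();
  const lines = csvText.split(/\r?\n/).slice(1); // skip the header row
  for (const line of lines) {
    if (!line.trim()) continue; // ignore blank lines
    // Naive split: only safe while no station name contains a comma.
    const [orrName, fixedName] = line.split(',').map((s) => s.trim());
    if (orrName && fixedName) corrections.set(orrName, fixedName);
  }
  return corrections;
}

// Return the corrected name when one is known, otherwise the name unchanged.
function correctName(name, corrections) {
  return corrections.get(name) || name;
}

// Hypothetical sample data using the two examples from this issue.
const sample = [
  'orr_name,corrected_name',
  'Handborough,Hanborough',
  'Heysham Harbour,Heysham Port',
].join('\n');
const corrections = parseCorrections(sample);
console.log(correctName('Handborough', corrections)); // Hanborough
console.log(correctName('Stewartby', corrections)); // Stewartby (no fix needed)
```

The corrected name would then be used for the Nominatim lookup in place of the ORR spelling, keeping the original name in the output so the rows can still be matched back to the ORR report.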