pelias / csv-importer

Import arbitrary data in CSV format to Pelias
MIT License

Custom data are not augmented with wof data #86

Closed SilvrDuck closed 3 years ago

SilvrDuck commented 3 years ago

Hi, I am trying to import custom data that look like this:

id,number,street,lon,lat,city,postcode,region,layer,source,street_de,city_de,street_fr,city_fr,street_it,city_it,street_ro,city_ro
1-0,20,Grossholzerstrasse,47.26911713561371,8.44909571341747,Affoltern am Albis,8910,ZH,address,admin_ch,Grossholzerstrasse,Affoltern am Albis,,,,,,
1000001-0,2,Chemin des Avallons,46.270537085094865,6.22018143204421,Anières,1247,GE,address,admin_ch,,,Chemin des Avallons,Anières,,,,
...

These are based on openly available data from the Swiss Confederation. I am not using the openaddresses importer because these data have been preprocessed and filtered, and they also include multilingual fields.

I am running into an issue, though: when I import my data, a final record looks like this:

{
  'id': '2376396-0',
  'gid': 'admin_ch:address:2376396-0',
  'layer': 'address',
  'source': 'admin_ch',
  'source_id': '2376396-0',
  'housenumber': '2',
  'street': 'Chemin de Vers',
  'postalcode': '1228',
  'confidence': 1,
  'match_type': 'exact',
  'accuracy': 'point'
}

Whereas when I was using openaddresses, records used to have the whole WOF hierarchy:

{
  'id': 'ch/geneva:3f9a77e579e4b7b7',
  'gid': 'openaddresses:address:ch/geneva:3f9a77e579e4b7b7',
  'layer': 'address',
  'source': 'openaddresses',
  'source_id': 'ch/geneva:3f9a77e579e4b7b7',
  'name': 'Chemin De Vers 2',
  'housenumber': '2',
  'street': 'Chemin De Vers',
  'postalcode': '1228',
  'confidence': 1,
  'match_type': 'exact',
  'accuracy': 'point',
  'country': 'Switzerland',
  'country_gid': 'whosonfirst:country:85633051',
  'country_a': 'CHE',
  'region': 'Geneva',
  'region_gid': 'whosonfirst:region:85682291',
  'region_a': 'GE',
  'county': 'Genève',
  'county_gid': 'whosonfirst:county:102062917',
  'localadmin': 'Plan-les-Ouates',
  'localadmin_gid': 'whosonfirst:localadmin:404328619',
  'locality': 'Plan-les-Ouates',
  'locality_gid': 'whosonfirst:locality:1125887615',
  'neighbourhood': "La Queue D'arve",
  'neighbourhood_gid': 'whosonfirst:neighbourhood:85862377',
  'continent': 'Europe',
  'continent_gid': 'whosonfirst:continent:102191581',
  'label': 'Chemin De Vers 2, Plan-les-Ouates, Switzerland'
}

I checked what data openaddresses uses, and it just looks like this:

LON,LAT,NUMBER,STREET,UNIT,CITY,DISTRICT,REGION,POSTCODE,ID,HASH
8.4490957,47.2691171,20,Grossholzerstrasse,,Affoltern am Albis,,ZH,8910,1-0,de1b19138df2ef57
6.2201814,46.2705371,2,Chemin des Avallons,,Anières,,GE,1247,1000001-0,713cbbb216c5de64
...

Which is very similar to what I have in my custom csv.

Looking into the code, I found that the wof-admin-lookup module was used. In the openaddresses importer, the code was slightly different, so I made my own csv-importer image and added the following lines to lib/importPipeline.js:

...
const adminLayers = ['neighbourhood', 'borough', 'locality', 'localadmin',
  'county', 'macrocounty', 'region', 'macroregion', 'dependency', 'country',
  'empire', 'continent'];
...
    .pipe(adminLookup.create(adminLayers))
...

This is what the openaddresses importer does in its import pipeline.

But this didn't solve the issue. I even tried to import my custom data using the openaddresses importer, but the records still didn't have the WOF information.

What am I missing?

orangejulius commented 3 years ago

Hi @SilvrDuck,

I think you have the lon and lat columns (or data) reversed in your example CSV. That would place the records far away from Switzerland (the Horn of Africa, in this case 😄), where I assume you have no WOF data coverage loaded in your Pelias install.

We've all done it :)
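
A quick sanity check along these lines can catch swapped coordinates before an import. This is only a sketch: the bounding box values are rough assumptions for Switzerland, and `inBbox` and `checkLonLat` are hypothetical helpers, not part of Pelias or the csv-importer.

```javascript
// Approximate bounding box for Switzerland (assumed values, adjust as needed).
const SWISS_BBOX = { minLon: 5.9, maxLon: 10.6, minLat: 45.7, maxLat: 47.9 };

// True if the (lon, lat) pair falls inside the bounding box.
function inBbox(lon, lat, bbox) {
  return lon >= bbox.minLon && lon <= bbox.maxLon &&
         lat >= bbox.minLat && lat <= bbox.maxLat;
}

// Classify one coordinate pair as 'ok', 'swapped', or 'outside'.
function checkLonLat(lon, lat, bbox = SWISS_BBOX) {
  if (inBbox(lon, lat, bbox)) return 'ok';
  // If the pair only fits once the two values are exchanged,
  // the columns are probably reversed.
  if (inBbox(lat, lon, bbox)) return 'swapped';
  return 'outside';
}

// First data row of the custom CSV above: the lon column holds 47.26..., the lat column 8.44...
console.log(checkLonLat(47.26911713561371, 8.44909571341747)); // swapped

// Same address as it appears in the openaddresses file (LON,LAT order).
console.log(checkLonLat(8.4490957, 47.2691171)); // ok
```

Running a check like this over every row and counting the `swapped` results should make a reversed column pair obvious before any time is spent on the import pipeline.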

SilvrDuck commented 3 years ago

Wow, you're right. Can't believe the time I spent on this 😅

Thanks a lot @orangejulius !