idem-lab / map-ir-pipeline

Prototype demonstration of the stacked generalisation method used in https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000633#sec010

geocoding step is reeeaaal slow #70

Closed goldingn closed 7 months ago

goldingn commented 7 months ago

this code:

  moyes_geno_geocode = geocode_geno_data(moyes_geno_raw),
  moyes_geno_countries = extract_country(moyes_geno_geocode),

takes forever because it makes calls to an external geocoding service. That doesn't matter too much for the Moyes dataset, since it only happens once and is then cached (and because the dataset might soon be replaced with a VectorAtlas dataset that includes countries). But if we want to keep that functionality in the future, we should consider doing a spatial query against country shapefiles instead, as it will be much quicker.
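
For illustration, a spatial query of this kind could look roughly like the sketch below, which uses the sf package. It is not the code in this repo: `lookup_country()`, the toy square "countries" and the `longitude`/`latitude` column names are all hypothetical, and a real pipeline would read country boundaries from a shapefile (for example with `sf::st_read()`) instead of building them by hand.

```r
library(sf)

# Build a toy square polygon (hypothetical stand-in for a country boundary).
square <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmax, ymin), c(xmax, ymax), c(xmin, ymax), c(xmin, ymin)
  )))
}

# Two made-up "countries"; real boundaries would come from a shapefile.
countries <- st_sf(
  country = c("A", "B"),
  geometry = st_sfc(square(0, 0, 10, 10), square(10, 0, 20, 10), crs = 4326)
)

# Made-up records with assumed coordinate column names.
records <- data.frame(
  id = 1:3,
  longitude = c(5, 15, 30),
  latitude = c(5, 5, 30)
)

# Hypothetical helper: attach the country whose polygon contains each point.
lookup_country <- function(records, countries) {
  points <- st_as_sf(
    records,
    coords = c("longitude", "latitude"),
    crs = st_crs(countries),
    remove = FALSE
  )
  # Left join by default, so unmatched points are kept with country = NA.
  joined <- st_join(points, countries, join = st_within)
  st_drop_geometry(joined)
}

result <- lookup_country(records, countries)
```

Everything here runs locally, with no network calls. The third record falls outside both polygons, so the left join keeps it with a missing `country` value.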

njtierney commented 7 months ago

Yup totally agree - hadn't considered doing the query against shapefiles, will look into that!

njtierney commented 7 months ago

OK I've got this down to about 11 seconds now 😅 so that should be a bit of an improvement