Urban-Analytics-Technology-Platform / acbm

activity-based modelling pipeline (for transport demand models)
https://hackmd.io/w-m_OKaDT3GGBfSqFPpBjA
Apache License 2.0

Should we add osmox to the repo? #19

Open Hussein-Mahfouz opened 5 months ago

Hussein-Mahfouz commented 5 months ago

I have tried osmox for mapping activity purposes to OSM POIs and it works very well. My current workflow is to clone the osmox repo to a different directory on my machine, download the OSM data through Geofabrik, process the data, and copy the output file back to the acbm repo for analysis.
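As a rough illustration of the download step in that workflow, here is a minimal sketch using only the standard library. The URL pattern (`<region>-latest.osm.pbf` under download.geofabrik.de) is an assumption based on how Geofabrik currently lays out its extracts, and `geofabrik_url` and `download_extract` are hypothetical helper names, not part of acbm or osmox.

```python
import shutil
import urllib.request
from pathlib import Path

# Assumed base URL for Geofabrik extracts.
GEOFABRIK_BASE = "https://download.geofabrik.de"


def geofabrik_url(region_path):
    """Build the download URL for a region such as 'europe/isle-of-man'."""
    return f"{GEOFABRIK_BASE}/{region_path.strip('/')}-latest.osm.pbf"


def download_extract(region_path, dest_dir):
    """Download the extract into dest_dir unless the file is already there."""
    url = geofabrik_url(region_path)
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Stream the response to disk rather than holding it in memory.
        with urllib.request.urlopen(url) as resp, open(dest, "wb") as f:
            shutil.copyfileobj(resp, f)
    return dest
```

Calling `download_extract("europe/isle-of-man", "data/osm")` would then fetch the file only on the first call and return its local path.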

Should osmox be available inside the acbm repo? If so, what is the best way to install it? To use it you have to clone the repo, create a virtual environment, and run a command line tool (see instructions here). @sgreenbury what do you think?

The pipeline could be run using a python script for different areas:

sgreenbury commented 5 months ago

I think your script pipeline sounds like a good option, since osmox is designed to be used as a CLI. You could install it in a separate venv and run it directly (e.g. the executable will be at .venv/bin/osmox), or it should now work as a dependency of acbm too (I've updated the uatk-spc pyarrow dependency so they should be compatible):

poetry add git+https://github.com/arup-group/osmox

It can then be called from the same python venv as the acbm one with (e.g. the example in the docs):

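import shlex
import subprocess

# Run the osmox CLI from the active venv; check=True raises if osmox exits with an error.
cmd = "osmox run configs/example.json example/isle-of-man-latest.osm.pbf example/isle-of-man -f geopackage -crs epsg:27700 -l"
subprocess.run(shlex.split(cmd), check=True)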

We could write a python function with a cache (to avoid rerunning) as part of the acbm module too so that it can be called directly if that is more convenient - will it need to be run once or many times for a given area?

Hussein-Mahfouz commented 5 months ago

The choice of separate venv vs adding directly to acbm is worth discussing, as I'm facing a similar question with pam, ref #18

> We could write a python function with a cache (to avoid rerunning) as part of the acbm module too so that it can be called directly if that is more convenient - will it need to be run once or many times for a given area?

I like this idea. It should only be run once for every area, but we could have an update option in the function if we want to rerun (e.g. if we have changed the config file)
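A minimal sketch of what such a cached wrapper with an update option might look like. Everything here is illustrative: `run_osmox_cached`, `config_digest` and the `.osmox-stamp` file are hypothetical names (osmox does not write a stamp file itself), and the command-line arguments are copied from the example earlier in this thread. The stamp stores a hash of the config, so a changed config triggers a rerun even without `update=True`.

```python
import hashlib
import subprocess
from pathlib import Path


def config_digest(config_path):
    """Return a SHA-256 hex digest of the config file contents."""
    return hashlib.sha256(Path(config_path).read_bytes()).hexdigest()


def run_osmox_cached(config_path, input_pbf, output_prefix, crs="epsg:27700",
                     update=False, runner=subprocess.run):
    """Run osmox once per area, skipping the run if the stamp matches the config.

    Returns True if osmox was run, False if the earlier output was reused.
    """
    # Hypothetical stamp file kept next to the outputs; written by this
    # function, not by osmox.
    stamp = Path(f"{output_prefix}.osmox-stamp")
    digest = config_digest(config_path)
    if not update and stamp.exists() and stamp.read_text() == digest:
        return False
    cmd = ["osmox", "run", str(config_path), str(input_pbf), str(output_prefix),
           "-f", "geopackage", "-crs", crs, "-l"]
    runner(cmd, check=True)  # raises CalledProcessError if osmox fails
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.write_text(digest)
    return True
```

The `runner` parameter is only there so the function can be exercised without osmox installed; in normal use it defaults to `subprocess.run`.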

Hussein-Mahfouz commented 5 months ago

Another option for downloading OSM data: pydriosm

sgreenbury commented 5 months ago

Also, for retail floor space, there is the Geolytix data used in QUANT_RAMP.