Three endpoints :
/raw-data/current-snapshot/ ~ Takes user request
It has the following parameters :
outputType : A single-value field that specifies the data format you want in return. Currently GeoJSON is fully supported by the API (all testing was done with this output). KML and MBTILES are experimental and use ogr2ogr logic for binding features. The API is structured so that it can support every data format ogr2ogr itself supports, but this PR doesn't include any of them as a full feature; currently only KML and MBTILES can be requested, to test the ogr2ogr bindings (MBTILES is restricted to 2 sq km for now, since no testing has been done on those experimental features). You can not define more than one outputType in a single request for now. The values are defined as an enum; anything else will raise an error. It is an optional field; if you don't include it in your request, GeoJSON is the default:
GEOJSON = "GeoJSON" # default ~ uses galaxy binding
KML = "KML" # experimental for now ~ uses ogr2ogr binding
MBTILES = "MBTILES" # experimental for now ~ uses ogr2ogr binding with MINZOOM=10 and MAXZOOM=22
geometry : Takes a GeoJSON polygon geometry as input; a single polygon is supported per request.
It will raise an error if the area of the geometry is greater than 1,500,000 sq km (2 sq km when MBTILES is selected as the outputType). It is the only compulsory field in the API.
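To make the shape of the `geometry` field concrete, here is a minimal sketch of a GeoJSON polygon; the coordinates are illustrative only, not taken from the PR:

```python
import json

# A hypothetical GeoJSON polygon for the `geometry` field.
# Note the outer ring repeats its first coordinate at the end, as
# required for a valid GeoJSON Polygon.
geometry = {
    "type": "Polygon",
    "coordinates": [[
        [83.9692, 28.1944],
        [83.9975, 28.1944],
        [83.9975, 28.2149],
        [83.9692, 28.2149],
        [83.9692, 28.1944],
    ]],
}
print(json.dumps(geometry))
```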
osmTags : Multiple entries are supported and can be passed as items in a dictionary. It takes key/value pairs as a Python dictionary, but each value must be a list, otherwise it will raise an error. You can pass any key/value pair in this field; they will be used to filter the features along with the geometry.
Conditions :
A key alone can be supplied to query all values for that key, like this : {"key1": []} i.e. the value list is empty. To the API this means key1=*
A key and value pair can be supplied like this : {"key1": ["value"]}
Multiple values associated with a key can also be passed, like this : {"key2": ["value1","value2"]}
Remember : the value must be passed inside a list, even when it is a single value
No validation is done on the existence of OSM keys and values; if there is a typo, or the key or value is not present, you will get null
It is an optional field; if you don't supply it, you will get all features within the polygon, without OSM tag filtering
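The three accepted shapes of `osmTags` can be sketched as Python dictionaries; the keys and values below are examples only, since any OSM tag can be used:

```python
# Hypothetical osmTags filters showing the three accepted shapes.
any_building = {"building": []}                      # building=* (empty value list)
single_value = {"amenity": ["cafe"]}                 # amenity=cafe
multi_value = {"highway": ["primary", "secondary"]}  # highway=primary OR secondary

# Every value must be a list, even when there is only a single value:
for tags in (any_building, single_value, multi_value):
    assert all(isinstance(v, list) for v in tags.values())
```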
columns : A list of values; multiple values are supported. Columns represent the attribute columns you want included in your export. It is an optional field. A list of keys can be passed like this : ["key1","key2"]. When you specify those keys, the API will search for the value of each key in every feature and populate it as a separate column in your export. For example, if you want the name of the features, you can pass ["name"]. If you don't supply anything, you will get the default attributes (osm_id, timestamp, changeset, tags). Even if you define your own attribute list, osm_id will be included automatically.
Default output : osm_id, tags, changeset, timestamp
osmElements : An optional field; multiple values are supported as a list. You can specify which type of OSM element you want to build your query on. It supports all OSM element types : nodes, ways and relations. Especially targeted at Overpass users.
geometryType : A list of values; multiple values are supported. You can filter the features you want based on their geometry type. The currently supported options are listed below; anything else will raise an error. It is an optional field; if it is not supplied, you will get all geometry types.
Conditions when both geometryType and osmElements are supplied : the mapping between the two is validated. For example, you can not select ways in osmElements and point in geometryType, because ways will not have point features. The following mappings are valid : points and nodes, linestring and ways, polygon and ways, and all geometry types with relations. This option lets you specify exactly where you want to look. For example, for highways you can select ways in osmElements and linestring in geometryType; that way exports will be faster. It is specifically targeted at building queries for predefined sets of fields, like the Export Tool, to make exports faster.
Remember : you can just pass the polygon to the API to get everything in that area.
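Putting the parameters above together, a full request body might look like the sketch below; only `geometry` is required, and the coordinates, tag choices, and column names are illustrative:

```python
import json

# Hypothetical request body for /raw-data/current-snapshot/ .
payload = {
    "outputType": "GeoJSON",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [83.97, 28.19], [84.00, 28.19], [84.00, 28.21],
            [83.97, 28.21], [83.97, 28.19],
        ]],
    },
    "osmTags": {"highway": []},        # highway=*
    "columns": ["name", "highway"],    # osm_id is added automatically
    "osmElements": ["ways"],
    "geometryType": ["linestring"],    # valid pairing: ways + linestring
}
print(json.dumps(payload, indent=2))
```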
/raw-data/status/ ~ Gives db last_updated information
This endpoint is used to check the latest status of the raw data database. It is a GET endpoint that provides the last_updated time, computed by reading it from the database and subtracting it from the current time.
Sample response :
{
"last_updated": "Less than a Minute ago"
}
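The "Less than a Minute ago" string suggests the endpoint humanizes the timestamp difference; here is a rough sketch of that logic, with the exact wording and thresholds assumed from the sample response:

```python
from datetime import datetime, timedelta, timezone

def humanize_last_updated(last_updated, now=None):
    """Sketch: subtract the DB's last_updated timestamp from the
    current time and render it as a human-readable status string."""
    now = now or datetime.now(timezone.utc)
    delta = now - last_updated
    if delta < timedelta(minutes=1):
        return "Less than a Minute ago"
    return f"{int(delta.total_seconds() // 60)} Minute(s) ago"
```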
This will be used to display this kind of info in the UI.
/raw-data/exports/{file_name}
This endpoint is used to download files from the server. If the file is not present it will return null. This is the address provided in the raw-data/current-snapshot endpoint response. You don't need to supply a .zip suffix after the filename; currently only zip binding is done, so the suffix is added to the filename automatically, handled by the API itself.
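The described suffix handling can be sketched as a small helper; the function name is hypothetical, not from the PR:

```python
def resolve_export_filename(file_name: str) -> str:
    """Sketch of the described behavior: the caller passes the bare
    file name from the current-snapshot response, and the API appends
    the .zip suffix itself (only zip binding is supported for now)."""
    if not file_name.endswith(".zip"):
        file_name += ".zip"
    return file_name
```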
Benchmarks :
This is just initial load testing with a small number of inputs on current-snapshot with GeoJSON output, tested on an expensive query ~ extracted everything without using filters.
Data available in RDS : Asia and Africa, updated every minute from the planet server
Load testing tool used : Locust
Area used for testing : 5446 sq km
GeoJSON size : approx. 400 MB
No. of features in the area processed : approx. 1M rows
Tested on : dev server with 4 GB of RAM and 30 GB of disk space
We managed to keep the average request response time at approx. 48 sec with 4 users sending a request each second, tested with 69 requests. When the database server was free it came back in 31 sec, and even when maxed out it came back within a max time of 59 sec. The PostgreSQL cache may have played a role here when passing the same polygon, but we can generalize it.
Download the full report here :
report.pdf
It usually takes 15-25 min with our existing tools for the same area with the same features !
How to test :
You can directly test this branch live on this Dev server and dummy UI; you can find more instructions on the UI itself.
Note : This PR has an OSM login authentication feature, but it is disabled right now for testing, so user information is not included in exports. The PR requires ogr2ogr to be installed locally on the machine that hosts the product. Once the PR is merged and running on the production server, we can re-enable the authentication.
cc : @LeenDhondt