microsoft / RoadDetections

Road detections from Microsoft Maps aerial imagery

tsv file issue #6

Closed sukuchha closed 3 months ago

sukuchha commented 1 year ago

It's not a valid GeoJSON file, so I cannot open it in any GIS software.

Is there a simpler way to convert the downloaded TSV file into a valid GeoJSON file?

alasdairrae commented 1 year ago

Thanks sukuchha - I noticed this too. I only downloaded the Europe and Caribbean files, but they were both TSV rather than GeoJSON.

alasdairrae commented 1 year ago

It just requires a bit more processing, but maxolasersquad replied with information in another comment.

jeffcsauer commented 1 year ago

see: https://gist.github.com/johnwbryant/06b504e2cfb4044c5216a1627ccc6180

alasdairrae commented 1 year ago

Thanks Jeff 👍👍

anisotropi4 commented 1 year ago

For what it's worth I've written some code that splits the region files into GeoPackage (gpkg) country files here.

I've also had a go at hosting some of these on GitHub here under the transport-network directory, but I'm likely to have to remove them and find somewhere else to host them, as I'm 1296% over my Large File Storage limit...

mkmdivy commented 1 year ago

Is there a reason for offering the data as TSV, which needs processing before it can be imported into GIS software? Even capacity-wise, neither GeoJSON nor TSV is a good idea, since neither is compressed at all. Wouldn't the SHP format be more suitable for easy use?

anisotropi4 commented 1 year ago

@mkmdivy FWIW I would suggest GeoPackage (GPKG) over SHP, as it holds all data in a single file, unlike SHP.

Equally, if the files were in CSV/WKT with a header of key,WKT, then standard GIS libraries like GDAL would accept this as input. For example:

$ cat test.csv
key,WKT
GBR,"LINESTRING (-1.388108 53.194664, -1.386799 53.195011, -1.386917 53.195332, -1.38711 53.195686)"
GBR,"LINESTRING (-1.131978 52.654494, -1.131592 52.654181, -1.131538 52.654149)"
GBR,"LINESTRING (-2.118162 53.399298, -2.117958 53.398754, -2.118044 53.398454)"
GBR,"LINESTRING (-1.821392 53.804871, -1.820995 53.805213)"
GBR,"LINESTRING (-0.435398 53.786043, -0.436192 53.785916, -0.435977 53.785466, -0.435151 53.785548)"

This is a valid GIS file:

$ ogrinfo CSV:test.csv -al -so
INFO: Open of `CSV:test.csv'
  using driver `CSV' successful.
Layer name: test
Geometry: Unknown (any)
Feature Count: 5
Extent: (-2.118162, 52.654149) - (-0.435151, 53.805213)
Layer SRS WKT:
(unknown)
key: String (0.0)
WKT: String (0.0)
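A file shaped like test.csv could be produced from the downloaded TSV with a short script. The following is a minimal Python sketch under stated assumptions: each TSV row is a country code, a tab, and then either a bare GeoJSON LineString geometry or a GeoJSON Feature wrapping one. The function names linestring_to_wkt and tsv_to_wkt_csv are hypothetical and chosen only for illustration.

```python
import csv
import io
import json


def linestring_to_wkt(geometry: dict) -> str:
    """Format a GeoJSON LineString geometry as a WKT LINESTRING string."""
    # Only x and y are used; any third (elevation) value is ignored.
    coords = ", ".join(f"{x} {y}" for x, y, *_ in geometry["coordinates"])
    return f"LINESTRING ({coords})"


def tsv_to_wkt_csv(src, dst) -> int:
    """Read TSV lines from src and write key,WKT CSV rows to dst."""
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(["key", "WKT"])
    count = 0
    for line in src:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # Split on the first tab only so the JSON column stays whole.
        code, payload = line.split("\t", 1)
        obj = json.loads(payload)
        # Assumption: the column may hold a Feature or a bare geometry.
        geometry = obj["geometry"] if obj.get("type") == "Feature" else obj
        writer.writerow([code, linestring_to_wkt(geometry)])
        count += 1
    return count


# Small in-memory demonstration using one of the rows shown above.
src = io.StringIO(
    'GBR\t{"type": "LineString", "coordinates": '
    "[[-1.821392, 53.804871], [-1.820995, 53.805213]]}\n"
)
dst = io.StringIO()
tsv_to_wkt_csv(src, dst)
print(dst.getvalue(), end="")
```

For a real download, src and dst would be open file handles rather than in-memory buffers. Because the file is processed one line at a time, memory use should stay small even for the larger regional files.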

To convert the key/WKT file to whatever format is required, you can then use ogr2ogr, for example to GeoJSON:

$ ogr2ogr -f GeoJSON test.json CSV:test.csv -oo KEEP_GEOM_COLUMNS=NO -nln GBR -s_srs EPSG:4326 -t_srs EPSG:4326
$ cat test.json
{"type": "FeatureCollection", "name": "GBR", "crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
"features": [
{ "type": "Feature", "properties": { "key": "GBR" }, "geometry": { "type": "LineString", "coordinates": [ [ -1.388108, 53.194664 ], [ -1.386799, 53.195011 ], [ -1.386917, 53.195332 ], [ -1.38711, 53.195686 ] ] } },
{ "type": "Feature", "properties": { "key": "GBR" }, "geometry": { "type": "LineString", "coordinates": [ [ -1.131978, 52.654494 ], [ -1.131592, 52.654181 ], [ -1.131538, 52.654149 ] ] } },
{ "type": "Feature", "properties": { "key": "GBR" }, "geometry": { "type": "LineString", "coordinates": [ [ -2.118162, 53.399298 ], [ -2.117958, 53.398754 ], [ -2.118044, 53.398454 ] ] } },
{ "type": "Feature", "properties": { "key": "GBR" }, "geometry": { "type": "LineString", "coordinates": [ [ -1.821392, 53.804871 ], [ -1.820995, 53.805213 ] ] } },
{ "type": "Feature", "properties": { "key": "GBR" }, "geometry": { "type": "LineString", "coordinates": [ [ -0.435398, 53.786043 ], [ -0.436192, 53.785916 ], [ -0.435977, 53.785466 ], [ -0.435151, 53.785548 ] ] } }
]}

Or similarly for GeoPackage

$ ogr2ogr -f GPKG test.gpkg CSV:test.csv -oo KEEP_GEOM_COLUMNS=NO -nln GBR -s_srs EPSG:4326 -t_srs EPSG:4326

krassakis commented 1 year ago

Dear anisotropi, can you suggest a software package for running these commands in Windows PowerShell?

anisotropi4 commented 1 year ago

@krassakis As I use a Debian-based version of Linux, I don't really know what to suggest. A quick internet search for "install ogr2ogr on windows" gives an ogr2ogr quick-start guide. The ogrinfo and ogr2ogr tools are part of the GDAL library (https://gdal.org/index.html). My current Road Detection project is at https://github.com/anisotropi4/robin

krassakis commented 1 year ago

Thank you very much.

I have the following problem. Have I typed the input and output paths correctly?

Thank you in advance.

image

ramcandrews commented 1 year ago

Here is another tool to work with this data: https://github.com/rabenojha/microsoft-road-data

anisotropi4 commented 1 year ago

@ramcandrews I have raised an issue with the rabenojha tool, as it worked with the Caribbean Islands dataset but crashed with a memory error on the Europe-Full dataset.

ramcandrews commented 1 year ago

this worked with Caribbean Islands but crashed with a memory error on the Europe-Full dataset

I shouldn't recommend things before I try them. Sorry!

krassakis commented 1 year ago

image

Is it easy to download GRC.zip? It asks me to first download a Git LFS executable.

anisotropi4 commented 1 year ago

@krassakis I was told last week that I had exceeded my GitHub LFS allocation by 1296%, so I suspect the files have been deleted. I am looking at alternative hosting, but that won't be quick. If you would like a copy, it might be quicker to contact me via social media and I'll see what I can do.

MissingRoadsDiscoveryMicrosoft commented 3 months ago

The data format is a TSV (tab-separated values) file with 2 columns: CountryCode (the alpha-3 code of the country where that GeoJSON is located) and a GeoJSON linestring.
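As a rough illustration of that two-column layout, here is a minimal Python sketch that splits each row into its country code and parsed JSON. It assumes one record per line with a single tab between the columns and no header row; read_rows is a hypothetical helper name, and the sample row is made up for the demonstration.

```python
import json
from typing import Iterable, Iterator, Tuple


def read_rows(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (country_code, parsed_json) pairs from TSV lines."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # Split on the first tab only; the JSON column is kept whole.
        code, payload = line.split("\t", 1)
        yield code, json.loads(payload)


# Illustrative row only; real rows come from the downloaded file.
sample = [
    'GBR\t{"type": "LineString", "coordinates": [[-1.82, 53.80], [-1.81, 53.81]]}\n',
]
for code, obj in read_rows(sample):
    print(code, obj["type"], len(obj["coordinates"]))
```

Since read_rows accepts any iterable of lines, an open file handle can be passed directly, which reads the file lazily instead of loading it all at once.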

There is simply too much data to store in one GeoJSON file. One simple way to create a combined GeoJSON is to do something like this (it can all be done manually in Notepad++):

{"type":"FeatureCollection","features":[ linestring1 , linestring2 , ... linestringN ]}

Here is an example I made for the Cayman Islands using the above method, with visualization from geojson.io: image

Otherwise, data from this repo can easily be converted into any other format with simple Python or any other programming language. (You can ask your favorite generative AI to write a script for you.) CountryCode can be used to filter out a single country to try out the data and avoid out-of-memory problems.
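The wrapping and filtering described above could be scripted along these lines. This is a minimal Python sketch, not an official tool: write_feature_collection is a hypothetical name, and the code assumes each row is a country code, a tab, and either a GeoJSON Feature or a bare geometry (bare geometries are wrapped in a Feature so the output is a valid FeatureCollection).

```python
import io
import json


def write_feature_collection(src, dst, country: str) -> int:
    """Stream rows for one country from src into a FeatureCollection on dst."""
    dst.write('{"type":"FeatureCollection","features":[\n')
    count = 0
    for line in src:
        code, sep, payload = line.rstrip("\r\n").partition("\t")
        if not sep or code != country:
            continue  # skip malformed rows and other countries
        obj = json.loads(payload)
        if obj.get("type") != "Feature":
            # Assumption: a bare geometry needs a Feature wrapper.
            obj = {"type": "Feature", "properties": {}, "geometry": obj}
        if count:
            dst.write(",\n")  # separator between features
        dst.write(json.dumps(obj))
        count += 1
    dst.write("\n]}\n")
    return count


# In-memory demonstration with made-up coordinates.
src = io.StringIO(
    'CYM\t{"type": "LineString", "coordinates": [[-81.4, 19.3], [-81.3, 19.3]]}\n'
    'GBR\t{"type": "LineString", "coordinates": [[-1.8, 53.8], [-1.7, 53.8]]}\n'
)
out = io.StringIO()
written = write_feature_collection(src, out, "CYM")
print(written, "feature(s) written")
```

Only one row is held in memory at a time, so filtering a single country out of a large regional file should not need the whole dataset loaded. With real files, src and out would be file handles opened with UTF-8 encoding.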