SpaceNetChallenge / utilities

Packages intended to assist in the preprocessing of SpaceNet satellite imagery data corpus to a format that is consumable by machine learning algorithms.

Cannot convert geojson to PASCALVOC2012 format using createDataSpaceNet.py #117

Open SCoulY opened 5 years ago

SCoulY commented 5 years ago

Hi, I followed the instructions in README and ran the following command:

python spacenetutilities/scripts/createDataSpaceNet.py /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train --srcImageryDirectory RGB-PanSharpen --outputDirectory /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/annotations --annotationType PASCALVOC2012 --convertTo8Bit --imgSizePix 400

The traceback is like this:

fullpathImageDirectory = /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen
fullpathGeoJsonDirectory = /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/geojson/buildings
[['/home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif', 'RGB-PanSharpen']]
buildings
| 0.00, 0.00, 121.61|
| 0.00,-0.00, 31.42|
| 0.00, 0.00, 1.00|
Creating Chips: 0%| | 0/4 [00:00<?, ?it/s]Creating output file that is 400P x 400L.
Processing /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif [1/1] : 0...10...20...30...40...50...60...70...80...90...100 - done.
Creating output file that is 400P x 400L.
Processing /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif [1/1] : 0...10...20...30...40...50...60...70...80...90...100 - done.
Creating Chips: 50%|██████████████████████████████████████████████████████████ | 2/4 [00:00<00:00, 16.34it/s]Creating output file that is 400P x 400L.
Processing /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif [1/1] : 0...10...20...30...40...50...60...70...80...90...100 - done.
Creating output file that is 400P x 400L.
Processing /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif [1/1] : 0...10...20...30...40...50...60...70...80...90...100 - done.
Creating Chips: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 16.11it/s]
['/home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/annotations/geojson/buildings/buildings__121.6147392_31.4137359.geojson']
Traceback (most recent call last):
  File "spacenetutilities/scripts/createDataSpaceNet.py", line 321, in <module>
    bboxResize= args.boundingBoxResize
  File "spacenetutilities/scripts/createDataSpaceNet.py", line 88, in processChipSummaryList
    bboxResize=bboxResize
  File "/home/yuankunhao/datasets/spacenet/utilities/spacenetutilities/labeltools/pascalVOCLabel.py", line 212, in geoJsonToPASCALVOC2012
    borderValue=255
  File "/home/yuankunhao/datasets/spacenet/utilities/spacenetutilities/labeltools/pascalVOCLabel.py", line 117, in geoJsonToPASCALVOC2012SegmentCls
    source_layer = gpd.read_file(geoJson)
  File "/root/anaconda3/envs/spacenet/lib/python3.7/site-packages/geopandas/io/file.py", line 71, in read_file
    with reader(path_or_bytes, **kwargs) as features:
  File "/root/anaconda3/envs/spacenet/lib/python3.7/site-packages/fiona/env.py", line 397, in wrapper
    return f(*args, **kwargs)
  File "/root/anaconda3/envs/spacenet/lib/python3.7/site-packages/fiona/__init__.py", line 249, in open
    path = parse_path(fp)
  File "/root/anaconda3/envs/spacenet/lib/python3.7/site-packages/fiona/path.py", line 132, in parse_path
    elif path.startswith('/vsi'):
AttributeError: 'list' object has no attribute 'startswith'

The printed line [['/home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif', 'RGB-PanSharpen']] is what I saw when I inspected geoJson at line 117 of pascalVOCLabel.py, in the call source_layer = gpd.read_file(geoJson). Does anyone know what happened and how to convert correctly to PASCALVOC2012 format? Many thanks!

SCoulY commented 5 years ago

From my observation, the geoJson passed into gpd.read_file(geoJson) is a list object, but the function accepts a string path or URL to read. Is this a bug?
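For anyone following along, the failure can be reproduced without geopandas: the traceback ends in a call to path.startswith('/vsi'), which is a str method, so it fails when a list arrives instead. The sketch below shows that and one way to unwrap the value before reading. The helper name unwrap_single_path and the example path are made up for illustration and are not part of spacenetutilities.

```python
def unwrap_single_path(geojson_arg):
    """Hypothetical helper: reduce a str, or a (possibly nested)
    one-element list/tuple, to the single path string it contains."""
    while isinstance(geojson_arg, (list, tuple)):
        if len(geojson_arg) != 1:
            raise ValueError(
                "expected exactly one path, got %d items" % len(geojson_arg)
            )
        geojson_arg = geojson_arg[0]
    return geojson_arg


# Made-up path, wrapped in a list the way the script passes it along.
path_list = ["/data/buildings_example.geojson"]

# Same call that fails inside fiona's parse_path in the traceback:
# lists have no startswith method, so this raises AttributeError.
try:
    path_list.startswith("/vsi")
    message = None
except AttributeError as err:
    message = str(err)

print(message)
print(unwrap_single_path(path_list))
```

The unwrapped string is what would then be handed to gpd.read_file.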

fractional-ray commented 5 years ago

I am getting this same error when running:

python spacenetutilities/scripts/createDataSpaceNet.py --srcImageryDirectory RGB-PanSharpen --outputDirectory /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/annotations --annotationType PASCALVOC2012 --convertTo8Bit --imgSizePix 400

My OS: macOS Sierra, version 10.13.6

nrweir commented 5 years ago

Which version of spacenetutilities are you using? We recommend using the V3 branch for working with those data.

fractional-ray commented 5 years ago

I am on branch spacenetV3. :)

fractional-ray commented 5 years ago

I installed the dependencies, then pulled down the repo (spacenetV3). After that I went to the directory spacenetutilities/scripts and ran the command:

python createDataSpaceNet.py /AOI_2_Vegas/AOI_2_Vegas_Train/ \
    --srcImageryDirectory RGB-PanSharpen \
    --outputDirectory /AOI_2_Vegas/annotations/ \
    --annotationType PASCALVOC2012 \
    --imgSizePix 400

nrweir commented 5 years ago

OK. Good to know. There are a few long-standing issues with this project that will be resolved in a related project to be announced shortly. In the meantime, I'm not entirely sure why your image path is being encapsulated in list(list(path)) rather than just list(path), but that's presumably the issue. I'd start there.

Sorry I can't be of more help!

fractional-ray commented 5 years ago

Okay, sounds good. I will look into this, and if I find a solution I will post it here so that others who run into it have a short-term fix until the announcement comes 😄

fractional-ray commented 5 years ago

So it seems that gpd.read_file(geoJson) is looking for an attribute named 'startswith', but there is no attribute with that name in the json file (example: buildings_AOI_2_Vegas_img2364.geojson).

SCoulY commented 5 years ago

I was using the V3 branch as well. I finally gave up on this and converted the annotations myself.

alexhagen commented 5 years ago

I've forked and added

...
if isinstance(geoJson, list):
    geoJson = geoJson[0]
source_layer = gpd.read_file(geoJson) # this was existing
...

at lines ~116 and ~144 in labeltools/pascalVOCLabel.py. This got me past the error described in this issue, but led me to another issue:

fiona.errors.DriverError: '/qfs/projects/sgdatasc/spacenet/Vegas_processed_train/annotations/geojson/buildings/buildings__-115.3075176_36.1265426997.geojson' not recognized as a supported file format.

I'm going to look to clean up this error and if it works, create a pull request.

alexhagen commented 5 years ago

It looks like the second error corresponds to an empty geojson file, so this issue should be done. I'll create a pull request.
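A quick way to spot such files before they reach gpd.read_file is a plain-Python check. This is only a sketch: the thread does not say whether "empty" means a zero-byte file or a FeatureCollection with no features, so it treats both as empty, and the function name geojson_has_features is made up for illustration.

```python
import json
import os


def geojson_has_features(path):
    """Return True only if the file parses as JSON and lists at least one feature.

    Assumption: "empty" covers zero-byte files, unparseable files,
    and FeatureCollections whose "features" list is empty or missing.
    """
    if os.path.getsize(path) == 0:
        return False
    try:
        with open(path) as f:
            data = json.load(f)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and bool(data.get("features"))
```

Chips for which this returns False could be skipped, or given an empty annotation, instead of being passed to the reader.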

nrweir commented 5 years ago

@alexhagen thanks! We appreciate it.

That DriverError is a common issue for empty geojsons. I'd recommend adding the following block to catch it:

# at head of the file
from fiona.errors import DriverError
from fiona._err import CPLE_OpenFailedError  # old versions of fiona threw this error instead

try:
    source_layer = gpd.read_file(geoJson)
except (DriverError, CPLE_OpenFailedError):
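    # read_file() needs a path argument, so fall back to an empty GeoDataFrame
    # (assumed intent: treat an empty geojson as having no features)
    source_layer = gpd.GeoDataFrame(geometry=[])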