martibosch / detectree-examples

Example computational workflows to classify tree/non-tree pixels in Zurich using DetecTree
GNU General Public License v3.0

Issues running notebooks/aussersihl-canopy.ipynb #4

Closed xanderkoo closed 2 years ago

xanderkoo commented 4 years ago

I'm working on creating an app that marks tree-covered areas on Google Maps, and generates a wildfire risk heatmap on top of that. The project is for a hackathon, so I wanted to piggyback off of the model from the demo for now, but I'm having trouble with getting through the Jupyter notebooks in the demo.

When running the !make aussersihl_tiles block, I get the following output:

python detectree_example/get_tiles_to_download.py data/raw/orthoimg/ortho_sommer14.shp \
        "Zurich Aussersihl" data/interim/aussersihl_tiles/intersecting_tiles.csv --op intersects
2020-09-12 00:02:11,510 - __main__ - INFO - Querying Nominatim for boundaries for `Zurich Aussersihl`
2020-09-12 00:03:04,273 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
2020-09-12 00:03:04,578 - __main__ - INFO - Found 1 intersecting tiles
2020-09-12 00:03:04,580 - __main__ - INFO - Dumped list of intersecting tiles to data/interim/aussersihl_tiles/intersecting_tiles.csv
python detectree_example/make_tiles.py data/interim/aussersihl_tiles/intersecting_tiles.csv data/interim/aussersihl_tiles data/interim/aussersihl_tiles/tiles.csv \
        --nominatim-query "Zurich Aussersihl" \
        --exclude-nominatim-query "Lake Zurich" --keep-raw
100%|█████████████████████████████████████████████| 1/1 [00:10<00:00, 10.76s/it]
2020-09-12 00:03:16,754 - __main__ - INFO - Querying Nominatim for boundaries for `Zurich Aussersihl`
2020-09-12 00:04:35,804 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
2020-09-12 00:04:36,075 - __main__ - INFO - Querying Nominatim for boundaries for `Lake Zurich`
detectree_example/make_tiles.py:152: UserWarning: CRS mismatch between the CRS of left geometries and the CRS of right geometries.
Use `to_crs()` to reproject one of the input geometries to match the CRS of the other.

Left CRS: +proj=somerc +lat_0=46.95240555555556 +lon_0=7.439 ...
Right CRS: None

  output_tiles_ser = gpd.sjoin(tiles_gdf,
2020-09-12 00:07:47,291 - __main__ - INFO - removed 24 tiles that do not intersect with the extent of Zurich Aussersihl
2020-09-12 00:07:47,292 - __main__ - INFO - Dumped list of downscaled tiles to data/interim/aussersihl_tiles/tiles.csv

...and my directory looks like this at the end. The .tif file is just a black rectangle, and intersecting_tiles.csv and tiles.csv each contain only one line (331,1091-233.tif and 3,data/interim/aussersihl_tiles/1091-233_03.tif, respectively).

[Screenshot: directory contents after the run, 2020-09-12 12:11:37 AM]

Midway through the Makefile run, I'm seeing what's below, which is just another bunch of black rectangles.

[Screenshot: tiles mid-execution, all black rectangles, 2020-09-12 12:07:01 AM]

(Perhaps as a result) I am also seeing the following output from the train/test split block:

[Screenshot: output of the train/test split block, 2020-09-12 12:15:42 AM]

Running the !make tiles block in background.ipynb yields this output:

mkdir data/interim/tiles
python detectree_example/get_tiles_to_download.py data/raw/orthoimg/ortho_sommer14.shp Zurich   data/interim/tiles/intersecting_tiles.csv
2020-09-12 00:35:51,832 - __main__ - INFO - Querying Nominatim for boundaries for `Zurich`
2020-09-12 00:38:56,881 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
2020-09-12 00:39:01,075 - __main__ - INFO - Found 0 intersecting tiles
2020-09-12 00:39:01,077 - __main__ - INFO - Dumped list of intersecting tiles to data/interim/tiles/intersecting_tiles.csv
python detectree_example/make_tiles.py data/interim/tiles/intersecting_tiles.csv data/interim/tiles data/interim/tiles/downsampled_tiles.csv
Traceback (most recent call last):
  File "detectree_example/make_tiles.py", line 174, in <module>
    main()
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "detectree_example/make_tiles.py", line 71, in main
    tile_filenames = pd.read_csv(intersecting_tiles_csv_filepath,
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 452, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 936, in __init__
    self._make_engine(self.engine)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1168, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1998, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 540, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
make: *** [data/interim/tiles/downsampled_tiles.csv] Error 1

There might be several issues here, but am I doing something wrong? Is this an issue with Nominatim? Any advice would be greatly appreciated.
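For context, the EmptyDataError in the traceback is what pandas raises when read_csv is given a file with no content, which is consistent with the "Found 0 intersecting tiles" line logged just before it. A minimal sketch of a guard that checks the tile list before passing it on is shown below. It uses only the standard library; the helper name count_csv_rows and the temporary files are hypothetical stand-ins for illustration and are not part of the repository's scripts.

```python
import csv
import os
import tempfile


def count_csv_rows(csv_filepath):
    """Return the number of non-empty rows in a CSV file (0 for an empty file)."""
    if os.path.getsize(csv_filepath) == 0:
        return 0
    with open(csv_filepath, newline="") as src:
        return sum(1 for row in csv.reader(src) if row)


# Temporary stand-ins for intersecting_tiles.csv (hypothetical files, not the real data)
with tempfile.TemporaryDirectory() as tmp_dir:
    empty_csv = os.path.join(tmp_dir, "intersecting_tiles.csv")
    open(empty_csv, "w").close()  # a zero-byte file, like the one produced above
    n_empty = count_csv_rows(empty_csv)

    one_row_csv = os.path.join(tmp_dir, "one_tile.csv")
    with open(one_row_csv, "w", newline="") as dst:
        csv.writer(dst).writerow([331, "1091-233.tif"])
    n_one = count_csv_rows(one_row_csv)

print("rows in empty file:", n_empty)
print("rows in one-tile file:", n_one)
```

Checking the row count first would make it possible to stop with a clearer message (for example, that the query matched no tiles) instead of the parser error.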

xanderkoo commented 4 years ago

Any simple temporary workarounds (e.g. manually downloading the tiff/csv files, or a pretrained model) would also be very helpful. 😅

martibosch commented 4 years ago

Hello @xanderkoo, thank you for reporting this.

There was indeed a mistake in make_tiles.py: in the spatial join that selects the tiles intersecting the extent of interest, the geometry representing that extent was missing its CRS, which is why all the tiles were being dumped at the end. I have amended this in 1ed25fc.
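To illustrate why a missing or mismatched CRS matters in an intersection test, here is a minimal sketch using plain bounding boxes instead of geopandas. All coordinates are made up for illustration, and it is an assumption that one side was in degrees while the other was in metres; the thread does not say which coordinates the extent actually carried. The sketch only shows the general failure mode: boxes expressed in different coordinate systems cannot overlap numerically.

```python
def bboxes_intersect(a, b):
    """Return True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    a_minx, a_miny, a_maxx, a_maxy = a
    b_minx, b_miny, b_maxx, b_maxy = b
    return (
        a_minx <= b_maxx
        and b_minx <= a_maxx
        and a_miny <= b_maxy
        and b_miny <= a_maxy
    )


# Hypothetical tile extent in a projected CRS (units: metres)
tile_bbox_projected = (680000.0, 246000.0, 681000.0, 247000.0)

# Hypothetical extent of interest left in longitude/latitude (units: degrees)
extent_bbox_lonlat = (8.49, 47.36, 8.54, 47.39)

# The same extent with made-up projected coordinates, as if it had been reprojected
extent_bbox_projected = (679500.0, 246500.0, 683000.0, 249500.0)

mismatched = bboxes_intersect(tile_bbox_projected, extent_bbox_lonlat)
matched = bboxes_intersect(tile_bbox_projected, extent_bbox_projected)

print("different coordinate systems:", mismatched)
print("same coordinate system:", matched)
```

In geopandas, the warning quoted earlier in the thread points at the remedy itself: reproject one of the inputs with to_crs() so that both sides of the join share a CRS.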

Please let me know if this works and/or about any other problems that you might encounter. Best, Martí

xanderkoo commented 4 years ago

Thanks for the quick reply. Running !make tiles in background.ipynb still yields an empty intersecting_tiles.csv and the following console output (which I believe is the same as before):

python detectree_example/get_tiles_to_download.py data/raw/orthoimg/ortho_sommer14.shp Zurich   data/interim/tiles/intersecting_tiles.csv
2020-09-12 08:41:54,745 - __main__ - INFO - Querying Nominatim for boundaries for `Zurich`
2020-09-12 08:45:02,719 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
2020-09-12 08:45:06,603 - __main__ - INFO - Found 0 intersecting tiles
2020-09-12 08:45:06,606 - __main__ - INFO - Dumped list of intersecting tiles to data/interim/tiles/intersecting_tiles.csv
python detectree_example/make_tiles.py data/interim/tiles/intersecting_tiles.csv data/interim/tiles data/interim/tiles/downsampled_tiles.csv
Traceback (most recent call last):
  File "detectree_example/make_tiles.py", line 174, in <module>
    main()
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "detectree_example/make_tiles.py", line 71, in main
    tile_filenames = pd.read_csv(intersecting_tiles_csv_filepath,
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 452, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 936, in __init__
    self._make_engine(self.engine)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1168, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/Users/xander/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1998, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 540, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
make: *** [data/interim/tiles/downsampled_tiles.csv] Error 1

The make block in aussersihl-canopy.ipynb also yields, I think, the same output as before, with the same files in the aussersihl_tiles dir:

python detectree_example/make_tiles.py data/interim/aussersihl_tiles/intersecting_tiles.csv data/interim/aussersihl_tiles data/interim/aussersihl_tiles/tiles.csv \
        --nominatim-query "Zurich Aussersihl" \
        --exclude-nominatim-query "Lake Zurich" --keep-raw
100%|█████████████████████████████████████████████| 1/1 [00:10<00:00, 10.34s/it]
2020-09-12 08:49:49,947 - __main__ - INFO - Querying Nominatim for boundaries for `Zurich Aussersihl`
2020-09-12 08:51:22,074 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
2020-09-12 08:51:22,396 - __main__ - INFO - Querying Nominatim for boundaries for `Lake Zurich`
2020-09-12 08:59:54,858 - __main__ - INFO - removed 24 tiles that do not intersect with the extent of Zurich Aussersihl
2020-09-12 08:59:54,859 - __main__ - INFO - Dumped list of downscaled tiles to data/interim/aussersihl_tiles/tiles.csv

I'm also seeing that the Nominatim queries take upwards of 3-5 minutes each. Is this normal behavior? Thanks again.

martibosch commented 4 years ago

Hello again @xanderkoo,

On my side the commands work properly: they find 14 and 4 intersecting tiles for Zurich (background.ipynb) and Zurich Aussersihl (aussersihl-canopy.ipynb), respectively, and in the latter case 63 tiles that do not intersect the extent are removed. Nominatim queries also take about two seconds on my laptop, so there is definitely something wrong there too.

Therefore, in order to debug this, I'd need more information about your environment, e.g., the output of running conda env export.
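As a lighter complement to conda env export, the following sketch can be run in a notebook cell to report which interpreter the kernel is using and which versions of a few relevant packages it sees. This can be worth checking because the tracebacks above show paths under anaconda3/lib/python3.8. The helper name package_versions and the choice of packages are assumptions made for illustration, and importlib.metadata requires Python 3.8 or later.

```python
import sys
from importlib import metadata  # standard library from Python 3.8 onwards


def package_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Interpreter actually running this code (may differ from the activated conda env)
print("interpreter:", sys.executable)
print("python:", sys.version.split()[0])

# Package list chosen for illustration only
for name, version in package_versions(
    ["osmnx", "geopandas", "pandas", "detectree"]
).items():
    print(f"{name}: {version or 'not installed'}")
```

If the interpreter path does not point inside the expected environment, the notebook kernel is probably running from a different installation than the one that was exported.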

Best, Martí

xanderkoo commented 4 years ago

Here's the output of conda env export:

name: detectree
channels:
  - conda-forge
  - defaults
dependencies:
  - affine=2.3.0=py_0
  - attrs=20.2.0=pyh9f0ad1d_0
  - boost-cpp=1.72.0=hdf9ef73_0
  - branca=0.4.1=py_0
  - brotlipy=0.7.0=py37h9bfed18_1000
  - bzip2=1.0.8=haf1e3a3_3
  - c-ares=1.16.1=haf1e3a3_3
  - ca-certificates=2020.6.20=hecda079_0
  - cairo=1.16.0=hec6a9b0_1003
  - certifi=2020.6.20=py37hc8dfbb8_0
  - cffi=1.14.1=py37hf5b7abd_0
  - cfitsio=3.470=hdf94aef_6
  - chardet=3.0.4=py37hc8dfbb8_1006
  - click=7.1.2=pyh9f0ad1d_0
  - click-plugins=1.1.1=py_0
  - cligj=0.5.0=py_0
  - contextily=1.0.0=py_0
  - cryptography=3.1=py37h94e4008_0
  - curl=7.71.1=hcb81553_5
  - cycler=0.10.0=py_2
  - decorator=4.4.2=py_0
  - descartes=1.1.0=py_4
  - expat=2.2.9=hb1e8313_2
  - fiona=1.8.13=py37h9c7205d_1
  - folium=0.11.0=py_0
  - fontconfig=2.13.1=h6b1039f_1001
  - freetype=2.10.2=h8da9a1a_0
  - freexl=1.0.5=h0b31af3_1002
  - gdal=3.0.4=py37h08e9697_10
  - geographiclib=1.50=py_0
  - geopandas=0.8.1=py_0
  - geopy=2.0.0=pyh9f0ad1d_0
  - geos=3.8.1=h4a8c4bd_0
  - geotiff=1.6.0=hd8796ba_0
  - gettext=0.19.8.1=h46ab8bc_1002
  - giflib=5.2.1=h0b31af3_2
  - glib=2.66.0=hdb5fb44_0
  - hdf4=4.2.13=h84186c3_1003
  - hdf5=1.10.6=nompi_haae91d6_101
  - icu=64.2=h6de7cb9_1
  - idna=2.10=pyh9f0ad1d_0
  - jinja2=2.11.2=pyh9f0ad1d_0
  - joblib=0.16.0=py_0
  - jpeg=9d=h0b31af3_0
  - json-c=0.13.1=h575e443_1002
  - kealib=1.4.13=h40102fb_1
  - kiwisolver=1.2.0=py37ha1cc60f_0
  - krb5=1.17.1=h75d18d8_3
  - laspy=1.7.0=pyh5ca1d4c_0
  - lastools=20171231=h0a44026_1000
  - laszip=3.4.3=h4a8c4bd_1
  - lcms2=2.11=h174193d_0
  - libblas=3.8.0=17_openblas
  - libcblas=3.8.0=17_openblas
  - libcurl=7.71.1=h9bf37e3_5
  - libcxx=10.0.1=h5f48129_0
  - libdap4=3.20.6=h993cace_1
  - libedit=3.1.20191231=h0678c8f_2
  - libev=4.33=haf1e3a3_1
  - libffi=3.2.1=hb1e8313_1007
  - libgdal=3.0.4=h242383b_10
  - libgfortran=4.0.0=2
  - libiconv=1.16=haf1e3a3_0
  - libkml=1.3.0=h88bc94a_1012
  - liblapack=3.8.0=17_openblas
  - libnetcdf=4.7.4=nompi_hc5b2cf3_105
  - libnghttp2=1.41.0=h7580e61_2
  - libopenblas=0.3.10=openmp_h63d9170_4
  - libpng=1.6.37=hb0a8c7a_2
  - libpq=12.3=h489d428_0
  - libspatialindex=1.9.3=h4a8c4bd_3
  - libspatialite=4.3.0a=h658e6c1_1038
  - libssh2=1.9.0=h8a08a2b_5
  - libtiff=4.1.0=h2ae36a8_6
  - libwebp-base=1.1.0=h0b31af3_3
  - libxml2=2.9.10=h53d96d6_0
  - llvm-openmp=10.0.1=h28b9765_0
  - lz4-c=1.9.2=hb1e8313_3
  - markupsafe=1.1.1=py37h9bfed18_1
  - matplotlib=3.3.1=1
  - matplotlib-base=3.3.1=py37h886f89f_1
  - mercantile=1.1.6=pyh9f0ad1d_0
  - munch=2.5.0=py_0
  - ncurses=6.2=hb1e8313_1
  - networkx=2.5=py_0
  - numpy=1.16.4=py37h6b0580a_0
  - olefile=0.46=py_0
  - openjpeg=2.3.1=h254dc36_3
  - openssl=1.1.1g=haf1e3a3_1
  - osmnx=0.16.0=pyh9f0ad1d_0
  - pandas=1.1.2=py37hdadc0f0_0
  - pcre=8.44=h4a8c4bd_0
  - pillow=7.2.0=py37hfd78ece_1
  - pip=20.2.3=py_0
  - pixman=0.38.0=h01d97ff_1003
  - poppler=0.87.0=h3232a60_1
  - poppler-data=0.4.9=1
  - postgresql=12.3=h62ab893_0
  - proj=7.0.0=h45baca5_5
  - pycparser=2.20=pyh9f0ad1d_2
  - pyopenssl=19.1.0=py_1
  - pyparsing=2.4.7=pyh9f0ad1d_0
  - pyproj=2.6.1.post1=py37hbd4ead9_0
  - pysocks=1.7.1=py37hc8dfbb8_1
  - python=3.7.8=hc9dea61_1_cpython
  - python-dateutil=2.8.1=py_0
  - python-slugify=4.0.1=pyh9f0ad1d_0
  - python_abi=3.7=1_cp37m
  - pytz=2020.1=pyh9f0ad1d_0
  - rasterio=1.1.5=py37hfd922e9_1
  - readline=8.0=h0678c8f_2
  - requests=2.24.0=pyh9f0ad1d_0
  - rtree=0.9.4=py37h8526d28_1
  - scikit-learn=0.23.1=py37hf5857e7_0
  - scipy=1.5.1=py37hce1b9e5_0
  - setuptools=49.6.0=py37hc8dfbb8_0
  - shapely=1.7.0=py37hfcf0db4_3
  - six=1.15.0=pyh9f0ad1d_0
  - snuggs=1.4.7=py_0
  - sqlite=3.33.0=h960bd1c_0
  - tbb=2019.9=ha1b3eb9_1
  - text-unidecode=1.3=py_0
  - threadpoolctl=2.1.0=pyh5ca1d4c_0
  - tiledb=1.7.7=h84aa2a7_3
  - tk=8.6.10=hb0a8c7a_0
  - tornado=6.0.4=py37h9bfed18_1
  - tzcode=2020a=h0b31af3_0
  - unidecode=1.1.1=py_0
  - urllib3=1.25.10=py_0
  - wheel=0.35.1=pyh9f0ad1d_0
  - xerces-c=3.2.2=h8f8adb3_1004
  - xz=5.2.5=haf1e3a3_1
  - zlib=1.2.11=h7795811_1009
  - zstd=1.4.5=h289c70a_2
  - pip:
    - appnope==0.1.0
    - backcall==0.2.0
    - cloudpickle==1.6.0
    - dask==2.26.0
    - detectree==0.3.1
    - distributed==2.26.0
    - heapdict==1.0.1
    - imageio==2.9.0
    - ipykernel==5.3.4
    - ipython==7.18.1
    - ipython-genutils==0.2.0
    - jedi==0.17.2
    - jupyter-client==6.1.7
    - jupyter-core==4.6.3
    - msgpack==1.0.0
    - parso==0.7.1
    - pexpect==4.8.0
    - pickleshare==0.7.5
    - prompt-toolkit==3.0.7
    - psutil==5.7.2
    - ptyprocess==0.6.0
    - pygments==2.6.1
    - pymaxflow==1.2.12
    - pywavelets==1.1.1
    - pyyaml==5.3.1
    - pyzmq==19.0.2
    - scikit-image==0.17.2
    - sortedcontainers==2.2.2
    - tblib==1.7.0
    - tifffile==2020.9.3
    - toolz==0.10.0
    - tqdm==4.48.2
    - traitlets==5.0.4
    - wcwidth==0.2.5
    - zict==2.0.0
prefix: /Users/xander/opt/anaconda3/envs/detectree

Using openstreetmap.org (which should correspond to Nominatim) takes about 2 seconds for me as well, so I'm not sure about that.

martibosch commented 4 years ago

Hello again @xanderkoo,

Apparently, recent releases of osmnx include changes that are incompatible with previous releases. I have amended this repository so that it works with osmnx>=0.15, and I have also corrected some other mistakes.
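A minimal sketch of checking a version string against the osmnx>=0.15 bound mentioned here, using only the standard library. The helpers parse_version and meets_minimum are hypothetical names introduced for illustration and are not part of osmnx or this repository. The version strings in the example are plain inputs; "0.16.0" is the osmnx version listed in the exported environment above.

```python
import re


def parse_version(version):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


MIN_OSMNX = (0, 15)  # lower bound quoted in the comment above


def meets_minimum(version, minimum=MIN_OSMNX):
    """Return True if the parsed version is at least the given minimum."""
    return parse_version(version) >= minimum


for candidate in ["0.16.0", "0.15", "0.14.1"]:
    print(candidate, meets_minimum(candidate))
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where "0.9" sorts after "0.15" lexicographically.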

Please update your copy of the repository to c7db1ec and let me know if you encounter any other problem. Thank you.

Best, Martí

martibosch commented 4 years ago

Hello @xanderkoo,

Is there any news regarding this issue? Otherwise, may I close it?

Best, Martí

xanderkoo commented 4 years ago

Hi Martí, sorry for the late response--I just saw this message in my inbox. I am no longer actively using DetecTree, but I'll test it out later this week to see if the fix works.

martibosch commented 2 years ago

Hello! I am closing this due to inactivity, but feel free to reopen.