Open Zihonglee opened 4 months ago
The weed detection workflow requires the input image to be georeferenced. The output is a set of shapefiles, so the workflow needs to be able to map each pixel to a location. The bounding geometry will also be used to crop the parts of the image used in the workflow so you need to ensure the locations overlap at some point.
If you'd like to test the workflow out with this image, you can convert the png file to a tif with a mock geometry using the GDAL library and generate a bounding shapefile with GeoPandas.
Hi Alex,
Thanks for the suggestion. I tried the method above to generate a bounding shapefile for the image. Below are the details of how I processed my image and generated the shapefile.
I created two scripts: convertor.py, which converts a PNG file to a TIFF file, and convert_tif_shp.py, which generates a bounding shapefile from the TIFF file. I used the same image as shown above for this workflow.
convertor.py
from osgeo import gdal
input_file = "./weed.jpg"
output_file = "./weed.tif"
try:
    ds = gdal.Open(input_file)
    if ds is None:
        raise Exception(f"Failed to open {input_file}")
    gt = gdal.Translate(output_file, ds,
        outputBounds=[-117.04671718692339, 47.03455818199527, -117.04260145498948, 47.03632996899851],
        outputSRS="EPSG:4326"
    )
    gt = None
    print(f"Conversion successful. Output file saved as {output_file}")
except Exception as e:
    print(e)
In convertor.py, the outputBounds values are exactly the bounds from the GeoJSON file at "https://github.com/microsoft/farmvibes-ai/blob/main/notebooks/heatmaps/sensor_farm_boundary.geojson".
convert_tif_shp.py
import geopandas as gpd
import rasterio
from shapely.geometry import box
tiff_file = "./weed.tif"
output_file = "./bounding.shp"
with rasterio.open(tiff_file) as src:
    left, bottom, right, top = src.bounds
    print(f"left: {left}, r: {right}, bottom: {bottom}, t: {top}")
    bbox_polygon = box(left, bottom, right, top)
    crs = src.crs.to_string()
gdf = gpd.GeoDataFrame({'geometry': [bbox_polygon]}, crs=crs)
gdf.to_file(output_file)
After generating my shapefile, I pass the location where it can be downloaded (http://10.42.0.1:8000/testfarmvibe/bounding.shp) as the URL parameter. However, this time I get a different error: "ValueError: Could not find raster asset in asset list: [AssetVibe(type=None, id='cdc5cc486fedabc761bd5ecacc440832ed7ae4c84c68ee44f812e2eb96476cb7', path_or_url='/mnt/data/assets/cdc5cc486fedabc761bd5ecacc440832ed7ae4c84c68ee44f812e2eb96476cb7/bounding_box.shp', _is_local=True, _local_path='/mnt/data/assets/cdc5cc486fedabc761bd5ecacc440832ed7ae4c84c68ee44f812e2eb96476cb7/bounding_box.shp')]."
If the code above is not correct, could you provide an example shapefile or a script that generates a shapefile from an image? Thanks again for all the help.
This worked for me given the posted image. Let me know if you still have issues running the workflow.
from osgeo import gdal, osr
from shapely.geometry import Polygon
import geopandas as gpd
def convert_png_to_geotiff(png_path, output_path, transform):
    """Converts a png to a geotiff given a geotransform"""
    # Get the number of bands, xsize, and ysize from the PNG file
    src_ds = gdal.Open(png_path)
    bands = src_ds.RasterCount
    xsize = src_ds.RasterXSize
    ysize = src_ds.RasterYSize
    # Create a new TIFF file
    dst_ds = gdal.GetDriverByName('GTiff').Create(output_path, xsize, ysize, bands, gdal.GDT_Byte)
    # Set the geotransform and projection
    dst_ds.SetGeoTransform(transform)
    srs = osr.SpatialReference()
    srs.ImportFromEPSG(4326)  # WGS84
    dst_ds.SetProjection(srs.ExportToWkt())
    # Write to file
    for i in range(bands):
        band = src_ds.GetRasterBand(i + 1)
        data = band.ReadAsArray()
        dst_ds.GetRasterBand(i + 1).WriteArray(data)
    # Close the datasets
    src_ds = None
    dst_ds = None

def create_shapefile(tiff_path, shapefile_path):
    """Creates a shapefile bounding a geotiff"""
    ds = gdal.Open(tiff_path)
    # Get the bounding box as a Shapely Polygon
    gt = ds.GetGeoTransform()
    ulx = gt[0]
    uly = gt[3]
    lrx = ulx + ds.RasterXSize * gt[1]
    lry = uly + ds.RasterYSize * gt[5]
    bounding_box = Polygon([(ulx, uly), (lrx, uly), (lrx, lry), (ulx, lry)])
    # Close the dataset
    ds = None
    # Save the GeoDataFrame as a shapefile
    gdf = gpd.GeoDataFrame(index=[0], crs="EPSG:4326", geometry=[bounding_box])
    gdf.to_file(shapefile_path)

if __name__ == "__main__":
    png_file = "Untitled.png"
    tiff_file = "output.tif"
    shp_file = "output.shp"
    transform = [0, 1, 0, 0, 0, -1]  # This is a mock geotransform
    convert_png_to_geotiff(png_file, tiff_file, transform)
    create_shapefile(tiff_file, shp_file)
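For reference, the bounding-box arithmetic in create_shapefile can be checked without GDAL. The following is a minimal sketch, assuming a north-up raster with no rotation terms; bounds_from_geotransform is a hypothetical helper name (not part of GDAL), and the 640x480 image size is made up for illustration.

```python
def bounds_from_geotransform(gt, xsize, ysize):
    """Return (left, bottom, right, top) for a north-up raster.

    gt follows the GDAL six-element convention:
    (origin_x, pixel_width, row_rotation, origin_y, column_rotation, pixel_height),
    where pixel_height is negative for north-up images.
    """
    if gt[2] != 0 or gt[4] != 0:
        raise ValueError("rotated geotransforms are not handled in this sketch")
    ulx, uly = gt[0], gt[3]
    lrx = ulx + xsize * gt[1]
    lry = uly + ysize * gt[5]
    # Sort the corners so the result is valid even if a pixel size has an unexpected sign
    return (min(ulx, lrx), min(uly, lry), max(ulx, lrx), max(uly, lry))


# Mock geotransform from the script, applied to a hypothetical 640x480 image
print(bounds_from_geotransform([0, 1, 0, 0, 0, -1], 640, 480))  # (0, -480, 640, 0)
```

Whatever geometry is passed to the workflow has to intersect this box, which is why the boundary shapefile is derived from the same raster.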
Hi Alex,
After giving the above code a try, I am getting a different error (RuntimeError: Failed to run op weed_detection in workflow run id c58f031a-5864-4c34-b13b-22586d62b747 for input with message id 00-c58f031a58644c34b13b22586d62b747-c1a11fe835ffc40d-01. Error description: <class 'RuntimeError'>: ValueError('Input shapes do not overlap raster.') ValueError: Input shapes do not overlap raster.)
Could you share an example of your GeoJSON file? Just curious, is this where you got your GeoJSON file (https://github.com/microsoft/farmvibes-ai/blob/main/notebooks/heatmaps/sensor_farm_boundary.geojson)?
Best, Vincent
I did not use a geojson file; I used the shapefile output from the above script as the boundary for the weed detection workflow.
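The "Input shapes do not overlap raster" error suggests that the boundary geometry and the raster extent did not intersect. A quick sanity check before submitting a run can be sketched in plain Python, assuming both boxes are expressed in the same CRS. boxes_overlap is a hypothetical helper, the raster bounds are those of a made-up 640x480 image with the mock geotransform, and the farm bounds are the values quoted earlier in this thread.

```python
def boxes_overlap(a, b):
    """True if two (left, bottom, right, top) boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


# Hypothetical extent of a raster written with the mock geotransform [0, 1, 0, 0, 0, -1]
raster_bounds = (0, -480, 640, 0)
# Bounds quoted earlier in the thread (from sensor_farm_boundary.geojson)
farm_bounds = (-117.04671718692339, 47.03455818199527,
               -117.04260145498948, 47.03632996899851)

print(boxes_overlap(raster_bounds, raster_bounds))  # True
print(boxes_overlap(raster_bounds, farm_bounds))    # False
```

A boundary taken from an unrelated location fails this check, whereas a boundary generated from the raster itself passes.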
Below is my weed.py, which is similar to the notebook example.
from datetime import datetime
from fiona.crs import to_string
import geopandas as gpd
from shapely import geometry as shpg
from vibe_core.client import get_default_vibe_client
from vibe_core.data import ExternalReferenceList
client = get_default_vibe_client()
boundary_shape_file = "http://10.42.0.1:8000/testfarmvibe/micro_help/output.shp"
now = datetime.now()
data_frame = gpd.read_file(boundary_shape_file).to_crs("4326")
assert data_frame is not None
geometry = shpg.mapping(data_frame.geometry.iloc[0])
inputs = ExternalReferenceList(id=url_hash, time_range=(now, now), geometry=geometry, assets=[], urls=[])
params = {"bands": [], "alpha_index": -1, "simplify": "none"}
try:
    run = client.run(workflow='farm_ai/agriculture/weed_detection', name="weed_detection_example", input_data=inputs, parameters=params)
    run.monitor()
except Exception as e:
    print(e)
output = run.output
dv = output['result'][0]
asset = dv.assets[0]
archive_path = asset.path_or_url
After running the above workflow with the shapefile generated by the code you shared, I get the following error.
Traceback (most recent call last):
File "fiona/ogrext.pyx", line 136, in fiona.ogrext.gdal_open_vector
File "fiona/_err.pyx", line 291, in fiona._err.exc_wrap_pointer
fiona._err.CPLE_OpenFailedError: '/vsimem/c3558058c97c4a7881b67db42f46f6fb' not recognized as a supported file format.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test_weed.py", line 19, in <module>
data_frame = gpd.read_file(boundary_shape_file).to_crs("4326")
File "/home/ara/.local/lib/python3.8/site-packages/geopandas/io/file.py", line 281, in _read_file
return _read_file_fiona(
File "/home/ara/.local/lib/python3.8/site-packages/geopandas/io/file.py", line 322, in _read_file_fiona
with reader(path_or_bytes, **kwargs) as features:
File "/home/ara/.local/lib/python3.8/site-packages/fiona/collection.py", line 783, in __init__
super().__init__(self.virtual_file, vsi=filetype, **kwds)
File "/home/ara/.local/lib/python3.8/site-packages/fiona/collection.py", line 243, in __init__
self.session.start(self, **kwargs)
File "fiona/ogrext.pyx", line 588, in fiona.ogrext.Session.start
File "fiona/ogrext.pyx", line 143, in fiona.ogrext.gdal_open_vector
fiona.errors.DriverError: '/vsimem/c3558058c97c4a7881b67db42f46f6fb' not recognized as a supported file format.
Apart from the above error: if I generate my own shapefile, do I still need to pass a URL in the notebook? Would you mind sharing your workflow file so that I can compare and understand better?
Best, Vincent
It looks like you're trying to get GeoPandas to read the remote file. You can download the relevant files to the local machine.
import requests
# URLs for all necessary Shapefile components
base_url = "http://10.42.0.1:8000/testfarmvibe/micro_help/"
files = ["output.shp", "output.shx", "output.dbf"]
# Download and save each file locally
for file in files:
    response = requests.get(base_url + file)
    with open(file, "wb") as f:
        f.write(response.content)
# Run rest of the notebook
data_frame = gpd.read_file("output.shp").to_crs("EPSG:4326")
...
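A shapefile is a set of sidecar files sharing one stem, so each component has to be fetched separately. The sketch below builds the list of download URLs; shapefile_component_urls is a hypothetical helper, and including the .prj file (which stores the CRS) is an assumption about what the server exposes alongside the other components.

```python
from urllib.parse import urljoin


def shapefile_component_urls(base_url, stem, extensions=(".shp", ".shx", ".dbf", ".prj")):
    """Build one URL per shapefile component under base_url."""
    # urljoin drops the last path segment unless the base ends with a slash
    base = base_url if base_url.endswith("/") else base_url + "/"
    return [urljoin(base, stem + ext) for ext in extensions]


for url in shapefile_component_urls("http://10.42.0.1:8000/testfarmvibe/micro_help/", "output"):
    print(url)
```

Each URL can then be downloaded with requests.get as in the loop above. Calling response.raise_for_status() before writing the file surfaces a missing component immediately instead of saving an error page to disk.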
After making GeoPandas read output.shp locally, I get the following error.
Could not find raster asset in asset list: {self.assets}
I just realized that I need a URL hash as an input to the workflow. Below is my updated workflow.
from datetime import datetime
from fiona.crs import to_string
import geopandas as gpd
import requests
from shapely import geometry as shpg
from vibe_core.client import get_default_vibe_client
from vibe_core.data import ExternalReferenceList
client = get_default_vibe_client()
base_url = "http://10.24.102.66:8000/testfarmvibe/micro_help/"
files = ["output.shp", "output.shx", "output.dbf"]
# Download and save each file locally
for file in files:
    response = requests.get(base_url + file)
    with open(file, "wb") as f:
        f.write(response.content)
now = datetime.now()
data_frame = gpd.read_file("output.shp")
data_frame.crs = "epsg:4326"
data_frame.to_crs(epsg=4326)
assert data_frame is not None
geometry = shpg.mapping(data_frame.geometry.iloc[0])
url_hash = str(hash(base_url + files[0]))
inputs = ExternalReferenceList(id=url_hash, time_range=(now, now), geometry=geometry, assets=[], urls=[base_url + files[0]])
params = {"bands": [], "alpha_index": -1, "simplify": "none"}
try:
    run = client.run(workflow='farm_ai/agriculture/weed_detection', name="weed_detection_example", input_data=inputs, parameters=params)
    run.monitor()
except Exception as e:
    print(e)
output = run.output
dv = output['result'][0]
asset = dv.assets[0]
archive_path = asset.path_or_url
The urls member of the ExternalReferenceList should contain the location of the raster. In your code, this parameter should be something more like urls=[base_url + "img.tif"].
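As a rough sketch of that change, the arguments can be assembled so that urls points at the raster while geometry still comes from the shapefile. build_reference_kwargs is a hypothetical helper, "output.tif" being served next to the shapefile is an assumption, and the polygon is a made-up stand-in for the shapefile geometry. The sketch also swaps the built-in hash() for hashlib.sha256, since hash() on strings varies between interpreter runs; that substitution is a suggestion, not something the workflow is known to require.

```python
import hashlib
from datetime import datetime


def build_reference_kwargs(raster_url, geometry, when=None):
    """Assemble keyword arguments for ExternalReferenceList with urls set to the raster."""
    when = when or datetime.now()
    return {
        # Stable id derived from the raster URL (assumption: any unique string is acceptable)
        "id": hashlib.sha256(raster_url.encode("utf-8")).hexdigest(),
        "time_range": (when, when),
        "geometry": geometry,
        "assets": [],
        "urls": [raster_url],  # the raster, not the shapefile
    }


base_url = "http://10.24.102.66:8000/testfarmvibe/micro_help/"
# Hypothetical GeoJSON-style geometry; in the script it comes from shpg.mapping(...)
geometry = {"type": "Polygon",
            "coordinates": [[(0, 0), (640, 0), (640, -480), (0, -480), (0, 0)]]}
kwargs = build_reference_kwargs(base_url + "output.tif", geometry)
print(kwargs["urls"])
# With vibe_core installed: inputs = ExternalReferenceList(**kwargs)
```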
In which step did you encounter the bug?
FarmVibes.AI setup
Are you using a local or a remote (AKS) FarmVibes.AI cluster?
Local cluster
Bug description
I have set up my weed detection environment and a server that exposes my local files over a URL so that the workflow can download both the GeoJSON file and the raster image. However, when I execute the code, the workflow unpacks and downloads the inputs, but the weed detection model does not run successfully. I looked into the code and noticed that the output is None because the run status is failed. In addition, I get the error "ValueError: Must pass either crs or epsg." I am not sure what is missing. Any help would be greatly appreciated.
Steps to reproduce the problem
weed.png (I am not sure whether this image could cause an issue for the model)
sensor_farm_boundary.geojson (taken from https://github.com/microsoft/farmvibes-ai/blob/main/notebooks/heatmaps/sensor_farm_boundary.geojson for testing)