locationtech / rasterframes

Geospatial Raster support for Spark DataFrames
http://rasterframes.io
Apache License 2.0

How to resolve "CPLE_OpenFailed(4) "Open failed." /vsihdfs/hdfs://192.168.101.201:9000/Jennifer_hadoop/Yunyao_Data_Set/split_20200613clip/B1.tif: No such file or directory"? #550

Closed JenniferYingyiWu2020 closed 3 years ago

JenniferYingyiWu2020 commented 3 years ago

Hi, I have tried to execute the commands `spark.read.raster('/vsihdfs/hdfs://192.168.101.201:9000/Jennifer_hadoop/Yunyao_Data_Set/split_20200613clip/B1.tif')` and `rf.select(rf_crs("proj_raster").alias("value")).first()`, but the errors below appear:

```
[1 of 1000] FAILURE(3) CPLE_OpenFailed(4) "Open failed." /vsihdfs/hdfs://192.168.101.201:9000/Jennifer_hadoop/Yunyao_Data_Set/split_20200613clip/B1.tif: No such file or directory
[2 of 1000] FAILURE(3) CPLE_OpenFailed(4) "Open failed." /vsihdfs/hdfs://192.168.101.201:9000/Jennifer_hadoop/Yunyao_Data_Set/split_20200613clip/B1.tif: No such file or directory
21/03/24 09:26:38 ERROR Executor: Exception in task 61.0 in stage 9.0 (TID 201)
java.lang.IllegalArgumentException: Error fetching data for one of: GDALRasterSource(/vsihdfs/hdfs://192.168.101.201:9000/Jennifer_hadoop/Yunyao_Data_Set/split_20200613clip/B1.tif)

Caused by: geotrellis.raster.gdal.MalformedDataException: Unable to construct a RasterExtent from the Transformation given. GDAL Error Code: 4
```

I should mention that before executing the above commands I had built a Hadoop cluster on 192.168.101.201, 192.168.101.202 and 192.168.101.203. Among them, 192.168.101.201 is the master, while 192.168.101.202 and 192.168.101.203 are workers. I have also installed GDAL and the RasterFrames environment on all three servers, and the dataset has been uploaded to "hdfs://192.168.101.201:9000".

```
(base) hduser_@jenniferwu-OptiPlex-7070:~$ hdfs dfs -ls hdfs://192.168.101.201:9000/Jennifer_hadoop/Yunyao_Data_Set/split_20200613clip/B1.tif
-rw-r--r--   1 geotrellis supergroup    6712825 2021-03-23 14:17 hdfs://192.168.101.201:9000/Jennifer_hadoop/Yunyao_Data_Set/split_20200613clip/B1.tif
```

Lastly, my Python code to set the Hadoop user is the following:

```python
from pyrasterframes.rasterfunctions import *
from pyrasterframes.utils import create_rf_spark_session
import os  # needed for os.environ below

HADOOP_USER = 'geotrellis'
os.environ["HADOOP_USER_NAME"] = HADOOP_USER

spark = create_rf_spark_session(**{'HADOOP_USER_NAME': HADOOP_USER})
```

So, could you please give me some suggestions on how to resolve the error "/vsihdfs/hdfs://192.168.101.201:9000/Jennifer_hadoop/Yunyao_Data_Set/split_20200613clip/B1.tif: No such file or directory"?
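One way to narrow the failure down is to check whether GDAL itself, outside of Spark, can open the `/vsihdfs/` path on each node. The sketch below assumes the GDAL Python bindings are installed and that the GDAL build includes the HDFS driver; `to_vsihdfs` is just an illustrative helper, not part of RasterFrames or GDAL:

```python
# Sketch: test whether GDAL (independently of Spark/RasterFrames) can open
# the raster through its /vsihdfs/ virtual filesystem.

def to_vsihdfs(hdfs_uri: str) -> str:
    """Prefix an hdfs:// URI with GDAL's /vsihdfs/ virtual filesystem scheme."""
    return "/vsihdfs/" + hdfs_uri

uri = "hdfs://192.168.101.201:9000/Jennifer_hadoop/Yunyao_Data_Set/split_20200613clip/B1.tif"
vsi_path = to_vsihdfs(uri)
print(vsi_path)

try:
    from osgeo import gdal
    gdal.UseExceptions()
    ds = gdal.Open(vsi_path)  # raises if the driver or the file is unavailable
    print("opened:", ds.RasterXSize, "x", ds.RasterYSize)
except Exception as exc:  # ImportError, or a GDAL open failure
    print("GDAL check failed:", exc)
```

If `gdal.Open` fails the same way on a worker node, the problem is likely GDAL's HDFS setup on that node (the `/vsihdfs/` driver typically needs libhdfs and the Hadoop classpath available in the environment) rather than RasterFrames itself.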
metasim commented 3 years ago

As with #549, it's hard to do anything with this without a repeatable test case. If you are able to create an integration test or some other automated mechanism to reproduce this, reopen this issue and associate it with a PR exemplifying the problem.

JenniferYingyiWu2020 commented 3 years ago

Hi @metasim, I have resolved the above issue. Thank you!