Open sslavian812 opened 5 years ago
@sslavian812 - What content is present in the file? Can you check whether it is a valid CSV file?
Also, try the latest spark-sftp connector, as we solved a similar issue there.
Hi @samuel-pt, thank you for the answer. I'm still struggling with an NPE while reading the CSV file.
whether it is a valid CSV file?
It's a text file, a regular CSV. I can download it with curl and open it on my local machine.
latest spark-sftp
Upgraded from 1.3 to com.springml:spark-sftp_2.11:1.1.5, but that didn't help.
It seems I'll have to implement something custom: say, download the CSV with apache-commons-vfs, upload it to S3, and then read it into a DataFrame using the standard API.
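A minimal sketch of that "download first, read second" route, in case it helps anyone trying it. It is not the connector's own mechanism. It swaps apache-commons-vfs for the third-party Python package paramiko (an assumption, not something proposed in this thread), and the helper names `fetch_with_paramiko` and `download_then_read`, the host `sftp.example.com`, and the local path are made up for illustration.

```python
import os
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_paramiko(host: str, username: str, password: str,
                        remote_path: str, local_path: str) -> None:
    """Download one file over SFTP (assumes the third-party paramiko package)."""
    import paramiko  # imported lazily so the generic helper below has no extra dependency

    client = paramiko.SSHClient()
    # AutoAddPolicy trusts unknown host keys: fine for a quick test, not for production.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(hostname=host, username=username, password=password)
    try:
        sftp = client.open_sftp()
        try:
            sftp.get(remote_path, local_path)
        finally:
            sftp.close()
    finally:
        client.close()


def download_then_read(fetch: Callable[[str], None],
                       read: Callable[[str], T],
                       local_path: str) -> T:
    """Run the download step, check that the file landed, then hand it to the reader."""
    fetch(local_path)
    if not os.path.isfile(local_path):
        raise FileNotFoundError(f"download did not produce {local_path}")
    return read(local_path)


# Hypothetical usage in a notebook where `spark`, `user` and `pw` already exist:
# df = download_then_read(
#     fetch=lambda p: fetch_with_paramiko(
#         "sftp.example.com", user, pw, "/ppreports/outgoing/MY.CSV", p),
#     read=lambda p: spark.read.csv("file://" + p, header=True),
#     local_path="/tmp/MY.CSV",
# )
```

Keeping the download and the read as separate steps makes it easier to see which one fails. Note that a `file://` path is only readable when the file is visible to every executor (local mode or a single node); on a cluster the file would first have to go to shared storage such as S3, as suggested above.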
Yes, this is an issue; I'm facing it too: java.lang.NullPointerException when reading an existing file. Even upgrading to com.springml:spark-sftp_2.11:1.1.5 didn't help.
Let me know if any other option can be implemented.
Can you provide a sample of the file to be tested?
You can use any file, either CSV or TXT. The connector tries to do two things at the same time, and that's why it fails, which is a bug.
As a workaround, use a try/catch block: in the try, read the file (this copies the data to DBFS but does not create the DataFrame); in the catch, once the data has been copied, load the file into a DataFrame.
This is a temporary solution, but the underlying behavior is a bug.
@AJAnujsharma Can you please provide the code snippet that you used to sftp from databricks. Not sure I get what you are doing in your catch block. Thanks in advance.
That works for me!

```python
# Example SFTP server used here (test.rebex.net).
# `spark` is the notebook's SparkSession; sftp_user and sftp_password must be defined beforehand.
# The try/except is the workaround.
try:
    df = (
        spark.read
        .format("com.springml.spark.sftp")
        .option("host", "test.rebex.net")
        .option("username", sftp_user)
        .option("password", sftp_password)
        .option("fileType", "txt")
        .option("tempLocation", "/dbfs/tmp/")
        .load("/pub/example/readme.txt")
    )
except Exception:
    # If the first attempt fails, retry with a different tempLocation.
    df = (
        spark.read
        .format("com.springml.spark.sftp")
        .option("host", "test.rebex.net")
        .option("username", sftp_user)
        .option("password", sftp_password)
        .option("fileType", "txt")
        .option("tempLocation", "/tmp/")
        .load("/pub/example/readme.txt")
    )
```
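The try/except above repeats the same read with a different tempLocation. A rough sketch of how that retry could be factored into a helper, assuming nothing beyond the options already used in this thread; `load_with_fallback` and `sftp_loader` are hypothetical names, not part of spark-sftp:

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def load_with_fallback(load: Callable[[str], T], temp_locations: Sequence[str]) -> T:
    """Call load(temp_location) for each candidate in order; return the first success."""
    if not temp_locations:
        raise ValueError("temp_locations must not be empty")
    errors = []
    for location in temp_locations:
        try:
            return load(location)
        except Exception as exc:  # in PySpark a JVM exception typically arrives as Py4JJavaError
            errors.append((location, exc))
    tried = [location for location, _ in errors]
    # Chain the last failure so the original stack trace stays visible.
    raise RuntimeError(f"all tempLocation attempts failed: {tried}") from errors[-1][1]


def sftp_loader(spark, host: str, username: str, password: str,
                remote_path: str, file_type: str = "txt") -> Callable[[str], object]:
    """Build a loader that reads remote_path through spark-sftp with a given tempLocation."""
    def load(temp_location: str):
        return (
            spark.read
            .format("com.springml.spark.sftp")
            .option("host", host)
            .option("username", username)
            .option("password", password)
            .option("fileType", file_type)
            .option("tempLocation", temp_location)
            .load(remote_path)
        )
    return load


# Hypothetical usage, mirroring the snippet above:
# load = sftp_loader(spark, "test.rebex.net", sftp_user, sftp_password,
#                    "/pub/example/readme.txt")
# df = load_with_fallback(load, ["/dbfs/tmp/", "/tmp/"])
```

Unlike a bare try/except, this keeps every failure and re-raises with the last one chained, so a persistent NullPointerException is not silently swallowed.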
@sauerch91 were you able to write to an SFTP server? If so, can you give me the snippet please? It seems the library cannot read from the temporary DBFS location.
I am with @yuvapraveen. Does anyone have a working example? I'm struggling to write to an SFTP server and get an NPE with the newest version, 1.0.3.
I'm trying to read a CSV file from an SFTP server and convert it to a DataFrame. The file is in /ppreports/outgoing/MY.CSV. I can see it when logging in with a GUI. I get a NullPointerException.
If I try to read a non-existing file, I predictably get file not found.
Thus, I conclude that the file is there and spark-sftp finds it, but fails to download it. What should I do?