datadudes / salesforce2hadoop

Import Salesforce data into Hadoop HDFS in Avro format

DatasetNotFoundException - salesforce2hadoop #4

Closed kanav-narula closed 7 years ago

kanav-narula commented 8 years ago

Hi [@Daan] :

We are using the salesforce2hadoop library to export data from Salesforce to Hadoop, with the command below.

sudo java -jar salesforce2hadoop-assembly-1.0.jar init -u myUsername -p myPassword -b /home/nfsuser/imports/salesforce -w /home/sanjay/Desktop/enterprise.wsdl -s /home/nfsuser/imports/salesforce/account Account

We are getting the following error:

org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI: /home/nfsuser/imports/salesforce/account. Check that JARs for null datasets are on the classpath.

We need your help with this issue.

Thanks, Kanav Narula

daan commented 8 years ago

hi,

Sometimes I receive these messages. I think you are confusing me with someone else, because I don't do Hadoop.

Thanks, daniel


AmritaKv commented 8 years ago

@DandyDev, I am getting the same error with the latest jar. I saw you closed it, but is it resolved?

DonDebonair commented 8 years ago

I closed it because it was an old issue that I initially failed to look at, and I didn't get any follow-up on it. I will reopen it if it's still an issue! One problem, however, is that I no longer have a Salesforce instance to test this project against. But I will see what I can do.

DonDebonair commented 8 years ago

Update: I think I know what's going on. You have to provide a full and valid Hadoop-compliant file path. That means the protocol (the URI scheme) has to be included, both for the base path for the data and for the path to the state file. (See Data import directory.)

Examples:

hdfs://hostname-of-namenode:port/path/to/dir
file:///path/to/dir

The first one writes to HDFS, the second to the local filesystem. Any Hadoop/Kite-compliant filesystem should be supported, although I've only tested HDFS and the local filesystem.
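
Applied to the command from the original report, this would look roughly like the sketch below. The `with_scheme` helper is hypothetical and not part of salesforce2hadoop; it only prefixes a bare absolute path with `file://`. The flags and paths are copied from the report, and the command is printed rather than executed. The WSDL path is left unchanged, on the assumption that only the base path and the state file need a scheme.

```shell
#!/bin/sh
# Hypothetical helper (not part of salesforce2hadoop): make sure a path
# carries an explicit scheme, defaulting to the local filesystem.
with_scheme() {
  case "$1" in
    *://*) printf '%s\n' "$1" ;;         # already a URI (hdfs://, file://, ...): keep as-is
    /*)    printf 'file://%s\n' "$1" ;;  # bare absolute path: becomes file:///...
    *)     echo "expected an absolute path or a URI: $1" >&2; return 1 ;;
  esac
}

BASE_PATH=$(with_scheme /home/nfsuser/imports/salesforce)
STATE_FILE=$(with_scheme /home/nfsuser/imports/salesforce/account)

# Print the resulting command instead of running it.
echo java -jar salesforce2hadoop-assembly-1.0.jar init \
  -u myUsername -p myPassword \
  -b "$BASE_PATH" \
  -w /home/sanjay/Desktop/enterprise.wsdl \
  -s "$STATE_FILE" Account
```

To write to HDFS instead, pass a full `hdfs://hostname-of-namenode:port/path/to/dir` URI for `-b` and `-s`; the helper leaves anything that already has a scheme untouched.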

Can you let me know if that resolves the issue?

AmritaKv commented 8 years ago

@DandyDev You were correct! Specifying the protocol for the local filesystem helped. You can close this one. However, this now leads to another configuration issue: the apiVersion is hardcoded to 32. I am creating a separate issue asking for it to be made configurable, like the base URL.