GlobalDataverseCommunityConsortium / dataverse-uploader

Upload local folder/directory trees to Dataverse or Clowder repositories.
https://github.com/GlobalDataverseCommunityConsortium/dataverse-uploader/wiki/DVUploader,-a-Command-line-Bulk-Uploader-for-Dataverse
Apache License 2.0

Bad Request #3

Open JoshuaGOB opened 4 years ago

JoshuaGOB commented 4 years ago

Hi, I've been uploading a dataset to our institutional repository and it had been going perfectly until it started throwing this error:

Error response when processing /Users/jgo384/Downloads/Desktop/txtfiles/1995_09se_05.txt : Bad Request {"status":"ERROR","message":"This file already exists in the dataset. 1995_09se_05.txt"}

I've checked, and that file and the others giving this error message are not in fact in the dataset. I'm a little perplexed, since it had already uploaded 1837 files without any issues. I'm attaching the latest log below. Things I've tried: using the -verify flag, the -listonly flag, and the -recurse flag, both individually and in combination. I also regenerated the API key.

DVUploaderLog__1575320306449.log

qqmyers commented 4 years ago

My initial guess would be that the file that is failing has the same content as an earlier one. By default, the DVUploader checks for file existence by name, but I believe Dataverse checks the file hash and won't allow two files with the same content. So this would be consistent with the log you sent. (Using the -verify flag should make DVUploader check the hash - I think you'll get a different error in the log, but it still won't upload the file.)
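If you want to check for content collisions locally before re-running the upload, a quick sketch along these lines could help. This assumes the installation deduplicates by MD5 (Dataverse's default checksum, though an installation may be configured for a different algorithm); the helper names here are illustrative, not part of DVUploader.

```python
# Sketch: find local files under a folder whose *content* (MD5) collides,
# which would trip a checksum-based duplicate check even when filenames differ.
import hashlib
from collections import defaultdict
from pathlib import Path

def md5_of(path, chunk=1 << 20):
    """MD5 of a file, read in chunks so large files don't exhaust memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def find_duplicates(folder):
    """Map each MD5 seen more than once to the list of paths sharing it."""
    by_hash = defaultdict(list)
    for p in Path(folder).rglob("*"):
        if p.is_file():
            by_hash[md5_of(p)].append(str(p))
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}
```

Running `find_duplicates` over the local tree before uploading would flag any files the repository is likely to reject as same-content duplicates.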

One way to verify would be to upload that one file through the Dataverse web interface - if it is because the file has the same content as another in the dataset, you'll see an error there too.

FWIW - there is an open issue at Dataverse to allow files with the same content to be in the same dataset: https://github.com/IQSS/dataverse/issues/4813 . Until that is acted on, if this is your issue, I don't know that there's any useful workaround unless you can change the file (even a space would be enough). It would probably help w.r.t. priority if you leave a comment there.
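The "even a space would be enough" workaround can be sketched as follows - appending a single byte changes the file's checksum, so a checksum-based duplicate check no longer matches. Only do this if a trailing space is genuinely harmless for your data, and keep a copy of the original; the helper names are illustrative.

```python
# Sketch: make a file's checksum differ from an existing duplicate by
# appending one space. Caution: this does modify the file's content.
import hashlib

def md5_of(path):
    """MD5 of a (small) file's full contents."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

def append_space(path):
    """Append a single space so the file's checksum changes."""
    with open(path, "ab") as f:
        f.write(b" ")
```

After `append_space`, the file hashes differently, so the repository should accept it as a distinct file.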

If this isn't the problem you're having, let me know and I can dig further.