Open CEBerndsen opened 5 years ago
Thanks for reporting.
The latest version on GitHub (v0.2.3) will re-attempt the request a few times if the API throws a 500 error. Can you try updating and see if the problem persists?
I haven't tried uploading that many files before so I'll do a little testing on my end as well.
Updated to v0.2.3 and tried uploading again in two ways. First, I tried simply re-running the batch update code and got this error:
```r
osf_retrieve_node("3r7nw") %>%
  osf_ls_files() %>%
  filter(name == "2019-3-4 tetramer with amylose in KCl") %>%
  walk(files, osf_upload, x = ., overwrite = TRUE)
```
```
Error in data.matrix(data) : (list) object cannot be coerced to type 'double'
In addition: Warning messages:
1: In data.matrix(data) : NAs introduced by coercion
2: In data.matrix(data) : NAs introduced by coercion
```
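One possible explanation for this particular error (a hypothesis, not confirmed in the thread): if dplyr was not attached in that session, the bare `filter()` call would dispatch to `stats::filter()`, which runs a data frame through `data.matrix()` and fails on list columns like the metadata column of an `osf_tbl`. A minimal sketch that reproduces the same class of error with base R only:

```r
# Mimic an osf_tbl: a character column plus a list column.
df <- data.frame(name = c("a", "b"), stringsAsFactors = FALSE)
df$meta <- list(list(1), list(2))

# stats::filter() coerces its input via data.matrix(), which cannot
# handle the list column; the character column triggers the NA warnings.
res <- suppressWarnings(tryCatch(
  stats::filter(df, df$name == "a"),
  error = function(e) conditionMessage(e)
))

# Namespacing the intended call avoids the ambiguity:
# dplyr::filter(df, name == "a")
```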
So I then deleted the folder via the web interface and re-tried the first code from above, which makes the directory and then uploads files to it. I got the 500 error again and only 400 files uploaded.
```r
osf_retrieve_node("3r7nw") %>%
  osf_mkdir(., path = "2019-3-4 tetramer with amylose in KCl") %>%
  walk(files, osf_upload, x = .)
```

```
Error: Internal Server Error HTTP status code 500.
```
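Until a server-side fix lands, one workaround for intermittent 500s is to wrap each upload in a small retry helper. A base-R sketch (the helper and its defaults are illustrative, not part of osfr):

```r
# Call `action` up to `times` times, sleeping between attempts,
# and rethrow the last error if every attempt fails.
with_retries <- function(action, times = 3, wait = 2) {
  last_err <- NULL
  for (i in seq_len(times)) {
    out <- tryCatch(action(), error = function(e) e)
    if (!inherits(out, "error")) return(out)
    last_err <- out
    if (i < times) Sys.sleep(wait * i)  # linear backoff between attempts
  }
  stop(last_err)
}

# With the thread's pipeline it might look like (untested, network-bound):
# walk(files, function(f) with_retries(function() osf_upload(proj, f)))
```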
Let me know if I can try other approaches and help. Thanks!
Thanks. I ran a couple of tests that attempted to upload 1500 files and was able to reproduce the same error. Unfortunately, sometimes it worked and sometimes it failed. I'm going to leave this open for now. HTTP status codes in the 500 range correspond to "unexpected errors" on the server, so we may need to loop in one of the OSF devs to ultimately solve it. In the meantime, this highlighted some inefficiencies in osf_upload(); addressing them may partially mitigate the issue by reducing the number of API calls made.
Are many of these files relatively small? Like in the "seconds or less to upload" size?
They tend to be 10 KB to 5 MB, just lots of them.
Hi @brianjgeiger, thanks for checking into this. I used hundreds of small text files in my testing. Are you thinking it's a rate-limiting issue?
Hi @aaronwolen, no, I think it's because we have an inefficiency or two in capturing provenance data for file uploads, and it's causing the thread to eventually time out. It should be fixed in an upcoming version, but I don't have a date on that yet. But slowing down the requests will definitely keep you from seeing the error.
Thanks for the info.
> It should be fixed in an upcoming version
Is there a relevant PR or Issue I can monitor to determine when it's fixed?
In the meantime, do you have recommendations for parameters I should use to moderate requests (e.g., delay x seconds for every x files)?
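One way to implement the "pause every n files" idea is to upload in chunks with a sleep between them. A sketch in base R (the helper name, chunk size, and delay are placeholders, not tested recommendations):

```r
# Split `files` into chunks of `n` and rest `delay` seconds between chunks.
upload_in_batches <- function(files, upload, n = 50, delay = 5) {
  chunks <- split(files, ceiling(seq_along(files) / n))
  for (i in seq_along(chunks)) {
    lapply(chunks[[i]], upload)
    if (i < length(chunks)) Sys.sleep(delay)  # pause between batches
  }
  invisible(files)
}

# With osfr it might be called as (untested, network-bound):
# upload_in_batches(files, function(f) osf_upload(proj, f, overwrite = TRUE))
```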
When using the osf_upload function in combination with purrr::walk I received an inconsistent error. I could upload ~50 files with no problems.
Later, when trying the same basic code with a much larger directory (~1000 files), only 700 files uploaded before I received this message:
Code that failed with the error above:
I adjusted the code figuring it was a timeout issue and tried to complete the upload with:
and got this error:
Note: `overwrite = FALSE` failed to work, which is why overwrite is set to `TRUE`. As I stated originally, the same basic code worked for 50 files, but larger uploads failed to fully complete.
I've enjoyed using the package and this won't stop me from using it; 1000 files is a standard project size for me, so batch uploads without having to use the web interface are really useful.