Closed PietrH closed 2 months ago
Somewhat more minimal reprex:
library(httr2)
url <- "https://aloftdata.s3-eu-west-1.amazonaws.com/baltrad/hdf5/frbla/2021/02/28/frbla_vp_20210228T190000Z_0xb.h5"
reqs <- list(request(url))
resps <- req_perform_parallel(reqs, paths = withr::local_tempfile())
resps[[1]]
#> <httr2_response>
#> GET https://aloftdata.s3-eu-west-1.amazonaws.com/baltrad/hdf5/frbla/2021/02/28/frbla_vp_20210228T190000Z_0xb.h5
#> Status: 200 OK
#> Content-Type: binary/octet-stream
#> Error in if (!resp_has_body(x)) {: missing value where TRUE/FALSE needed
Created on 2024-06-07 with reprex v2.1.0
Looks like the problem is that req_perform_parallel() doesn't create the file if the body is zero bytes. This is probably due to some difference between the single and multithreaded curl API, and could be resolved in Performance$succeed by creating the file if it doesn't exist.
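The "create the file if it doesn't exist" idea can be sketched in base R. This is only an illustration: `ensure_file_exists()` is a hypothetical helper, not part of httr2, and the actual change inside Performance$succeed may look different.

```r
# Hypothetical helper (not httr2 API): make sure a download target exists,
# creating an empty file when a zero-byte body never triggered a write.
ensure_file_exists <- function(path) {
  if (!file.exists(path)) {
    file.create(path)
  }
  invisible(path)
}

# Usage: a path that was never written ends up as an empty file
target <- tempfile(fileext = ".h5")
ensure_file_exists(target)
file.size(target)
```

An existing file is left untouched, so calling the helper after a successful non-empty download should be harmless.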
I'm using sequential downloads as a fallback on failed parallel downloads anyway, so by adding an extra condition to the fallback I was able to work around this issue.
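A rough sketch of what such a fallback might look like, assuming the extra condition is "the target file is missing after the parallel run". The helper names `needs_retry()` and `download_with_fallback()` are made up for illustration and are not the code actually used here.

```r
# Hypothetical helper: indices of downloads that should be retried sequentially,
# either because the parallel request errored or because no file was written.
needs_retry <- function(resps, paths) {
  failed <- vapply(resps, function(r) inherits(r, "error"), logical(1))
  missing <- !file.exists(paths)
  which(failed | missing)
}

# Hypothetical wrapper: parallel first, sequential for anything that needs it.
download_with_fallback <- function(urls, paths) {
  reqs <- lapply(urls, httr2::request)
  # on_error = "continue" returns error objects instead of stopping
  resps <- httr2::req_perform_parallel(reqs, paths = paths, on_error = "continue")
  for (i in needs_retry(resps, paths)) {
    resps[[i]] <- httr2::req_perform(reqs[[i]], path = paths[[i]])
  }
  resps
}
```

The `file.exists()` check is what catches the zero-byte case, since the parallel response itself reports 200 OK.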
Ideally req_perform_parallel() would just create the empty files. Otherwise, an extra error message would be helpful so it's easier to figure out what's going wrong.
I think we can just fix this bug rather than emitting a message.
I'm downloading a bunch of files in parallel; some turn out to be 0 bytes.
Example of file: https://aloftdata.s3-eu-west-1.amazonaws.com/baltrad/hdf5/frbla/2021/02/28/frbla_vp_20210228T190000Z_0xb.h5
This works:
This works as well:
However, this doesn't work in parallel. Initially I thought it might be due to the retry not being allowed in parallel (although the documentation claims it would just get ignored), but it doesn't work with this omitted either:
I trigger a condition in resp_has_body() that doesn't have a clear message:
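One plausible reading of the unclear message, assuming resp_has_body() compares the size of the body file against zero (an assumption about the internals, not confirmed here): the size of a file that was never created is NA, and `if` cannot branch on NA. The base R snippet below reproduces that situation without any network access.

```r
# A path that was never written, standing in for the missing download target
missing_path <- tempfile()

size <- file.size(missing_path)  # NA, because the file does not exist
has_body <- size > 0             # NA as well

# `if` on NA raises the same kind of error seen in the reprex
msg <- tryCatch(
  if (!has_body) "empty" else "has body",
  error = function(e) conditionMessage(e)
)
msg
```

If that assumption holds, creating the empty file would make the size 0 rather than NA, and the check would return FALSE instead of erroring.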