Open everron opened 2 years ago
Hi @everron and thanks for raising this issue. We use Jira to track issues. Would you mind if we moved the conversation there?
Hi, no problem. I found a workaround in the meantime but I was not sure if this was a real issue or not.
Thanks. I think "ignoring SIGPIPE signal" points to an issue with a CLI pipe. This likely happens when a connection is broken/closed/invalid, e.g. when an R worker crashes, and might not have anything to do with S3. "SIGPIPE means that R is trying to write somewhere which doesn't listen." (Simon Urbanek in this thread)
More on the topic:
This was my first thought since this error only occurred after a second connection attempt.
This is probably not related to S3 indeed. To solve the issue I added a retry if the call to read_parquet() fails. It's pretty ugly, but I cannot figure out how to maintain the connection with S3.
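For anyone landing here with the same error, a retry of the kind described above might look roughly like the following. This is only a minimal sketch, not the poster's actual code: `with_retry`, its `max_attempts` and `wait_seconds` parameters, and the bucket and file names in the usage note are hypothetical, and it assumes the SIGPIPE failure surfaces as a regular R error that `tryCatch()` can intercept.

```r
# Hypothetical helper: call f() and retry a few times if it raises an error.
with_retry <- function(f, max_attempts = 3, wait_seconds = 1) {
  stopifnot(is.function(f), max_attempts >= 1)
  for (attempt in seq_len(max_attempts)) {
    # Capture either the value or the error instead of letting the error escape.
    result <- tryCatch(
      list(ok = TRUE, value = f()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (result$ok) {
      return(result$value)
    }
    if (attempt < max_attempts) {
      message("Attempt ", attempt, " failed: ", conditionMessage(result$error))
      Sys.sleep(wait_seconds)
    }
  }
  # All attempts failed: re-raise the last error unchanged.
  stop(result$error)
}

# Usage sketch (needs the arrow package and valid AWS credentials;
# "my-bucket" and "my-file.parquet" are placeholders):
# bucket <- arrow::s3_bucket("my-bucket")
# df <- with_retry(function() arrow::read_parquet(bucket$path("my-file.parquet")))
```

This only masks the symptom. It does not explain why the second read fails, so the underlying connection issue may still be worth investigating.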
I'm seeing a similar issue to this post (error thrown due to SIGPIPE), but when deploying a shiny app on either shinyapps.io or deploy.
The attached SO post documents well the exact behavior that I'm seeing, but I don't see how to find a workaround when I don't have control over command-line arguments. The SO poster found a difference in whether the AWS bucket was public or private, positing that the issue could be related to how Arrow maintains AWS credentialed access to a private bucket.
Any thoughts?
Hello,
I am encountering an issue when trying to read a parquet file using read_parquet with an S3FileSystem created with s3_bucket().
I created a worker that gets the ID of the last parquet file uploaded to an S3 bucket (using the S3 API) and then tries to read the file by calling the $path() method with my filename as the argument. I created a custom function, read_table_fromS3, to do this. This error occurs after a second call to read_parquet() when running my script with Rscript within a Docker container built with all the needed dependencies and access to AWS credentials.
The error I catch:
Here is a (non-reproducible) sample of what my script is doing:
Here is the sessionInfo() from my running container: