PecanProject / pecan-status-board

workflow integration testing status board

Bug: HTTP Request Timeout to 141.142.217.168 in R Script Affecting GitHub Actions Data Generation #15

Open Sweetdevil144 opened 4 months ago

Sweetdevil144 commented 4 months ago

Description

We're encountering a timeout error in our R script during the data generation phase of a GitHub Actions workflow. The error occurs when the script tries to make an HTTP request to a server, leading to a connection timeout.

Note: This issue was created after manually running .github/workflows/manual-test.yaml.

Error Details

The specific error message is as follows:

Error in curl::curl_fetch_memory(url, handle = handle) :
Timeout was reached: [141.142.217.168] Connection timed out after 10000 milliseconds

This issue is happening in the inst/run-test-list.R script, within the section where a GET request is made using the curl package.

Possible Causes and Steps Taken

  1. Server Availability: We need to check if the server at 141.142.217.168 is currently operational and accessible. This is a critical step to ensure that the issue isn't on the server end.

I discussed this with @infotroph, who wrote the following:

I haven’t traced this through in detail, but I suspect the major thing driving failures in the test ymls is that a number of scripts in pecan-status-board want to connect to http://141.142.217.168/, which I believe was a virtual machine at NCSA that’s apparently down now (I’m not sure if on purpose or by neglect). I don’t know why http://141.142.217.168/ is no longer running or whether the Correct Fix is to get it running again or to change which server the statusboard tries to connect to.

(Original comments by Chris on this bug.)

  2. Timeout Duration: The default timeout of 10 seconds may be insufficient. We should consider increasing the timeout in the curl request.

  3. Network Issues: We should investigate whether client-side network connectivity problems might be causing this issue.

  4. Error Handling in Script: While this doesn't solve the root problem, improving error handling in the script can help manage such situations better, either by retrying the request or by providing more detailed logging.
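
To illustrate the timeout and error-handling ideas above, here is a minimal sketch in R. It is not the code in inst/run-test-list.R. The helper names (`default_fetch`, `fetch_with_retry`), the timeout values, and the retry count are all assumptions chosen for illustration. If the server at 141.142.217.168 is permanently down, retries will not help; they only make the failure clearer in the logs.

```r
# Hypothetical helpers for illustration only; names and defaults are assumptions.

# Fetch a URL with explicit connection and total timeouts (in seconds).
default_fetch <- function(url, connect_timeout = 30, total_timeout = 60) {
  handle <- curl::new_handle()
  curl::handle_setopt(handle,
                      connecttimeout = connect_timeout,
                      timeout = total_timeout)
  curl::curl_fetch_memory(url, handle = handle)
}

# Retry a fetch a fixed number of times, logging each failure.
# `fetch` is injectable so the retry logic can be exercised without a network.
fetch_with_retry <- function(url,
                             fetch = default_fetch,
                             max_tries = 3,
                             wait_seconds = 2) {
  last_error <- NULL
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(
      fetch(url),
      error = function(e) {
        message(sprintf("Attempt %d/%d for %s failed: %s",
                        attempt, max_tries, url, conditionMessage(e)))
        e
      }
    )
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    # Pause before the next attempt, but not after the final one.
    if (attempt < max_tries) Sys.sleep(wait_seconds)
  }
  stop(sprintf("Giving up on %s after %d attempts: %s",
               url, max_tries, conditionMessage(last_error)),
       call. = FALSE)
}
```

A call such as `fetch_with_retry("http://141.142.217.168/")` would then log every failed attempt before stopping with a single summary error, which should make the 'Generate data' step easier to diagnose.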

Steps to Reproduce

  1. Trigger the GitHub Actions workflow defined in .github/workflows/manual-test.yaml.
  2. Observe the error in the 'Generate data' step of the workflow.

Expected Behavior

The script should successfully make the HTTP request without timing out, allowing the data generation process to complete.

This error is critical because it halts our data generation process, which is a key part of our workflow. Furthermore, as @infotroph said, I also think many of the FALSE and FAILURE results in our workflow are caused by this issue. Fixing it should allow a number of tests to pass, which would also help the current GSoC project associated with this.


Any insights or recommendations would be greatly appreciated. Tagging @mdietze and @dlebauer for further insights into this issue.

Sweetdevil144 commented 4 months ago

Any takes on this?