Open rkomandu opened 2 weeks ago
Tried to reproduce the issue on both the local system and the HA Scale system but could not reproduce it; will try a few more times. The last time the issue was reproduced, complete logging was not enabled, so it needs to be reproduced again with all logs enabled. cc: @rkomandu @romayalon
Environment info
noobaa d/s rpm = noobaa-core-5.17.1-20241104.el9 (standalone Noobaa)
Actual behavior
1. Ran an upload of an object with enablemd5 set to true, then exercised HA so that the CES IP moved from one node to the other. IO continued during the HA process, but at the end the upload reported the error shown below:
"upload failed: ./file_50G to s3://newbucket-ha-reg/file_50G-obj An error occurred (InternalError) when calling the CompleteMultipartUpload operation (reached max retries: 4): We encountered an internal error. Please try again."
However, in the noobaa.logs on the node (after all the parts were uploaded), the following etag mismatch was logged
Expected behavior
1. What is the reason for the etag mismatch? The system uses an RR DNS setup, so IO should continue through the HA failover mechanism and the upload should not end with this error.
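For triage, a minimal sketch of how the part etags could be cross-checked offline. It assumes the common S3 convention (part etag = MD5 hex of the part body; multipart etag = MD5 of the concatenated binary part digests, suffixed with `-<part count>`). Whether noobaa with enablemd5 follows exactly this convention is an assumption here, and the helper names `multipart_etag` and `find_mismatched_parts` are hypothetical, not part of noobaa or the AWS CLI.

```python
import hashlib


def part_etag(data: bytes) -> str:
    # Assumed convention: a part's etag is the hex MD5 of its body.
    return hashlib.md5(data).hexdigest()


def multipart_etag(parts) -> str:
    # Assumed convention: MD5 over the concatenated *binary* part digests,
    # followed by "-<number of parts>".
    digests = [hashlib.md5(p).digest() for p in parts]
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return f"{combined}-{len(digests)}"


def find_mismatched_parts(local_parts, reported_etags):
    # Return the 1-based part numbers whose locally computed etag differs
    # from the etag the server reported (surrounding quotes are ignored).
    mismatched = []
    for number, (data, reported) in enumerate(zip(local_parts, reported_etags), start=1):
        if part_etag(data) != reported.strip('"').lower():
            mismatched.append(number)
    return mismatched


if __name__ == "__main__":
    # Tiny stand-in parts; a real check would read the same part ranges
    # from ./file_50G that the client uploaded.
    parts = [b"part-one", b"part-two", b"part-three"]
    reported = [part_etag(parts[0]), "0" * 32, part_etag(parts[2])]
    print(multipart_etag(parts))
    print(find_mismatched_parts(parts, reported))
```

If a part was retried against the other node during failover, comparing the etags the client recorded per part with the ones in the noobaa logs this way could show which part numbers disagree.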
Steps to reproduce
1. Upload a large object and, while the upload is running, generate an assert for the gpfs daemon (this stops all services, restarts the gpfs daemon, and starts the services back). The upload should be successful.
More information - Screenshots / Logs / Other output
I am posting the logs of noobaa on the protocol nodes (2 of them) and gpfs logs as well.
https://ibm.ent.box.com/folder/292622193321