Closed. ajaykushwah closed this issue 4 years ago.
From what it looks like, MinIO is getting an error from the backend here. @ajaykushwah, do you have, for example, tcpdump captures available for us to look at the traffic?
@harshavardhana, thanks for the quick response.
No, I don't have a tcpdump, and it would not be practical to catch it in action. It happens very infrequently, with no pattern; sometimes there is no issue for days. A tcpdump would be huge if it ran that long. We could try it as a last option if nothing else is obvious.
I can attach the client-side logs for the PUT request I mentioned above, to show how the client is retrying, if that helps. It is a simple case of timing out after 5 minutes, closing the connection, and retrying again.
What I am most interested in is the setting for PUT/operation timeouts on the MinIO server side, so that I can either change it to 5 minutes or ensure that the client waits that long before giving up on the connection and retrying the request.
Similarly, what is the expected MinIO behavior in these race-like conditions? Does a failed request on a certain object alter the existing state of that object, which was put successfully by previous requests (or by the requests that finished first and successfully in a race-like condition)?
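For reference, the client behavior described in this thread (time out after 5 minutes, close the connection, retry on a new connection) amounts to a loop like the following sketch. All names here (`put_with_retries`, `do_put`, `fake_put`) are illustrative, not the actual libcurl client code:

```python
import time

def put_with_retries(do_put, timeout_s=300, max_retries=3):
    """Retry a PUT until it succeeds or retries are exhausted.

    do_put is any callable performing one upload attempt; it is
    expected to raise TimeoutError when no response arrives within
    timeout_s. Hypothetical sketch, not the real client.
    """
    attempt = 0
    while True:
        try:
            return do_put(timeout_s)
        except TimeoutError:
            attempt += 1
            if attempt > max_retries:
                raise
            # In the real client each retry opens a fresh connection.
            time.sleep(0)  # placeholder for a backoff delay

# Example: fail twice with a timeout, then succeed on the second retry,
# matching the "retry 1 ... retry 2 ... successful" pattern in the logs.
calls = {"n": 0}
def fake_put(timeout_s):
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("Timeout was reached")
    return "200 OK"
```

The key point the question raises is whether the server-side timeout and this client-side `timeout_s` agree; if the server can legitimately take longer than 5 minutes, the client's retry window guarantees overlapping requests for the same object.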
What I am most interested in is the setting for PUT/operation timeouts on the MinIO server side, so that I can either change it to 5 minutes or ensure that the client waits that long before giving up on the connection and retrying the request.
If the network stops reading data for longer than 5 minutes, meaning we can't write fast enough to the backend, then the client times out. Please share the entire MinIO server logs. AFAICS the backend here is giving up under load, so MinIO is unreliable because your backend is unreliable.
Similarly, what is the expected MinIO behavior in these race-like conditions? Does a failed request on a certain object alter the existing state of that object, which was put successfully by previous requests (or by the requests that finished first and successfully in a race-like condition)?
There is no race condition here; MinIO doesn't write anything if it cannot write everything. It is strictly consistent.
I am leaving the client-side log snippet for the request and retries of the bad object 000008a9.data. (Detailed MinIO logs for this object are available in my opening comment.)
MinIO trace snippet: [REQUEST objectAPIHandlers.PutObjectHandler-fm] [154935716.226429] [2019-02-05 08:59:22 +0000] PUT /fst03/01d/ay3e02WfEW63K92jDMB42iyhOvJ7446fzKnMvxOzs2A/V1/7b8223bf-3f0c-504e-b50c-ea3716119f85/snap_156/000008a9.data
Client log Snip Log:[09:04:24.254304 1 (40) /retry info] retry 1 of put fst03 / 01d/ay3e02WfEW63K92jDMB42iyhOvJ7446fzKnMvxOzs2A/V1/7b8223bf-3f0c-504e-b50c-ea3716119f85/snap_156/000008a9.data after receiving 28: "Timeout was reached", status=Connection timed out, httpStatusC stream:stdout time:February 5th 2019, 09:04:24.254 pid:1 thread_id:40 module_path:/retry level:info msg:retry 1 of put fst03 /01d/...
log:[09:09:28.263051 1 (40) /retry info] retry 2 of put fst03 / 01d/ay3e02WfEW63K92jDMB42iyhOvJ7446fzKnMvxOzs2A/V1/7b8223bf-3f0c-504e-b50c-ea3716119f85/snap_156/000008a9.data after receiving 28: "Timeout was reached", status=Connection timed out, httpStatusC stream:stdout time:February 5th 2019, 09:09:28.263 pid:1 thread_id:40 module_path:/retry level:info msg:retry 2 of put fst03/...
log:[09:14:15.475632 1 (40) /retry info] retry 3 of put fst03 / 01d/ay3e02WfEW63K92jDMB42iyhOvJ7446fzKnMvxOzs2A/V1/7b8223bf-3f0c-504e-b50c-ea3716119f85/snap_156/000008a9.data after receiving 7: "Couldn't connect to server", status=Connection timed out, httpS stream:stdout time:February 5th 2019, 09:14:15.475 pid:1 thread_id:40 module_path:/retry level:info msg:retry 3 of put fst03/...
log:[09:14:20.813843 1 (40) /retry info] retry 3 of put fst03 / 01d/ay3e02WfEW63K92jDMB42iyhOvJ7446fzKnMvxOzs2A/V1/7b8223bf-3f0c-504e-b50c-ea3716119f85/snap_156/000008a9.data successful stream:stdout time:February 5th 2019, 09:14:20.813 pid:1 thread_id:40 module_path:/retry level:info msg:retry 3 of put fst03/...
If the network stops reading data for longer than 5 minutes, meaning we can't write fast enough to the backend, then the client times out. Please share the entire MinIO server logs. AFAICS the backend here is giving up under load, so MinIO is unreliable because your backend is unreliable.
I am not ruling that out, and I can certainly send you the logs I have. These are only the http trace logs and the errors caught on stdout, which seem pretty verbose. Would that be all, or is there a special setting needed to produce the full system logs?
There is no race condition here; MinIO doesn't write anything if it cannot write everything. It is strictly consistent.
I have outlined the PUT logs from the client side in my last comment, and they clearly show that the client got a successful PUT response after a few timeouts/retries. How do we reconcile that with the behavior as seen from the client side?
I have outlined the PUT logs from the client side in my last comment, and they clearly show that the client got a successful PUT response after a few timeouts/retries. How do we reconcile that with the behavior as seen from the client side?
Of course it can succeed, depending on whether the backend allowed the request, so it's not a race condition. A race condition is something very different.
I am not ruling that out, and I can certainly send you the logs I have. These are only the http trace logs and the errors caught on stdout, which seem pretty verbose. Would that be all, or is there a special setting needed to produce the full system logs?
Yes, full system logs and also the MinIO server logs; you can attach them here (filtering out any sensitive information).
I am expecting that if a particular object PUT has been acknowledged with an OK by MinIO to the client, the object should be returned with all of its metadata during GET.
If there are any race conditions that occur from retries due to network shakiness or server slowness, they should be handled correctly by the server.
Missing fs.json means that the file was deleted by someone, i.e. perhaps it is the client itself sending a DELETE upon a failure? It is also possible that during the rename of fs.json we got an error from the backend, but we make the NAS gateway resilient and ignore that, because FS mode is allowed to work without fs.json.
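The write-then-rename pattern mentioned here can be sketched as follows. This is a toy model of the behavior being described, with hypothetical names (`write_metadata_atomically`, the injectable `rename`), not MinIO's actual gateway code; it shows how a backend error during the final rename leaves the metadata file absent while the data file is untouched:

```python
import os
import tempfile

def write_metadata_atomically(meta_path, payload, rename=os.rename):
    """Write metadata to a temp file, then rename it into place.

    If the final rename fails (e.g. a backend I/O error on a NAS
    mount), the temp file is removed and meta_path never appears,
    while any already-written data file is unaffected.
    """
    dirpath = os.path.dirname(meta_path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirpath)
    try:
        os.write(fd, payload)
        os.close(fd)
        rename(tmp, meta_path)  # a flaky backend may raise here
    except OSError:
        os.unlink(tmp)  # give up on metadata; the object stays readable
        return False
    return True
```

Under this model, a tolerated rename failure produces exactly the observed state: the .data file exists, fs.json does not, and no DELETE appears in any log.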
Of course it can succeed, depending on whether the backend allowed the request, so it's not a race condition. A race condition is something very different.
By race, I meant that the requestID of the original request finishes later than the requestID of the retry, as seen in the MinIO http trace logs. But I am obviously not a MinIO expert, so I will let the experts (you) determine what is going on.
Yes, full system logs and also the MinIO server logs; you can attach them here (filtering out any sensitive information).
Just reconfirming whether the MinIO logs collected with export MINIO_HTTP_TRACE=.. are good enough as MinIO server logs.
If there is another setting to collect more detailed MinIO logs, please let me know.
Missing fs.json means that the file was deleted by someone, i.e. perhaps it is the client itself sending a DELETE upon a failure?
I checked both the client and MinIO http logs. There is no DELETE call on either the sender or the receiver side.
It is also possible that during the rename of fs.json we got an error from the backend, but we make the NAS gateway resilient and ignore that, because FS mode is allowed to work without fs.json.
This is more likely. Would there be an error or log message generated in that case?
stdout captured from minio server
http trace logs from around the time of the upload of the object in question: minio_http_frag_35.gz
System CPU:
root@cbs-qa-minio-01:/home/cbs# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              12
On-line CPU(s) list: 0-11
Thread(s) per core:  1
Core(s) per socket:  6
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz
Stepping:            1
CPU MHz:             1197.139
CPU max MHz:         1700.0000
CPU min MHz:         1200.0000
BogoMIPS:            3391.95
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            15360K
NUMA node0 CPU(s):   0-5
NUMA node1 CPU(s):   6-11
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm arat pln pts flush_l1d
System memory:
root@cbs-qa-minio-01:/opt/minio/ajay# free -h
              total        used        free      shared  buff/cache   available
Mem:            62G        2.6G         21G        672K         38G         59G
Swap:          8.0G         22M        8.0G
File system:
root@cbs-qa-minio-01:/opt/minio/ajay# df -h /data
Filesystem              Size  Used Avail Use% Mounted on
/dev/mapper/data-lvol0  748T  262T  487T  35% /data
root@cbs-qa-minio-01:/opt/minio/ajay# df -i
Filesystem                  Inodes      IUsed        IFree IUse% Mounted on
/dev/mapper/data-lvol0 16063177664  393420568  15669757096    3% /data
root@cbs-qa-minio-01:/opt/minio/ajay# file -sL /dev/mapper/data-lvol0 /dev/mapper/data-lvol0: SGI XFS filesystem data (blksz 4096, inosz 512, v2 dirs)
The filesystem syslog does not have anything from MinIO.
It does have the occasional log messages reproduced below, which I am trying to root-cause, but none of the bad objects seem to be in the vicinity of these swapper failures.
root@cbs-qa-minio-01:/opt/minio/ajay# zcat /var/log/syslog.7 ...
Feb 26 18:45:07 cbs-qa-minio-01 kernel: [2397829.763873] swapper/10 cpuset=/ mems_allowed=0-1
Feb 26 18:45:07 cbs-qa-minio-01 kernel: [2397829.763885] CPU: 10 PID: 0 Comm: swapper/10 Not tainted 4.15.0-44-generic #47-Ubuntu
Feb 26 18:45:07 cbs-qa-minio-01 kernel: [2397829.763887] Hardware name: FUJITSU PRIMERGY RX2510 M2/D3279-H1, BIOS V5.0.0.11 R1.20.0 for D3279-H1x 06/15/2018
Feb 26 18:45:07 cbs-qa-minio-01 kernel: [2397829.763889] Call Trace:
Feb 26 18:45:07 cbs-qa-minio-01 kernel: [2397829.763892]
@harshavardhana
Any suggestions on where the fs.json files from potentially failed rename operations might be left sitting?
I checked under the bucket directory (/data/fst03) and under the metadata directory (/data/.minio.sys) for leftover files from rename failures, but came up blank.
Also, are the logs I have provided so far sufficient, or do I need to upload something more?
Thanks.
Closing this issue as stale
I am testing/evaluating the MinIO server in a single-node configuration for my on-prem repository of a large number of 4 MB objects, on a RAID-enabled filesystem with 400 TB of space. It is a plain filesystem setup, but my node and the underlying filesystem are built for performance and scalability.
My MinIO version is VERSION: 2019-01-10T00:21:20Z, and I am using the libcurl S3 interface to populate these objects.
I have uploaded tens of millions of objects on this setup successfully, but I occasionally notice that some objects are missing their metadata when I download them.
On further analysis, I realized that the file containing the object metadata is entirely missing on the underlying backend filesystem. See below.
root@cbs-qa-minio-01:/opt/minio/ajay# ls -l /data/fst03/01d/ay3e02WfEW63K92jDMB42iyhOvJ7446fzKnMvxOzs2A/V1/7b8223bf-3f0c-504e-b50c-ea3716119f85/snap_156/000008a9.data
-rw-r--r-- 1 root root 1953354 Feb 5 09:14 /data/fst03/01d/ay3e02WfEW63K92jDMB42iyhOvJ7446fzKnMvxOzs2A/V1/7b8223bf-3f0c-504e-b50c-ea3716119f85/snap_156/000008a9.data
root@cbs-qa-minio-01:/opt/minio/ajay# ls -l /data/.minio.sys/buckets/fst03/01d/ay3e02WfEW63K92jDMB42iyhOvJ7446fzKnMvxOzs2A/V1/7b8223bf-3f0c-504e-b50c-ea3716119f85/snap_156/000008a9.data
ls: cannot access '/data/.minio.sys/buckets/fst03/01d/ay3e02WfEW63K92jDMB42iyhOvJ7446fzKnMvxOzs2A/V1/7b8223bf-3f0c-504e-b50c-ea3716119f85/snap_156/000008a9.data': No such file or directory
So I checked the MinIO http logs and also the MinIO stdout messages that I capture for these objects with missing metadata.
What I notice is a pattern where some of the PUT requests take longer than usual for the MinIO server to respond to (I have seen response times of up to 15 minutes in some cases). The response in most of these cases is "internal error".
The client thread gives up after 5 minutes of inactivity, closes the connection, and retries in a separate thread on a new connection. The retried upload finishes successfully as seen from the client side.
Based on the server logs, it seems the second request (the retry of the timed-out request) finishes before the first request, which is still stuck somewhere in the server queue. Later, after 10-15 minutes, the first request finishes and responds with an internal error. But the response gets a connection-reset error, as the client has long since closed the first connection.
It seems to me that this first request, which finishes last, somehow invalidates the successful upload of the same object that finished earlier on a retry from the client.
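As an illustration only, the interleaving suspected here can be written down as a short toy script. The file names and the "stale failed request removes the metadata on its error path" step are assumptions made for illustration, not MinIO internals:

```python
import os
import tempfile

# Toy reconstruction of the suspected timeline.
workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "000008a9.data")
meta_path = os.path.join(workdir, "fs.json")

def retry_request_succeeds():
    # The client's retry lands first: data and metadata both written,
    # and the client sees a 200 OK.
    with open(data_path, "w") as f:
        f.write("object payload")
    with open(meta_path, "w") as f:
        f.write('{"md5": "yrt2ObJZJj1tEEZJ0VchkA=="}')

def stale_request_fails_and_cleans_up():
    # Minutes later, the stuck original request errors out with 500
    # and (hypothetically) removes the metadata it believes it owns,
    # clobbering the retry's successful write.
    if os.path.exists(meta_path):
        os.remove(meta_path)

retry_request_succeeds()
stale_request_fails_and_cleans_up()
# Result matches the observed symptom: data present, fs.json missing.
```

If something like this ordering is possible on the server, it would explain why the client sees success while the backend ends up without fs.json.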
I am expecting that if a particular object PUT has been acknowledged with an OK by MinIO to the client, the object should be returned with all of its metadata during GET.
If there are any race conditions that occur from retries due to network shakiness or server slowness, they should be handled correctly by the server.
Current Behavior
The downloaded object was missing the metadata that was provided during upload. The data itself remained OK.
Possible Solution
Steps to Reproduce (for bugs)
I am listing the server log sequence as seen by MinIO.
MinIO http log:
[REQUEST objectAPIHandlers.PutObjectHandler-fm] [154935716.226429] [2019-02-05 08:59:22 +0000]
PUT /fst03/01d/ay3e02WfEW63K92jDMB42iyhOvJ7446fzKnMvxOzs2A/V1/7b8223bf-3f0c-504e-b50c-ea3716119f85/snap_156/000008a9.data
Host: cbs-qa-minio-01.rtp.openenglab.netapp.com
X-Amz-Date: 20190205T085922Z
X-Amz-Meta-Md5: yrt2ObJZJj1tEEZJ0VchkA==
Content-Length: 1953354
Content-Md5: yrt2ObJZJj1tEEZJ0VchkA==
X-Amz-Content-Sha256: 76e50624bc4e4a1d3e7f3364b510cdedfe5d954baaeeb1669502f4f298364330
User-Agent: APN/1.0 NetApp/1.0 Cloud_Backup_Service/0.5
Accept: */*
Authorization: AWS4-HMAC-SHA256 Credential=O33UJS0EK838YTFPB010/20190205/us-east-1/s3/aws4_request,SignedHeaders=content-md5;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-meta-md5,Signature=41bfbeecf237f4d600e5f0865194bef2890d47d11aeeaf4669816fbc1af5564e
[RESPONSE] [154935716.226429] [2019-02-05 09:14:20 +0000] 500 Internal Server Error
X-Amz-Request-Id: 15806C2156F70B1A
X-Minio-Deployment-Id: 1b82d251-6af6-4f89-9fe6-bacd439448bf
Server: Minio/RELEASE.2019-01-10T00-21-20Z (linux; amd64)
Accept-Ranges: bytes
Content-Type: application/xml
Vary: Origin
X-Xss-Protection: 1; mode=block
Content-Security-Policy: block-all-mixed-content
InternalError