kubernetes / minikube

Run Kubernetes locally
https://minikube.sigs.k8s.io/
Apache License 2.0

Minikube mount fails with "Unknown error 526" #1562

Closed by cu12 7 years ago

cu12 commented 7 years ago

Is this a BUG REPORT or FEATURE REQUEST? (choose one):

Minikube version (use minikube version):

Environment:

What happened: Mounted a directory containing a very large number of folders/files (69893/315668) with minikube mount ../:/src

What you expected to happen: Files to be listed, but got:

$ ls -la /src
ls: reading directory '/src': Unknown error 526

How to reproduce it (as minimally and precisely as possible): Not sure yet, but I guess it has something to do with the number of files and directories, as mounting a simple directory inside the structure works fine.

Anything else we need to know: the mount portion of the --v=10 output

Found binary path at /usr/local/bin/minikube
Launching plugin server for driver virtualbox
Plugin server listening at address 127.0.0.1:64086
() Calling .GetVersion
Using API Version  1
() Calling .SetConfigRaw
() Calling .GetMachineName
(minikube) Calling .DriverName
Mounting ../ into /src on the minikubeVM
This daemon process needs to stay alive for the mount to still be accessible...
ufs starting
Found binary path at /usr/local/bin/minikube
Launching plugin server for driver virtualbox
Plugin server listening at address 127.0.0.1:64091
() Calling .GetVersion
Using API Version  1
() Calling .SetConfigRaw
() Calling .GetMachineName
(minikube) Calling .GetSSHHostname
(minikube) Calling .GetSSHPort
(minikube) Calling .GetSSHKeyPath
(minikube) Calling .GetSSHKeyPath
(minikube) Calling .GetSSHUsername
Using SSH client type: external
Using SSH private key: /Users/sea-you/.minikube/machines/minikube/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@127.0.0.1 -o IdentitiesOnly=yes -i /Users/sea-you/.minikube/machines/minikube/id_rsa -p 63474] /usr/bin/ssh <nil>}
About to run SSH command:
sudo umount /src;
SSH cmd err, output: <nil>:
(minikube) Calling .GetSSHHostname
(minikube) Calling .GetSSHPort
(minikube) Calling .GetSSHKeyPath
(minikube) Calling .GetSSHKeyPath
(minikube) Calling .GetSSHUsername
Using SSH client type: external
Using SSH private key: /Users/sea-you/.minikube/machines/minikube/id_rsa (-rw-------)
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none docker@127.0.0.1 -o IdentitiesOnly=yes -i /Users/sea-you/.minikube/machines/minikube/id_rsa -p 63474] /usr/bin/ssh <nil>}
About to run SSH command:

sudo mkdir -p /src || true;
sudo mount -t 9p -o trans=tcp -o port=64088 -o uid=1001 -o gid=1001 192.168.99.1 /src;
sudo chmod 775 /src;
2017/06/07 15:15:47 connected
2017/06/07 15:15:47 >>> 192.168.99.100:52392 Tversion tag 65535 msize 8192 version '9P2000.L'
2017/06/07 15:15:47 <<< 192.168.99.100:52392 Rversion tag 65535 msize 8192 version '9P2000'
2017/06/07 15:15:47 >>> 192.168.99.100:52392 Tattach tag 1 fid 0 afid 4294967295 uname 'nobody' nuname 0 aname ''
2017/06/07 15:15:47 <<< 192.168.99.100:52392 Rattach tag 1 aqid (1095ea 6832ed38 'd')
2017/06/07 15:15:47 >>> 192.168.99.100:52392 Tstat tag 1 fid 0
2017/06/07 15:15:47 <<< 192.168.99.100:52392 Rstat tag 1 st ('..' 'sea-you' '20' '' q (1095ea 6832ed38 'd') m d775 at 0 mt 1496396787 l 4318 t 0 d 0 ext )
2017/06/07 15:15:47 >>> 192.168.99.100:52392 Tstat tag 1 fid 0
2017/06/07 15:15:47 <<< 192.168.99.100:52392 Rstat tag 1 st ('..' 'sea-you' '20' '' q (1095ea 6832ed38 'd') m d775 at 0 mt 1496396787 l 4318 t 0 d 0 ext )
2017/06/07 15:15:47 >>> 192.168.99.100:52392 Twstat tag 1 fid 0 st ('' '' '' '' q (ffffffffffffffff ffffffff 'daAltL') m d775 at 4294967295 mt 4294967295 l 18446744073709551615 t 65535 d 4294967295 ext )
2017/06/07 15:15:47 <<< 192.168.99.100:52392 Rwstat tag 1
SSH cmd err, output: <nil>:

the error:

2017/06/07 15:15:50 >>> 192.168.99.100:52392 Tstat tag 1 fid 0
2017/06/07 15:15:50 <<< 192.168.99.100:52392 Rstat tag 1 st ('..' 'sea-you' '20' '' q (1095ea 6832ed38 'd') m d775 at 0 mt 1496396787 l 4318 t 0 d 0 ext )
2017/06/07 15:15:50 >>> 192.168.99.100:52392 Twalk tag 1 fid 0 newfid 1
2017/06/07 15:15:50 <<< 192.168.99.100:52392 Rwalk tag 1
2017/06/07 15:15:50 >>> 192.168.99.100:52392 Topen tag 1 fid 1 mode 0
2017/06/07 15:15:50 <<< 192.168.99.100:52392 Ropen tag 1 qid (1095ea 6832ed38 'd') iounit 0
2017/06/07 15:15:50 >>> 192.168.99.100:52392 Tstat tag 1 fid 0
2017/06/07 15:15:50 <<< 192.168.99.100:52392 Rstat tag 1 st ('..' 'sea-you' '20' '' q (1095ea 6832ed38 'd') m d775 at 0 mt 1496396787 l 4318 t 0 d 0 ext )
2017/06/07 15:15:50 >>> 192.168.99.100:52392 Tread tag 1 fid 1 offset 0 count 8168
2017/06/07 15:15:50 <<< 192.168.99.100:52392 Rread tag 1 count 8133
2017/06/07 15:15:50 >>> 192.168.99.100:52392 Tread tag 1 fid 1 offset 8133 count 35
**2017/06/07 15:15:50 <<< 192.168.99.100:52392 Rerror tag 1 ename 'too small read size for dir entry' ecode 0**
2017/06/07 15:15:50 >>> 192.168.99.100:52392 Tclunk tag 1 fid 1
2017/06/07 15:15:50 <<< 192.168.99.100:52392 Rclunk tag 1

KallynGowdy commented 7 years ago

I'm running into the same issue on Windows 10, using Hyper-V.

Environment:

Steps to reproduce: To reproduce, see this repository.

In the repo, there are two branches (master, and failing). The only difference is the failing branch has one more copy of the lorem ipsum text than the master branch.

There appears to be a relationship between the number of files, the length of the filenames, and the size on disk. For example, if I shortened the filenames, I needed more files to produce the same error, even though they took more total disk space. This makes me think the error depends on the size of the folder contents as they are represented on disk, metadata included, rather than on file size alone.

i.e., sizeOnDisk = fileSize + fileMetadataSize

Passing Scenario:

Broken Scenario:

For reference, it appears that the error is being emitted from go9p/ufs.go#L384.

The key section being:

if count == 0 && int(tc.Offset) < len(fid.dirents) && len(fid.dirents) > 0 {
    req.RespondError(&Error{"too small read size for dir entry", EINVAL})
    return
}

aaron-prindle commented 7 years ago

Thank you very much for the detailed bug report. The examples in the github repo were really useful in testing this. This 9p ufs implementation is adapted from https://github.com/rminnich/go9p/ with some small tweaks for cross-platform support. So bear with me as I will need to read over this. The tests below involve running ls from within the mounted /test directory from the repo.

For the failing branch, the failure path appears to go through the second switch case, len(fid.dirents[tc.Offset:]) > int(tc.Count): https://github.com/kubernetes/minikube/blob/master/third_party/go9p/ufs.go#L366-L367

then the third nested if statement, nextend > 0: https://github.com/kubernetes/minikube/blob/master/third_party/go9p/ufs.go#L368-L369

and finally hits the error you mentioned above: https://github.com/kubernetes/minikube/blob/master/third_party/go9p/ufs.go#L383-L386

A gist of some debug output for the failing case is here: https://gist.github.com/aaron-prindle/1a6e09d1649bd50f646b19c16c07dcfe

It seems that an initial read is performed for the directory

In the successful case, the flow is: The default switch case: https://github.com/kubernetes/minikube/blob/master/third_party/go9p/ufs.go#L368-L369

The first nested if statement (but none of the nested if statements trigger): https://github.com/kubernetes/minikube/blob/master/third_party/go9p/ufs.go#L374

A gist of some debug output for the successful case is here: https://gist.github.com/aaron-prindle/918a6ff79d697674ae8f20583c3e7235

From the implementation, it seems that this error condition is designed to catch the case where tc.Offset is smaller than the length of fid.dirents. fid.dirents is a []byte in which each entry is a single character, and the whole list contains all of the filenames concatenated with some metadata. fid.dirents example gist: https://gist.github.com/aaron-prindle/bb7b415901a71f10b3bbf7613efd9554

This value grows as you put more files, or files with longer names, in a directory. tc.Offset (and tc.Count) seem to be parameters that are passed into the call, and as such cannot be modified to be "larger". From investigating the code, it seems possible that when this statement is triggered: https://github.com/kubernetes/minikube/blob/master/third_party/go9p/ufs.go#L377 the count ends up equaling 0. In that case the 0 is the result of the subtraction rather than a sentinel, which is how the 0 value is usually used throughout.
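To make the flow above concrete, here is a minimal sketch of how the byte count for a directory read might end up at 0. It is a simplified model and not the actual ufs.go code: the function name dirReadCount, its parameters, and the example entry boundaries are all hypothetical. It assumes the server only returns whole directory entries, so a requested count is trimmed back to the last entry boundary that fits.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

var errTooSmall = errors.New("too small read size for dir entry")

// dirReadCount is a hypothetical, simplified model of the directory read path.
// direntEnds holds the byte offset at which each packed directory entry ends,
// offset and want play the role of tc.Offset and tc.Count.
func dirReadCount(direntEnds []int, offset, want int) (int, error) {
	total := 0
	if len(direntEnds) > 0 {
		total = direntEnds[len(direntEnds)-1]
	}

	// Pick the raw count: nothing, the requested amount, or whatever is left.
	var count int
	switch {
	case offset > total:
		count = 0
	case total-offset > want:
		count = want
	default:
		count = total - offset
	}

	// Trim the count so the read stops on an entry boundary.
	next := sort.SearchInts(direntEnds, offset+count)
	if next < len(direntEnds) && direntEnds[next] > offset+count {
		if next > 0 {
			// This subtraction can yield 0 when not even one entry fits.
			count = direntEnds[next-1] - offset
		} else {
			count = 0
		}
	}

	// A count of 0 with data still remaining is reported as an error.
	if count == 0 && offset < total && total > 0 {
		return 0, errTooSmall
	}
	return count, nil
}

func main() {
	ends := []int{40, 80, 120} // three entries of 40 bytes each (made-up sizes)

	// First read: 100 bytes requested, only two whole entries fit.
	fmt.Println(dirReadCount(ends, 0, 100))

	// Follow-up read: only 20 bytes requested, the next entry needs 40.
	fmt.Println(dirReadCount(ends, 80, 20))
}
```

The second call mirrors the pattern in the log from the original report, where a read of 8168 bytes returned 8133 and the follow-up read of 35 bytes at offset 8133 was answered with the error.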

aaron-prindle commented 7 years ago

There's actually a 9p mount option, msize, that changes the maximum message size allowed for the mount. Configuring this flag should allow you to mount directories of arbitrary size and with arbitrarily named entries. You can use it with minikube mount --msize=<the number of bytes to use for 9p packet payload>. The PR for this is here:

#1705
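As a rough guide to picking a value, the sketch below estimates an msize large enough for a directory listing to fit in a single read. It is a back-of-the-envelope model, not something taken from minikube: the 24-byte header is inferred from the log in the original report (msize 8192, largest Tread count 8168), and the function names and the perEntryOverhead parameter are hypothetical. The real per-entry metadata size depends on the 9p server's encoding, so measure or overestimate it.

```go
package main

import "fmt"

// ioHeaderSize is the gap between msize and the largest Tread count seen in
// the log from the original report (8192 - 8168 = 24).
const ioHeaderSize = 24

// maxReadPayload returns the largest read count that fits in one message.
func maxReadPayload(msize int) int {
	if msize <= ioHeaderSize {
		return 0
	}
	return msize - ioHeaderSize
}

// estimateMsize guesses an msize that lets a whole directory listing fit in
// one read. perEntryOverhead is an assumed number of metadata bytes per entry
// (a hypothetical parameter, not a value taken from the 9p server).
func estimateMsize(names []string, perEntryOverhead int) int {
	listing := 0
	for _, name := range names {
		listing += len(name) + perEntryOverhead
	}
	return listing + ioHeaderSize
}

func main() {
	fmt.Println(maxReadPayload(8192)) // 8168, matching the Tread count in the log

	names := []string{"a.txt", "b.txt", "c.txt"}
	fmt.Println(estimateMsize(names, 75)) // 3*(5+75) + 24 = 264
}
```

For a directory such as node_modules, the estimate only needs to cover the entries of the largest single directory being listed, not the whole tree.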

veqryn commented 7 years ago

I am running into this too.

Minikube version: v0.20.0

Environment:
OS: Windows 10 Pro (Anniversary Edition)
VM Driver: hyperv
ISO version: minikube-v0.20.0.iso

I have a directory with 121 files in it, and it shows as empty with error 526.
The directory is actually this project: https://github.com/Shopify/sarama , and none of the file names are long or have weird characters.

If I delete all the files and create empty files one by one, I can see that the error first appears at exactly 102 files (i.e., I can ls and see 101 blank files just fine, but if I add one more blank file I get error 526).

So strange! I'm glad you've already fixed it and I'm looking forward to the next release!

veqryn commented 7 years ago

Would this be able to make it into a v0.20.1 release?

r2d4 commented 7 years ago

@veqryn I think we're planning a release today

aaron-prindle commented 7 years ago

This should be fixed with https://github.com/kubernetes/minikube/pull/1705. Closing.

artemkozlenkov commented 4 years ago

Hello everybody, I'm running into this when running ls -la /node_modules on the mounted folder inside the minikube cluster. Could someone please specify how big --msize should be to get rid of this "error 526"?

Regards!