Open shuaichang opened 1 year ago
Also, to add some more info: as suggested by @liulanzheng offline, the following diff plus an overlaybd rebuild fixed the issue.
Verified that 0.6.12 fixed the issue, so please feel free to close it. Thank you very much @liulanzheng for making the fix!
What happened in your environment?
We found a potential overlaybd bug: it returned incorrect data while the network was down. This can lead to application failures; in our case, Java failed to load a class.
What did you expect to happen?
When the network is down, class loading should block completely until the network recovers. However, we currently see "Exception: java.lang.NoClassDefFoundError" and "error reading zip file" after retrying for 3+ minutes.
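To make the expected behavior concrete, here is a minimal sketch of "block until the read succeeds" semantics. It is not overlaybd code: `read_blocking` and the `fetch` callable are hypothetical names used only for illustration, and the sketch assumes a failed remote read surfaces as `OSError`.

```python
import time


def read_blocking(fetch, offset, length, retry_delay=1.0, sleep=time.sleep):
    """Keep calling fetch(offset, length) until it returns exactly `length` bytes.

    `fetch` is a hypothetical stand-in for a remote block read. It is assumed
    to raise OSError while the network is unavailable. The caller never sees
    partial or placeholder data; it simply waits.
    """
    while True:
        try:
            data = fetch(offset, length)
        except OSError:
            # Network is down: wait and retry instead of returning anything.
            sleep(retry_delay)
            continue
        if len(data) == length:
            return data
        # Short read: treat it as transient and retry as well.
        sleep(retry_delay)
```

With these semantics a reader such as the JVM would stall during the outage rather than receive bytes that fail zip validation.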
We suspect there's a bug in overlaybd: it returned an unexpected result when it should have blocked until the network recovered. This is based on the following experiments we did:
We ran systemctl stop overlaybd-tcmu, after which the jar command would actually hang until overlaybd-tcmu recovered.
How can we reproduce it?
Step 1: build, convert, and push a repro image using the following Dockerfile
Step 2: rpull the image and open a bash shell in the container
After several minutes, we see the "error reading zip file" error:
root@ip-10-0-0-134:/# jar vft ./example.java.helloworld/Main.jar
/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar: error reading zip file
/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar: error reading zip file
Exception in thread "main"
Exception: java.lang.NoClassDefFoundError thrown from the UncaughtExceptionHandler in thread "main"
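For anyone trying to confirm the same symptom without the JVM, a jar is an ordinary zip archive, so its contents can be verified independently. The following is a rough sketch using Python's standard `zipfile` module, assuming Python is available in the container; `check_archive` is a hypothetical helper name, not part of any tool mentioned in this issue.

```python
import zipfile
import zlib


def check_archive(path):
    """Return None if every member reads back with a valid CRC-32,
    otherwise a short description of the first failure."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() reads every member and returns the first bad name.
            bad = zf.testzip()
    except (zipfile.BadZipFile, zlib.error, OSError) as exc:
        # Broken archive structure, corrupt compressed stream, or I/O error.
        return f"{type(exc).__name__}: {exc}"
    return None if bad is None else f"CRC mismatch in member {bad}"


if __name__ == "__main__":
    import sys

    for jar_path in sys.argv[1:]:
        print(jar_path, "->", check_archive(jar_path) or "ok")
```

Running this against /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar while the network is down should either hang (the expected behavior) or report a failure (the behavior described in this issue).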