Open mcregan23 opened 6 months ago
If there is any other information you need regarding this issue, please let me know and I will be happy to provide any other context/information needed.
This is by design; the Windows Containers code has no way to determine whether a container died due to an out-of-memory condition, because of the current underlying implementation:
The underlying implementation is in the shim, the actual "runtime" component that manages the lifecycle of containers. It uses epoll to find out from the kernel when an OOM condition occurs:
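As a rough illustration of the kind of kernel signal involved (this is not the shim's actual code), on a Linux host using cgroup v2 the kernel keeps an oom_kill counter in the container cgroup's memory.events file, and a watcher can compare two readings of it. The sketch below assumes that file format; the function names and sample contents are hypothetical, and the waiting mechanism itself (epoll or similar) is left out.

```python
def parse_memory_events(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 memory.events file."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name count" lines.
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events


def oom_kill_occurred(before: str, after: str) -> bool:
    """Return True if the oom_kill counter grew between two readings."""
    old = parse_memory_events(before).get("oom_kill", 0)
    new = parse_memory_events(after).get("oom_kill", 0)
    return new > old


# Sample contents, made up for illustration (not captured from a real host).
BEFORE = "low 0\nhigh 0\nmax 3\noom 0\noom_kill 0\n"
AFTER = "low 0\nhigh 0\nmax 7\noom 1\noom_kill 1\n"

if __name__ == "__main__":
    print(oom_kill_occurred(BEFORE, AFTER))
```

Nothing in this thread indicates that the Windows code path has an equivalent counter to read, which is why OOMKilled stays false there.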
It's possible this could be supported in the future with the 'containerd' implementation of Windows containers, where we call out to an equivalent 'shim' to run containers instead of driving the OS directly (today, Windows does not use a shim, unlike Linux containers). However, the equivalent code does not implement any sort of OOM notification as required by containerd:
Searching the repository, I don't see any issues related to an OOM event; I would suggest creating one (keeping in mind that this may just not be possible on Windows; I don't have the expertise there to say).
Thank you for the quick response and insight! I'll mark this as closed.
I think this is worth keeping open to track the feature, though an issue should be opened on hcsshim to provide the underlying functionality.
Description
I noticed an issue where a process running within a Docker container on a Windows VM reported an out-of-memory error according to the logs emitted by Docker, but the container itself did not report exiting due to an OOM error (the OOMKilled flag was false when running docker inspect <containerid>), and the exit code reported by the container never matched the expected exit code for an OOM error (exit code 137). When I checked whether Linux behaves the same way when the process inside the container reports an OOM error, the Docker container reported that it was killed due to OOM (the OOMKilled flag was true when running docker inspect <containerid>, and the container reported an exit code of 137). On Windows, I always saw the container report that it was not killed due to OOM while the process was reporting OOM errors.
Reproduce
Windows Reproduction Steps:
test.ps1 contents
docker build -t <yourtaghere> .
docker run -m 256 <tag here>
docker stats
docker inspect <containeridhere>
docker inspect command and process running within the container reported stopping due to an out of memory error.
Linux Reproduction Steps:
These steps are shorter and simpler than the Windows steps.
CMD ["python3", "-c", "foo=' '*1024*1024*512; import time; time.sleep(10)"]
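To make the size of that allocation concrete, here is a small sketch. It assumes the intended expression in the CMD line is ' ' * 1024 * 1024 * 512, that is, a string of 512 MiB of spaces. The helper names are hypothetical and exist only for illustration.

```python
def allocation_bytes(mebibytes: int) -> int:
    """Character count of ' ' * 1024 * 1024 * mebibytes.

    CPython stores an all-ASCII string at one byte per character (plus a
    small fixed header), so this is also roughly the memory it needs.
    """
    return 1024 * 1024 * mebibytes


def build_payload(mebibytes: int) -> str:
    """Build the same kind of string as the one-liner, at a chosen scale."""
    return " " * 1024 * 1024 * mebibytes


if __name__ == "__main__":
    # Use a small scale here so the sketch itself stays cheap to run.
    assert len(build_payload(1)) == allocation_bytes(1)
    print(allocation_bytes(512))  # 536870912 bytes, i.e. 512 MiB
```

Running the container with a memory limit below that figure should be enough for the kernel to kill the Python process.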
docker info
Additional Info
Process logs from Windows:
docker inspect logs:
docker inspect exciting_rubin
[
  {
    "Id": "6fe653aeccb41201d50629356f057df650195f63bb16d0070fc31eefeb93a29b",
    "Created": "2024-03-22T15:50:54.4760729Z",
    "Path": "powershell.exe",
    "Args": [
      ".\test.ps1"
    ],
    "State": {
      "Status": "exited",
      "Running": false,
      "Paused": false,
      "Restarting": false,
      "OOMKilled": false,  (should be true)
      "Dead": false,
      "Pid": 0,
      "ExitCode": 3221225473,
      "Error": "",
      "StartedAt": "2024-03-22T15:50:56.2195812Z",
      "FinishedAt": "2024-03-22T16:03:36.7360532Z"
    },
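For anyone comparing the two platforms, the relevant fields can be pulled out of the docker inspect JSON with a few lines of Python. This is a minimal sketch: the helper name is hypothetical, and the sample is trimmed to the State fields quoted in this report.

```python
import json


def summarize_state(inspect_json: str) -> dict:
    """Pull the OOM-related fields out of `docker inspect` JSON output."""
    state = json.loads(inspect_json)[0]["State"]
    code = state["ExitCode"]
    return {
        "oom_killed": state["OOMKilled"],
        "exit_code": code,
        # Windows exit codes are often easier to recognise in hex.
        "exit_code_hex": f"0x{code:08X}",
        # 128 + 9: the conventional exit code for a process killed by SIGKILL.
        "looks_like_sigkill": code == 128 + 9,
    }


# Trimmed sample based on the Windows output in this report.
SAMPLE = '[{"State": {"Status": "exited", "OOMKilled": false, "ExitCode": 3221225473}}]'

if __name__ == "__main__":
    print(summarize_state(SAMPLE))
```

For the Windows container above this gives an exit code of 0xC0000001 with OOMKilled false, whereas the Linux run described earlier would show 137 with OOMKilled true.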