pinned_memory_manager Killed

shimoshida commented 2 years ago

Description I want to deploy Triton server on Azure Kubernetes Service. The target node is an ND96asr v4, which is equipped with 8 A100 GPUs. On this node, Triton server fails to start up even when no models are loaded.

Triton Information

triton: nvcr.io/nvidia/tritonserver:21.07-py3
azure: ND96asr v4

To Reproduce

prepare cluster To create the cluster, follow the procedure in the Azure GPU cluster article: https://docs.microsoft.com/ja-jp/azure/aks/gpu-cluster.

az aks nodepool add \
   --resource-group myResourceGroup \
   --cluster-name myAKSCluster \
   --name gpunp \
   --node-count 1 \
   --node-vm-size Standard_NC6 \
   --node-taints sku=gpu:NoSchedule \
   --aks-custom-headers UseGPUDedicatedVHD=true,usegen2vm=true

deploy via deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-triton-ft
  namespace: modules-gpt3-6b
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
      - name: sample
        image: nvcr.io/nvidia/tritonserver:21.07-py3
        command: ["/bin/sh"]
        args: ["-c", "while true; do sleep 10;done"]
      tolerations:
      - key: "sku"
        operator: "Equal"
        value: "gpu"
        effect: "NoSchedule"

log in to the pod and run mpirun -n 1 --allow-run-as-root tritonserver --model-repository=/

confirm outputs

root@sample2-7cb48985d9-lgzfc:/opt/tritonserver# mpirun -n 1 --allow-run-as-root tritonserver --model-repository=/
I0404 12:44:49.449929 92 metrics.cc:290] Collecting metrics for GPU 0: NVIDIA A100-SXM4-40GB
I0404 12:44:49.450370 92 metrics.cc:290] Collecting metrics for GPU 1: NVIDIA A100-SXM4-40GB
I0404 12:44:49.450406 92 metrics.cc:290] Collecting metrics for GPU 2: NVIDIA A100-SXM4-40GB
I0404 12:44:49.450431 92 metrics.cc:290] Collecting metrics for GPU 3: NVIDIA A100-SXM4-40GB
I0404 12:44:49.450454 92 metrics.cc:290] Collecting metrics for GPU 4: NVIDIA A100-SXM4-40GB
I0404 12:44:49.450483 92 metrics.cc:290] Collecting metrics for GPU 5: NVIDIA A100-SXM4-40GB
I0404 12:44:49.450504 92 metrics.cc:290] Collecting metrics for GPU 6: NVIDIA A100-SXM4-40GB
I0404 12:44:49.450531 92 metrics.cc:290] Collecting metrics for GPU 7: NVIDIA A100-SXM4-40GB
I0404 12:44:50.485665 92 libtorch.cc:998] TRITONBACKEND_Initialize: pytorch
I0404 12:44:50.485729 92 libtorch.cc:1008] Triton TRITONBACKEND API version: 1.4
I0404 12:44:50.485738 92 libtorch.cc:1014] 'pytorch' TRITONBACKEND API version: 1.4
2022-04-04 12:44:51.056099: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0404 12:44:51.247146 92 tensorflow.cc:2169] TRITONBACKEND_Initialize: tensorflow
I0404 12:44:51.247200 92 tensorflow.cc:2179] Triton TRITONBACKEND API version: 1.4
I0404 12:44:51.247209 92 tensorflow.cc:2185] 'tensorflow' TRITONBACKEND API version: 1.4
I0404 12:44:51.247216 92 tensorflow.cc:2209] backend configuration:
{}
I0404 12:44:51.249647 92 onnxruntime.cc:1970] TRITONBACKEND_Initialize: onnxruntime
I0404 12:44:51.249678 92 onnxruntime.cc:1980] Triton TRITONBACKEND API version: 1.4
I0404 12:44:51.249687 92 onnxruntime.cc:1986] 'onnxruntime' TRITONBACKEND API version: 1.4
I0404 12:44:51.343681 92 openvino.cc:1193] TRITONBACKEND_Initialize: openvino
I0404 12:44:51.343707 92 openvino.cc:1203] Triton TRITONBACKEND API version: 1.4
I0404 12:44:51.343715 92 openvino.cc:1209] 'openvino' TRITONBACKEND API version: 1.4
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node sample2-7cb48985d9-lgzfc exited on signal 9 (Killed).
--------------------------------------------------------------------------

When tritonserver is started without mpirun, it is killed as well and only Killed is printed.

root@sample2-7cb48985d9-lgzfc:/opt/tritonserver# tritonserver --model-repository=/a
I0404 12:57:33.566547 197 metrics.cc:290] Collecting metrics for GPU 0: NVIDIA A100-SXM4-40GB
I0404 12:57:33.566814 197 metrics.cc:290] Collecting metrics for GPU 1: NVIDIA A100-SXM4-40GB
I0404 12:57:33.566832 197 metrics.cc:290] Collecting metrics for GPU 2: NVIDIA A100-SXM4-40GB
I0404 12:57:33.566844 197 metrics.cc:290] Collecting metrics for GPU 3: NVIDIA A100-SXM4-40GB
I0404 12:57:33.566856 197 metrics.cc:290] Collecting metrics for GPU 4: NVIDIA A100-SXM4-40GB
I0404 12:57:33.566870 197 metrics.cc:290] Collecting metrics for GPU 5: NVIDIA A100-SXM4-40GB
I0404 12:57:33.566880 197 metrics.cc:290] Collecting metrics for GPU 6: NVIDIA A100-SXM4-40GB
I0404 12:57:33.566893 197 metrics.cc:290] Collecting metrics for GPU 7: NVIDIA A100-SXM4-40GB
I0404 12:57:34.057968 197 libtorch.cc:998] TRITONBACKEND_Initialize: pytorch
I0404 12:57:34.058020 197 libtorch.cc:1008] Triton TRITONBACKEND API version: 1.4
I0404 12:57:34.058025 197 libtorch.cc:1014] 'pytorch' TRITONBACKEND API version: 1.4
2022-04-04 12:57:34.267157: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0404 12:57:34.351845 197 tensorflow.cc:2169] TRITONBACKEND_Initialize: tensorflow
I0404 12:57:34.351893 197 tensorflow.cc:2179] Triton TRITONBACKEND API version: 1.4
I0404 12:57:34.351908 197 tensorflow.cc:2185] 'tensorflow' TRITONBACKEND API version: 1.4
I0404 12:57:34.351912 197 tensorflow.cc:2209] backend configuration:
{}
I0404 12:57:34.353170 197 onnxruntime.cc:1970] TRITONBACKEND_Initialize: onnxruntime
I0404 12:57:34.353190 197 onnxruntime.cc:1980] Triton TRITONBACKEND API version: 1.4
I0404 12:57:34.353200 197 onnxruntime.cc:1986] 'onnxruntime' TRITONBACKEND API version: 1.4
I0404 12:57:34.376199 197 openvino.cc:1193] TRITONBACKEND_Initialize: openvino
I0404 12:57:34.376221 197 openvino.cc:1203] Triton TRITONBACKEND API version: 1.4
I0404 12:57:34.376225 197 openvino.cc:1209] 'openvino' TRITONBACKEND API version: 1.4
Killed
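
The bare `Killed` line is printed by the shell when a child process is terminated by signal 9 (SIGKILL); mpirun reports the same event as `exited on signal 9`. The process itself gets no chance to log anything. A minimal sketch of how this looks from a parent process, assuming a POSIX system and using a throwaway `sh` child as a stand-in for tritonserver (the `describe_exit` helper is hypothetical):

```python
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable exit description."""
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as the negated signal number.
        sig = signal.Signals(-returncode)
        return f"killed by {sig.name} (signal {sig.value})"
    return f"exited with status {returncode}"


# Stand-in child that sends SIGKILL to itself, like a process killed from outside.
proc = subprocess.run(["sh", "-c", "kill -9 $$"])
print(describe_exit(proc.returncode))
```

SIGKILL cannot be caught or handled, so the reason for the kill has to be looked up outside the process, for example in the kernel log.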

Expected behavior Triton starts up successfully. The following output is from a node with 1 GPU.

root@gpt1b:/workspace# mpirun -n 1 --allow-run-as-root tritonserver --model-repository=/a
I0404 11:55:52.082112 69 metrics.cc:290] Collecting metrics for GPU 0: Tesla V100-PCIE-16GB
I0404 11:55:52.375557 69 libtorch.cc:998] TRITONBACKEND_Initialize: pytorch
I0404 11:55:52.375599 69 libtorch.cc:1008] Triton TRITONBACKEND API version: 1.4
I0404 11:55:52.375605 69 libtorch.cc:1014] 'pytorch' TRITONBACKEND API version: 1.4
2022-04-04 11:55:52.524003: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0404 11:55:52.570841 69 tensorflow.cc:2169] TRITONBACKEND_Initialize: tensorflow
I0404 11:55:52.570874 69 tensorflow.cc:2179] Triton TRITONBACKEND API version: 1.4
I0404 11:55:52.570880 69 tensorflow.cc:2185] 'tensorflow' TRITONBACKEND API version: 1.4
I0404 11:55:52.570884 69 tensorflow.cc:2209] backend configuration:
{}
I0404 11:55:52.573942 69 onnxruntime.cc:1970] TRITONBACKEND_Initialize: onnxruntime
I0404 11:55:52.573973 69 onnxruntime.cc:1980] Triton TRITONBACKEND API version: 1.4
I0404 11:55:52.573979 69 onnxruntime.cc:1986] 'onnxruntime' TRITONBACKEND API version: 1.4
I0404 11:55:52.595485 69 openvino.cc:1193] TRITONBACKEND_Initialize: openvino
I0404 11:55:52.595508 69 openvino.cc:1203] Triton TRITONBACKEND API version: 1.4
I0404 11:55:52.595513 69 openvino.cc:1209] 'openvino' TRITONBACKEND API version: 1.4
I0404 11:55:53.062644 69 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f945c000000' with size 268435456
I0404 11:55:53.063056 69 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0404 11:55:53.063869 69 server.cc:504]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0404 11:55:53.063923 69 server.cc:543]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| tensorrt    | <built-in>                                                      | {}     |
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
| openvino    | /opt/tritonserver/backends/openvino/libtriton_openvino.so       | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0404 11:55:53.063941 69 server.cc:586]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+

I0404 11:55:53.064038 69 tritonserver.cc:1718]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                  |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                 |
| server_version                   | 2.12.0                                                                                                                                                                                 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /a                                                                                                                                                                                     |
| model_control_mode               | MODE_NONE                                                                                                                                                                              |
| strict_model_config              | 1                                                                                                                                                                                      |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                               |
| min_supported_compute_capability | 6.0                                                                                                                                                                                    |
| strict_readiness                 | 1                                                                                                                                                                                      |
| exit_timeout                     | 30                                                                                                                                                                                     |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0404 11:55:53.065759 69 grpc_server.cc:4072] Started GRPCInferenceService at 0.0.0.0:8001
I0404 11:55:53.065984 69 http_server.cc:2795] Started HTTPService at 0.0.0.0:8000
I0404 11:55:53.107932 69 sagemaker_server.cc:134] Started Sagemaker HTTPService at 0.0.0.0:8080
I0404 11:55:53.160626 69 http_server.cc:162] Started Metrics Service at 0.0.0.0:8002
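
Compared with the failing runs above, the first message that appears only in this successful log is the `pinned_memory_manager.cc` line. In the 8-GPU runs, the last lines logged before the kill are the backend initialization messages, and the pinned memory pool line never appears. The pool sizes in the log and in the option table are given in bytes; a small sketch converting them to MiB (the `to_mib` helper is only for illustration):

```python
def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return num_bytes / 2**20


# Values copied from the option table of the successful 1-GPU run.
pinned_memory_pool_byte_size = 268435456
cuda_memory_pool_byte_size = 67108864  # listed for device 0 in the table

print(f"pinned memory pool: {to_mib(pinned_memory_pool_byte_size):.0f} MiB")
print(f"CUDA memory pool:   {to_mib(cuda_memory_pool_byte_size):.0f} MiB")
```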

tanmayv25 commented 2 years ago

21.07 looks a little old. Can you reproduce the issue with the latest Triton release (22.03)? Can you also share the output of dmesg so that we can see why tritonserver got killed?
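
When a process dies with `Killed`, the kernel log is usually the quickest place to check for an out-of-memory kill. A rough sketch of filtering saved dmesg text for OOM-killer messages, assuming the output has been captured to a string; the `find_oom_lines` helper and the pattern list are illustrative and not exhaustive, and the sample lines are of the kind that appear in the dmesg output further down in this thread:

```python
import re

# Kernel messages that commonly accompany an out-of-memory kill (not exhaustive).
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory|oom-kill:|Killed process",
    re.IGNORECASE,
)


def find_oom_lines(dmesg_text: str) -> list[str]:
    """Return the dmesg lines that mention the kernel OOM killer."""
    return [
        line.strip()
        for line in dmesg_text.splitlines()
        if OOM_PATTERN.search(line)
    ]


# Sample input: two lines taken from the dmesg output posted in this thread.
sample = """\
[34749.365356] perf: interrupt took too long (2608 > 2500), lowering kernel.perf_event_max_sample_rate to 76500
[36507.207709] python3 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
"""

for line in find_oom_lines(sample):
    print(line)
```

Real input would be the saved output of `dmesg` from the node that hosts the pod.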

shimoshida commented 2 years ago

@tanmayv25 Thanks for the reply. I have investigated with both 21.07 and the latest Triton. I confirmed via nvidia-smi that MIG is disabled.

Triton 22.03

logs of Triton startup (workspace is an empty directory)

root@sample-triton-latest-74fccc696d-58tgs:/opt/tritonserver# mpirun -n 1 --allow-run-as-root tritonserver --model-repository=/workspace
I0406 05:21:45.088080 4518 libtorch.cc:1309] TRITONBACKEND_Initialize: pytorch
I0406 05:21:45.088196 4518 libtorch.cc:1319] Triton TRITONBACKEND API version: 1.8
I0406 05:21:45.088200 4518 libtorch.cc:1325] 'pytorch' TRITONBACKEND API version: 1.8
2022-04-06 05:21:45.376048: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-04-06 05:21:45.411645: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0406 05:21:45.411737 4518 tensorflow.cc:2176] TRITONBACKEND_Initialize: tensorflow
I0406 05:21:45.411762 4518 tensorflow.cc:2186] Triton TRITONBACKEND API version: 1.8
I0406 05:21:45.411768 4518 tensorflow.cc:2192] 'tensorflow' TRITONBACKEND API version: 1.8
I0406 05:21:45.411772 4518 tensorflow.cc:2216] backend configuration:
{}
I0406 05:21:45.414874 4518 onnxruntime.cc:2319] TRITONBACKEND_Initialize: onnxruntime
I0406 05:21:45.414905 4518 onnxruntime.cc:2329] Triton TRITONBACKEND API version: 1.8
I0406 05:21:45.414909 4518 onnxruntime.cc:2335] 'onnxruntime' TRITONBACKEND API version: 1.8
I0406 05:21:45.414912 4518 onnxruntime.cc:2365] backend configuration:
{}
I0406 05:21:45.471530 4518 openvino.cc:1207] TRITONBACKEND_Initialize: openvino
I0406 05:21:45.471547 4518 openvino.cc:1217] Triton TRITONBACKEND API version: 1.8
I0406 05:21:45.471551 4518 openvino.cc:1223] 'openvino' TRITONBACKEND API version: 1.8
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node sample-triton-latest-74fccc696d-58tgs exited on signal 9 (Killed).
--------------------------------------------------------------------------

dmesg

``` [ 75.605864] NET: Registered protocol family 17 [ 75.608391] Key type dns_resolver registered [ 75.616090] RAS: Correctable Errors collector initialized. [ 75.619121] IPI shorthand broadcast: enabled [ 75.621523] sched_clock: Marking stable (75592246121, 29266100)->(75684668700, -63156479) [ 75.622879] sd 0:0:0:0: Attached scsi generic sg0 type 0 [ 75.627575] sd 0:0:0:0: [sda] 268435456 512-byte logical blocks: (137 GB/128 GiB) [ 75.630182] scsi 0:0:0:1: Attached scsi generic sg1 type 0 [ 75.633445] sd 0:0:0:1: [sdb] 6081740800 512-byte logical blocks: (3.11 TB/2.83 TiB) [ 75.633447] sd 0:0:0:1: [sdb] 4096-byte physical blocks [ 75.633485] sd 0:0:0:1: [sdb] Write Protect is off [ 75.633486] sd 0:0:0:1: [sdb] Mode Sense: 0f 00 10 00 [ 75.633559] sd 0:0:0:1: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA [ 75.633949] sd 0:0:0:0: [sda] 4096-byte physical blocks [ 75.636878] registered taskstats version 1 [ 75.636985] sr 0:0:0:2: [sr0] scsi-1 drive [ 75.636987] cdrom: Uniform CD-ROM driver Revision: 3.20 [ 75.637145] sdb: sdb1 [ 75.643724] sd 0:0:0:0: [sda] Write Protect is off [ 75.646419] Loading compiled-in X.509 certificates [ 75.647055] Loaded X.509 cert 'Build time autogenerated kernel key: d4cf42da249c8ddd4ece66c768885338dca13669' [ 75.650742] sd 0:0:0:0: [sda] Mode Sense: 0f 00 10 00 [ 75.653970] Loaded X.509 cert 'Canonical Ltd. Live Patch Signing: 14df34d1a87cf37625abec039ef2bf521249b969' [ 75.661103] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA [ 75.661304] sd 0:0:0:1: [sdb] Attached SCSI disk [ 75.663161] Loaded X.509 cert 'Canonical Ltd. Kernel Module Signing: 88f752e560a1e0737e31163a466ad7b70a850c19' [ 75.690900] blacklist: Loading compiled-in revocation X.509 certificates [ 75.694300] Loaded X.509 cert 'Canonical Ltd. Secure Boot Signing: 61482aa2830d0ab2ad5af10b7250da9033ddcef0' [ 75.699447] zswap: loaded using pool lzo/zbud [ 75.702264] GPT:Primary header thinks Alt. 
header is not at the end of the disk. [ 75.706166] GPT:62916607 != 268435455 [ 75.708513] GPT:Alternate GPT header not at the end of the disk. [ 75.708940] Key type ._fscrypt registered [ 75.711775] GPT:62916607 != 268435455 [ 75.712879] sr 0:0:0:2: Attached scsi CD-ROM sr0 [ 75.712944] sr 0:0:0:2: Attached scsi generic sg2 type 5 [ 75.714175] Key type .fscrypt registered [ 75.716393] GPT: Use GNU Parted to correct GPT errors. [ 75.716399] sda: sda1 sda14 sda15 [ 75.724033] Key type big_key registered [ 75.732128] Key type encrypted registered [ 75.734826] AppArmor: AppArmor sha1 policy hashing enabled [ 75.738252] integrity: Loading X.509 certificate: UEFI:db [ 75.741161] integrity: Loaded X.509 cert 'Microsoft Windows Production PCA 2011: a92902398e16c49778cd90f99e4f9ae17c55af53' [ 75.746957] integrity: Loading X.509 certificate: UEFI:MokListRT (MOKvar table) [ 75.751188] integrity: Loaded X.509 cert 'Canonical Ltd. Master Certificate Authority: ad91990bc22ab1f517048c23b6655a268e345a63' [ 75.757418] ima: No TPM chip found, activating TPM-bypass! 
[ 75.760412] ima: Allocated hash algorithm: sha1 [ 75.763133] ima: No architecture policies found [ 75.767228] evm: Initialising EVM extended attributes: [ 75.770068] evm: security.selinux [ 75.772114] evm: security.SMACK64 [ 75.774224] evm: security.SMACK64EXEC [ 75.776412] evm: security.SMACK64TRANSMUTE [ 75.778886] evm: security.SMACK64MMAP [ 75.781294] evm: security.apparmor [ 75.783380] evm: security.ima [ 75.785276] evm: security.capability [ 75.787376] evm: HMAC attrs: 0x1 [ 75.787386] sd 0:0:0:0: [sda] Attached SCSI disk [ 75.789733] PM: Magic number: 2:399:893 [ 75.794464] pci_bus 0004:00: hash matches [ 75.797026] processor cpu45: hash matches [ 75.799534] memory memory1119: hash matches [ 75.802419] rtc_cmos 00:02: setting system clock to 2022-04-05T16:54:11 UTC (1649177651) [ 76.530760] nvme nvme0: 32/0/0 default/read/poll queues [ 76.546336] nvme nvme2: 32/0/0 default/read/poll queues [ 76.546529] nvme nvme1: 32/0/0 default/read/poll queues [ 76.561861] nvme nvme4: 32/0/0 default/read/poll queues [ 76.561868] nvme nvme3: 32/0/0 default/read/poll queues [ 76.577426] nvme nvme5: 32/0/0 default/read/poll queues [ 76.577681] nvme nvme6: 32/0/0 default/read/poll queues [ 76.593174] nvme nvme7: 32/0/0 default/read/poll queues [ 76.674283] Freeing unused decrypted memory: 2040K [ 76.678780] Freeing unused kernel image memory: 2496K [ 76.681395] Write protecting the kernel read-only data: 26624k [ 76.685609] Freeing unused kernel image memory: 2000K [ 76.688863] Freeing unused kernel image memory: 1408K [ 76.702057] x86/mm: Checked W+X mappings: passed, no W+X pages found. 
[ 76.705251] Run /init as init process [ 76.809976] hv_utils: Registering HyperV Utility Driver [ 76.816253] hv_vmbus: registering driver hv_utils [ 76.820038] hv_utils: Shutdown IC version 3.2 [ 76.823035] hv_utils: Heartbeat IC version 3.0 [ 76.825519] hv_utils: TimeSync IC version 4.0 [ 76.829602] hv_vmbus: registering driver hyperv_fb [ 76.829633] hidraw: raw HID events driver (C) Jiri Kosina [ 76.832767] checking generic (40000000 300000) vs hw (40000000 300000) [ 76.835794] fb0: switching to hyperv_fb from EFI VGA [ 76.838620] Console: switching to colour dummy device 80x25 [ 76.841326] hyperv_fb: Screen resolution: 1152x864, Color depth: 32 [ 76.845179] Console: switching to colour frame buffer device 144x54 [ 76.845335] hv_vmbus: registering driver hv_netvsc [ 76.852425] hv_vmbus: registering driver hyperv_keyboard [ 76.855257] input: AT Translated Set 2 keyboard as /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0004:00/VMBUS:00/d34b2567-b9b6-42b9-8778-0a4ec0b955bf/serio0/input/input0 [ 76.863033] hv_vmbus: registering driver hid_hyperv [ 76.865800] input: Microsoft Vmbus HID-compliant Mouse as /devices/0006:045E:0621.0001/input/input1 [ 76.869933] hid 0006:045E:0621.0001: input: VIRTUAL HID v0.01 Mouse [Microsoft Vmbus HID-compliant Mouse] on [ 76.874780] cryptd: max_cpu_qlen set to 1000 [ 76.884248] AVX2 version of gcm_enc/dec engaged. 
[ 76.885704] mlx5_core 0101:00:00.0: firmware version: 20.28.4000 [ 76.886825] AES CTR mode by8 optimization enabled [ 77.194847] mlx5_core 0102:00:00.0: firmware version: 20.28.4000 [ 77.490984] mlx5_core 0103:00:00.0: firmware version: 20.28.4000 [ 77.788648] mlx5_core 0104:00:00.0: firmware version: 20.28.4000 [ 78.086568] mlx5_core 0105:00:00.0: firmware version: 20.28.4000 [ 78.210268] hv_netvsc 0022487a-2f9c-0022-487a-2f9c0022487a eth0: VF slot 1 added [ 78.214189] hv_pci 4bcd3f84-7657-4dfc-a455-e8d9c0291426: PCI VMBus probing: Using version 0x10003 [ 78.223776] hv_pci 4bcd3f84-7657-4dfc-a455-e8d9c0291426: PCI host bridge to bus 7657:00 [ 78.227547] pci_bus 7657:00: root bus resource [mem 0xff0100000-0xff01fffff window] [ 78.232736] pci 7657:00:02.0: [15b3:1016] type 00 class 0x020000 [ 78.242811] pci 7657:00:02.0: reg 0x10: [mem 0xff0100000-0xff01fffff 64bit pref] [ 78.272826] pci 7657:00:02.0: enabling Extended Tags [ 78.281713] pci 7657:00:02.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 7657:00:02.0 (capable of 63.008 Gb/s with 8 GT/s x8 link) [ 78.293599] pci 7657:00:02.0: BAR 0: assigned [mem 0xff0100000-0xff01fffff 64bit pref] [ 78.303844] mlx5_core 7657:00:02.0: firmware version: 14.30.1210 [ 78.313237] mlx5_core 7657:00:02.0: handle_hca_cap:551:(pid 686): log_max_qp value in current profile is 18, changing it to HCA capability limit (12) [ 78.384490] mlx5_core 0106:00:00.0: firmware version: 20.28.4000 [ 78.701673] mlx5_core 0107:00:00.0: firmware version: 20.28.4000 [ 78.998568] mlx5_core 0108:00:00.0: firmware version: 20.28.4000 [ 79.297202] mlx5_core 7657:00:02.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0) [ 79.476240] hv_netvsc 0022487a-2f9c-0022-487a-2f9c0022487a eth0: VF registering: eth1 [ 79.480481] mlx5_core 7657:00:02.0 eth1: joined to eth0 [ 79.483461] mlx5_core 7657:00:02.0 eth1: Disabling LRO, not supported in legacy RQ [ 79.495484] mlx5_core 7657:00:02.0 eth1: Disabling LRO, not supported in 
legacy RQ [ 79.500087] mlx5_core 7657:00:02.0 enP30295s1: renamed from eth1 [ 79.504322] mlx5_ib: Mellanox Connect-IB Infiniband driver v5.0-0 [ 80.708821] raid6: avx2x4 gen() 29177 MB/s [ 80.756819] raid6: avx2x4 xor() 14440 MB/s [ 80.804819] raid6: avx2x2 gen() 30067 MB/s [ 80.852818] raid6: avx2x2 xor() 18394 MB/s [ 80.900823] raid6: avx2x1 gen() 17277 MB/s [ 80.948821] raid6: avx2x1 xor() 15411 MB/s [ 80.996820] raid6: sse2x4 gen() 15107 MB/s [ 81.044820] raid6: sse2x4 xor() 8727 MB/s [ 81.092821] raid6: sse2x2 gen() 14898 MB/s [ 81.140819] raid6: sse2x2 xor() 9430 MB/s [ 81.188819] raid6: sse2x1 gen() 7167 MB/s [ 81.236818] raid6: sse2x1 xor() 7736 MB/s [ 81.239306] raid6: using algorithm avx2x2 gen() 30067 MB/s [ 81.242263] raid6: .... xor() 18394 MB/s, rmw enabled [ 81.245025] raid6: using avx2x2 recovery algorithm [ 81.249288] xor: automatically using best checksumming function avx [ 81.254615] async_tx: api initialized (async) [ 81.304657] Btrfs loaded, crc32c=crc32c-intel [ 81.428125] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null) [ 83.721777] systemd[1]: systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid) [ 83.732357] systemd[1]: Detected virtualization microsoft. [ 83.735669] systemd[1]: Detected architecture x86-64. [ 83.796607] systemd[1]: Set hostname to . [ 83.845887] systemd[1]: Initializing machine ID from random generator. [ 83.849317] systemd[1]: Installed transient /etc/machine-id file. [ 85.187369] systemd[1]: Unnecessary job for sys-devices-virtual-misc-vmbus\x21hv_fcopy.device was removed. [ 85.192403] systemd[1]: Unnecessary job for sys-devices-virtual-misc-vmbus\x21hv_vss.device was removed. [ 85.197544] systemd[1]: Reached target Swap. [ 85.201989] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point. 
[ 85.269489] EXT4-fs (sda1): re-mounted. Opts: discard [ 85.307801] Loading iSCSI transport class v2.0-870. [ 85.354016] iscsi: registered transport (tcp) [ 85.387981] RPC: Registered named UNIX socket transport module. [ 85.391347] RPC: Registered udp transport module. [ 85.394141] RPC: Registered tcp transport module. [ 85.396785] RPC: Registered tcp NFSv4.1 backchannel transport module. [ 85.444778] systemd-journald[1391]: Received request to flush runtime journal from PID 1 [ 85.449955] iscsi: registered transport (iser) [ 86.064179] hv_vmbus: registering driver hv_balloon [ 86.064613] hv_balloon: Using Dynamic Memory protocol version 2.0 [ 86.114685] mlx5_core 7657:00:02.0 enP30295s1: Disabling LRO, not supported in legacy RQ [ 86.592473] hv_utils: KVP IC version 4.0 [ 87.298286] audit: type=1400 audit(1649177662.927:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-start" pid=1876 comm="apparmor_parser" [ 87.310375] audit: type=1400 audit(1649177662.939:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default" pid=1874 comm="apparmor_parser" [ 87.310379] audit: type=1400 audit(1649177662.939:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-cgns" pid=1874 comm="apparmor_parser" [ 87.310381] audit: type=1400 audit(1649177662.939:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-with-mounting" pid=1874 comm="apparmor_parser" [ 87.310383] audit: type=1400 audit(1649177662.939:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-with-nesting" pid=1874 comm="apparmor_parser" [ 87.310563] audit: type=1400 audit(1649177662.939:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=1877 comm="apparmor_parser" [ 87.310566] audit: type=1400 audit(1649177662.939:8): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="man_filter" pid=1877 comm="apparmor_parser" [ 87.310568] audit: type=1400 audit(1649177662.939:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=1877 comm="apparmor_parser" [ 87.340696] audit: type=1400 audit(1649177662.967:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/chronyd" pid=1878 comm="apparmor_parser" [ 87.362943] audit: type=1400 audit(1649177662.991:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/sbin/dhclient" pid=1875 comm="apparmor_parser" [ 88.421758] nvidia: loading out-of-tree module taints kernel. [ 88.421769] nvidia: module license 'NVIDIA' taints kernel. [ 88.421770] Disabling lock debugging due to kernel taint [ 88.464882] nvidia: module verification failed: signature and/or required key missing - tainting kernel [ 88.475267] nvidia-nvlink: Nvlink Core is being initialized, major device number 238 [ 88.475336] nvidia-nvswitch: Probing device 0005:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 88.475660] nvidia-nvswitch 0005:00:00.0: can't derive routing for PCI INT A [ 88.475661] nvidia-nvswitch 0005:00:00.0: PCI INT A: no GSI [ 89.758915] nvidia-nvswitch0: using MSI [ 90.069151] nvidia-nvswitch: Probing device 0006:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 90.069351] nvidia-nvswitch 0006:00:00.0: can't derive routing for PCI INT A [ 90.069352] nvidia-nvswitch 0006:00:00.0: PCI INT A: no GSI [ 91.279989] nvidia-nvswitch1: using MSI [ 91.584500] nvidia-nvswitch: Probing device 0007:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 91.584706] nvidia-nvswitch 0007:00:00.0: can't derive routing for PCI INT A [ 91.584707] nvidia-nvswitch 0007:00:00.0: PCI INT A: no GSI [ 92.889301] nvidia-nvswitch2: using MSI [ 93.204792] nvidia-nvswitch: Probing device 0008:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 93.205006] nvidia-nvswitch 
0008:00:00.0: can't derive routing for PCI INT A [ 93.205007] nvidia-nvswitch 0008:00:00.0: PCI INT A: no GSI [ 94.541322] nvidia-nvswitch3: using MSI [ 94.852930] nvidia-nvswitch: Probing device 0009:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 94.853136] nvidia-nvswitch 0009:00:00.0: can't derive routing for PCI INT A [ 94.853137] nvidia-nvswitch 0009:00:00.0: PCI INT A: no GSI [ 96.027073] UDF-fs: INFO Mounting volume 'UDF Volume', timestamp 2022/04/06 00:00 (1000) [ 96.213666] nvidia-nvswitch4: using MSI [ 96.527839] nvidia-nvswitch: Probing device 000a:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 96.528032] nvidia-nvswitch 000a:00:00.0: can't derive routing for PCI INT A [ 96.528033] nvidia-nvswitch 000a:00:00.0: PCI INT A: no GSI [ 97.845519] nvidia-nvswitch5: using MSI [ 98.162696] nvidia 0001:00:00.0: can't derive routing for PCI INT A [ 98.162698] nvidia 0001:00:00.0: PCI INT A: no GSI [ 98.214230] nvidia 0002:00:00.0: can't derive routing for PCI INT A [ 98.214232] nvidia 0002:00:00.0: PCI INT A: no GSI [ 98.260614] nvidia 0003:00:00.0: can't derive routing for PCI INT A [ 98.260616] nvidia 0003:00:00.0: PCI INT A: no GSI [ 98.312934] nvidia 0004:00:00.0: can't derive routing for PCI INT A [ 98.312935] nvidia 0004:00:00.0: PCI INT A: no GSI [ 98.361160] nvidia 000b:00:00.0: can't derive routing for PCI INT A [ 98.361162] nvidia 000b:00:00.0: PCI INT A: no GSI [ 98.407849] nvidia 000c:00:00.0: can't derive routing for PCI INT A [ 98.407850] nvidia 000c:00:00.0: PCI INT A: no GSI [ 98.453516] nvidia 000d:00:00.0: can't derive routing for PCI INT A [ 98.453519] nvidia 000d:00:00.0: PCI INT A: no GSI [ 98.502160] nvidia 000e:00:00.0: can't derive routing for PCI INT A [ 98.502161] nvidia 000e:00:00.0: PCI INT A: no GSI [ 98.548922] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.57.02 Tue Jul 13 16:14:05 UTC 2021 [ 100.720369] mlx5_core 7657:00:02.0 enP30295s1: Link up [ 100.723317] hv_netvsc 
0022487a-2f9c-0022-487a-2f9c0022487a eth0: Data path switched to VF: enP30295s1 [ 100.725358] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 100.998620] IPv4: martian source 255.255.255.255 from 168.63.129.16, on dev eth0 [ 100.998645] ll header: 00000000: 00 22 48 7a 2f 9c 12 34 56 78 9a bc 08 00 [ 101.014145] IPv4: martian source 255.255.255.255 from 168.63.129.16, on dev eth0 [ 101.014158] ll header: 00000000: 00 22 48 7a 2f 9c 12 34 56 78 9a bc 08 00 [ 101.200683] hv_netvsc 0022487a-2f9c-0022-487a-2f9c0022487a eth0: Data path switched from VF: enP30295s1 [ 101.563292] mlx5_core 7657:00:02.0 enP30295s1: Disabling LRO, not supported in legacy RQ [ 102.208621] mlx5_core 7657:00:02.0 enP30295s1: Link up [ 102.209932] hv_netvsc 0022487a-2f9c-0022-487a-2f9c0022487a eth0: Data path switched to VF: enP30295s1 [ 104.107899] IPv4: martian source 255.255.255.255 from 168.63.129.16, on dev eth0 [ 104.107922] ll header: 00000000: 00 22 48 7a 2f 9c 12 34 56 78 9a bc 08 00 [ 104.123433] IPv4: martian source 255.255.255.255 from 168.63.129.16, on dev eth0 [ 104.123448] ll header: 00000000: 00 22 48 7a 2f 9c 12 34 56 78 9a bc 08 00 [ 106.339018] EXT4-fs (sda1): resizing filesystem from 7836155 to 33526011 blocks [ 111.610942] EXT4-fs (sda1): resized filesystem to 33526011 [ 111.906443] sdb: sdb1 [ 113.006932] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null) [ 114.321971] bpfilter: Loaded bpfilter_umh pid 2703 [ 114.322204] Started bpfilter [ 114.330695] new mount options do not match the existing superblock, will be ignored [ 114.596669] nvidia-uvm: Loaded the UVM driver, major device number 236. 
[ 114.813570] nvidia-nvswitch0: open (major=237) [ 114.814395] nvidia-nvswitch1: open (major=237) [ 114.815041] nvidia-nvswitch2: open (major=237) [ 114.815678] nvidia-nvswitch3: open (major=237) [ 114.816312] nvidia-nvswitch4: open (major=237) [ 114.816987] nvidia-nvswitch5: open (major=237) [ 118.497718] aufs 5.4.3-20200302 [ 127.337669] nvidia-nvlink: nvlink driver open [ 127.337673] nvidia-nvlink: nvlink driver close [ 127.337674] nvidia-nvlink: nvlink driver open [ 134.170252] hv_balloon: Max. dynamic memory size: 921600 MB [ 137.522299] nvidia-nvswitch0: open (major=237) [ 137.522307] nvidia-nvswitch0: open (major=237) [ 137.522310] nvidia-nvswitch1: open (major=237) [ 137.553546] nvidia-nvswitch1: open (major=237) [ 137.553551] nvidia-nvswitch2: open (major=237) [ 137.553554] nvidia-nvswitch2: open (major=237) [ 137.553557] nvidia-nvswitch3: open (major=237) [ 137.553559] nvidia-nvswitch3: open (major=237) [ 137.553562] nvidia-nvswitch4: open (major=237) [ 137.553565] nvidia-nvswitch4: open (major=237) [ 137.553567] nvidia-nvswitch5: open (major=237) [ 137.553570] nvidia-nvswitch5: open (major=237) [ 178.856173] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this. [ 178.858986] Bridge firewalling registered [ 180.034442] systemd[1]: Stopping Journal Service... [ 180.034497] systemd-journald[1391]: Received SIGTERM from PID 1 (systemd). [ 180.043291] systemd[1]: Stopped Journal Service. [ 180.046229] systemd[1]: Starting Journal Service... [ 180.063114] systemd[1]: Started Journal Service. 
[ 199.922850] cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation [ 200.147592] IPv6: ADDRCONF(NETDEV_CHANGE): azved375880b3e2: link becomes ready [ 200.147876] IPv6: ADDRCONF(NETDEV_CHANGE): azved375880b3e: link becomes ready [ 200.148623] eth0: renamed from azved375880b3e2 [ 200.513690] kauditd_printk_skb: 4 callbacks suppressed [ 200.513691] audit: type=1400 audit(1649177776.139:16): apparmor="STATUS" operation="profile_load" profile="unconfined" name="cri-containerd.apparmor.d" pid=6852 comm="apparmor_parser" [ 200.514225] audit: type=1400 audit(1649177776.143:17): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="cri-containerd.apparmor.d" pid=6853 comm="apparmor_parser" [ 207.658221] IPv6: ADDRCONF(NETDEV_CHANGE): azv509f35fb0bc2: link becomes ready [ 207.658285] IPv6: ADDRCONF(NETDEV_CHANGE): azv509f35fb0bc: link becomes ready [ 207.659187] eth0: renamed from azv509f35fb0bc2 [ 207.946216] IPv6: ADDRCONF(NETDEV_CHANGE): azv42a008a89bc: link becomes ready [ 207.947113] eth0: renamed from azv42a008a89bc2 [ 208.078715] IPv6: ADDRCONF(NETDEV_CHANGE): azva99aa9e7048: link becomes ready [ 208.079546] eth0: renamed from azva99aa9e70482 [ 280.419599] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 470.57.02 Tue Jul 13 16:06:24 UTC 2021 [34749.365356] perf: interrupt took too long (2608 > 2500), lowering kernel.perf_event_max_sample_rate to 76500 [36507.207709] python3 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0 [36507.207712] CPU: 0 PID: 97420 Comm: python3 Tainted: P OE 5.4.0-1072-azure #75~18.04.1-Ubuntu [36507.207713] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 10/27/2020 [36507.207714] Call Trace: [36507.207720] dump_stack+0x57/0x6d [36507.207724] dump_header+0x4f/0x200 [36507.207725] oom_kill_process+0xe6/0x120 [36507.207727] out_of_memory+0x117/0x540 
[36507.207729] mem_cgroup_out_of_memory+0xbb/0xd0 [36507.207731] try_charge+0x762/0x7c0 [36507.207733] ? __alloc_pages_nodemask+0x153/0x320 [36507.207734] mem_cgroup_try_charge+0x75/0x190 [36507.207735] mem_cgroup_try_charge_delay+0x22/0x50 [36507.207738] __handle_mm_fault+0x943/0x1330 [36507.207739] handle_mm_fault+0xb7/0x200 [36507.207742] __do_page_fault+0x29c/0x4c0 [36507.207743] do_page_fault+0x35/0x110 [36507.207745] page_fault+0x39/0x40 [36507.207747] RIP: 0033:0x4b9692 [36507.207748] Code: 8d 50 ff 49 89 c8 4c 2b 05 9b 55 5b 00 48 8b 41 08 49 bd ab aa aa aa aa aa aa aa 49 c1 f8 04 4d 0f af c5 48 8d b8 00 10 00 00 40 24 ff ff 00 00 44 89 40 20 48 89 79 08 89 51 10 85 d2 0f 84 [36507.207749] RSP: 002b:00007ffebb0e0f50 EFLAGS: 00010a17 [36507.207751] RAX: 00007f6e42791000 RBX: 0000000000000002 RCX: 0000000002717610 [36507.207752] RDX: 0000000000000009 RSI: 0000000000000001 RDI: 00007f6e42792000 [36507.207752] RBP: 0000000000000142 R08: 0000000000000025 R09: 0000000000000008 [36507.207753] R10: 0000000000000001 R11: 0000000000000017 R12: 00007f6e42790fd0 [36507.207753] R13: aaaaaaaaaaaaaaab R14: 0000000000000004 R15: 00000000027a4110 [36507.207755] memory: usage 30720kB, limit 30720kB, failcnt 463 [36507.207755] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0 [36507.207756] kmem: usage 15696kB, limit 9007199254740988kB, failcnt 0 [36507.207756] Memory cgroup stats for /azure.slice/azure-walinuxagent.slice/azure-walinuxagent-logcollector.slice: [36507.207766] anon 15114240 [36507.207766] file 0 [36507.207766] kernel_stack 0 [36507.207766] slab 10932224 [36507.207766] sock 0 [36507.207766] shmem 0 [36507.207766] file_mapped 0 [36507.207766] file_dirty 270336 [36507.207766] file_writeback 0 [36507.207766] anon_thp 0 [36507.207766] inactive_anon 0 [36507.207766] active_anon 15679488 [36507.207766] inactive_file 0 [36507.207766] active_file 159744 [36507.207766] unevictable 0 [36507.207766] slab_reclaimable 991232 [36507.207766] slab_unreclaimable 
9940992 [36507.207766] pgfault 139623 [36507.207766] pgmajfault 0 [36507.207766] workingset_refault 132 [36507.207766] workingset_activate 0 [36507.207766] workingset_nodereclaim 0 [36507.207766] pgrefill 127 [36507.207766] pgscan 860 [36507.207766] pgsteal 835 [36507.207766] pgactivate 0 [36507.207766] pgdeactivate 127 [36507.207766] pglazyfree 0 [36507.207766] pglazyfreed 0 [36507.207766] thp_fault_alloc 0 [36507.207766] Tasks state (memory values in pages): [36507.207767] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name [36507.207768] [ 97420] 0 97420 20527 6056 208896 0 0 python3 [36507.207769] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0-3,oom_memcg=/azure.slice/azure-walinuxagent.slice/azure-walinuxagent-logcollector.slice,task_memcg=/azure.slice/azure-walinuxagent.slice/azure-walinuxagent-logcollector.slice,task=python3,pid=97420,uid=0 [36507.207776] Memory cgroup out of memory: Killed process 97420 (python3) total-vm:82108kB, anon-rss:14956kB, file-rss:9268kB, shmem-rss:0kB, UID:0 pgtables:204kB oom_score_adj:0 [36507.216397] oom_reaper: reaped process 97420 (python3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [44517.301404] tritonserver invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=999 [44517.301407] CPU: 4 PID: 51660 Comm: tritonserver Tainted: P OE 5.4.0-1072-azure #75~18.04.1-Ubuntu [44517.301408] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 10/27/2020 [44517.301409] Call Trace: [44517.301415] dump_stack+0x57/0x6d [44517.301420] dump_header+0x4f/0x200 [44517.301422] oom_kill_process+0xe6/0x120 [44517.301423] out_of_memory+0x117/0x540 [44517.301426] mem_cgroup_out_of_memory+0xbb/0xd0 [44517.301427] try_charge+0x762/0x7c0 [44517.301431] ? blk_flush_plug_list+0xd1/0x100 [44517.301433] mem_cgroup_try_charge+0x75/0x190 [44517.301435] __add_to_page_cache_locked+0x21a/0x3d0 [44517.301437] ? 
scan_shadow_nodes+0x30/0x30 [44517.301438] add_to_page_cache_lru+0x4f/0xd0 [44517.301440] pagecache_get_page+0xea/0x2c0 [44517.301441] filemap_fault+0x669/0xb60 [44517.301442] ? unlock_page_memcg+0x12/0x20 [44517.301444] ? page_add_file_rmap+0x13a/0x180 [44517.301446] ? xas_load+0xc/0x80 [44517.301447] ? xas_find+0x16f/0x1b0 [44517.301448] ? filemap_map_pages+0x17d/0x3b0 [44517.301451] ext4_filemap_fault+0x31/0x50 [44517.301453] __do_fault+0x57/0x110 [44517.301455] __handle_mm_fault+0xdf1/0x1330 [44517.301457] handle_mm_fault+0xb7/0x200 [44517.301460] __do_page_fault+0x29c/0x4c0 [44517.301461] do_page_fault+0x35/0x110 [44517.301463] page_fault+0x39/0x40 [44517.301465] RIP: 0033:0x7f11b3d37c88 [44517.301469] Code: Bad RIP value. [44517.301470] RSP: 002b:00007ffcaf6fae90 EFLAGS: 00010216 [44517.301471] RAX: 0000000000000000 RBX: 00005560fd5324f0 RCX: 00005560fd531510 [44517.301472] RDX: 0000000000000018 RSI: 0000000000000084 RDI: 00005560fd5324f0 [44517.301473] RBP: 00005560fd5325f8 R08: 00005560fd531520 R09: 00005560fd2d12c0 [44517.301473] R10: ffffffffffffffff R11: 00007f11ea0b4be0 R12: 0000000000000008 [44517.301474] R13: 0000000000200000 R14: 00005560fd2db440 R15: 0000000000000000 [44517.301476] memory: usage 131072kB, limit 131072kB, failcnt 15691 [44517.301477] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0 [44517.301477] kmem: usage 19932kB, limit 9007199254740988kB, failcnt 0 [44517.301478] Memory cgroup stats for /kubepods/burstable/pod45dbec92-0481-4b70-b999-158b7cf2f909: [44517.301490] anon 110723072 [44517.301490] file 4108288 [44517.301490] kernel_stack 737280 [44517.301490] slab 11878400 [44517.301490] sock 0 [44517.301490] shmem 4190208 [44517.301490] file_mapped 4190208 [44517.301490] file_dirty 0 [44517.301490] file_writeback 0 [44517.301490] anon_thp 18874368 [44517.301490] inactive_anon 4190208 [44517.301490] active_anon 111054848 [44517.301490] inactive_file 0 [44517.301490] active_file 172032 [44517.301490] unevictable 0 
[44517.301490] slab_reclaimable 2527232 [44517.301490] slab_unreclaimable 9351168 [44517.301490] pgfault 487938 [44517.301490] pgmajfault 429 [44517.301490] workingset_refault 14388 [44517.301490] workingset_activate 198 [44517.301490] workingset_nodereclaim 0 [44517.301490] pgrefill 14559 [44517.301490] pgscan 103484 [44517.301490] pgsteal 15359 [44517.301490] pgactivate 14190 [44517.301490] pgdeactivate 14376 [44517.301490] pglazyfree 0 [44517.301491] Tasks state (memory values in pages): [44517.301492] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name [44517.301494] [ 8132] 65535 8132 241 1 28672 0 -998 pause [44517.301496] [ 11938] 0 11938 994 756 45056 0 999 bash [44517.301497] [ 50276] 0 50276 1060 892 53248 0 999 bash [44517.301499] [ 51500] 0 51500 627 148 40960 0 999 sleep [44517.301500] [ 51653] 0 51653 25429 2395 77824 0 999 mpirun [44517.301501] [ 51660] 0 51660 9879422 107763 1716224 0 999 tritonserver [44517.301502] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=92205d550d71282b3f6d25fd1844c3d252b0ceee832ed33832d1e8a4ad0c8a76,mems_allowed=0-3,oom_memcg=/kubepods/burstable/pod45dbec92-0481-4b70-b999-158b7cf2f909,task_memcg=/kubepods/burstable/pod45dbec92-0481-4b70-b999-158b7cf2f909/92205d550d71282b3f6d25fd1844c3d252b0ceee832ed33832d1e8a4ad0c8a76,task=tritonserver,pid=51660,uid=0 [44517.301562] Memory cgroup out of memory: Killed process 51660 (tritonserver) total-vm:39517688kB, anon-rss:102208kB, file-rss:324748kB, shmem-rss:4096kB, UID:0 pgtables:1676kB oom_score_adj:999 [44517.317788] oom_reaper: reaped process 51660 (tritonserver), now anon-rss:0kB, file-rss:71760kB, shmem-rss:4096kB [44930.378048] tritonserver invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=999 [44930.378052] CPU: 76 PID: 59249 Comm: tritonserver Tainted: P OE 5.4.0-1072-azure #75~18.04.1-Ubuntu [44930.378053] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 10/27/2020 
[44930.378054] Call Trace: [44930.378065] dump_stack+0x57/0x6d [44930.378070] dump_header+0x4f/0x200 [44930.378071] oom_kill_process+0xe6/0x120 [44930.378073] out_of_memory+0x117/0x540 [44930.378077] mem_cgroup_out_of_memory+0xbb/0xd0 [44930.378078] try_charge+0x762/0x7c0 [44930.378082] ? __alloc_pages_nodemask+0x153/0x320 [44930.378083] mem_cgroup_try_charge+0x75/0x190 [44930.378084] mem_cgroup_try_charge_delay+0x22/0x50 [44930.378087] __handle_mm_fault+0x943/0x1330 [44930.378089] handle_mm_fault+0xb7/0x200 [44930.378092] __do_page_fault+0x29c/0x4c0 [44930.378093] do_page_fault+0x35/0x110 [44930.378096] page_fault+0x39/0x40 [44930.378098] RIP: 0033:0x7fcf23292d51 [44930.378100] Code: 3b 15 8b 57 13 00 0f 87 cc 00 00 00 0f 10 06 0f 10 4e 10 0f 10 56 20 0f 10 5e 30 48 83 c6 40 48 83 ea 40 0f 29 07 0f 29 4f 10 <0f> 29 57 20 0f 29 5f 30 48 83 c7 40 48 83 fa 40 77 d0 0f 11 29 0f [44930.378101] RSP: 002b:00007ffe43e292b8 EFLAGS: 00010202 [44930.378102] RAX: 00007fce51ee1010 RBX: 00007ffe43e29440 RCX: 00007fce51fedd90 [44930.378102] RDX: 00000000000a8d80 RSI: 00007fce61f4aa50 RDI: 00007fce51f44fe0 [44930.378103] RBP: 00007ffe43e29510 R08: fffffffffffffff0 R09: 0000000000000000 [44930.378103] R10: 0000000000000022 R11: 00007fce51ee1010 R12: 00005598788d0680 [44930.378104] R13: 0000559879d1dac0 R14: 0000559879d05fe0 R15: 000000000010cd90 [44930.378106] memory: usage 131072kB, limit 131072kB, failcnt 95 [44930.378106] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0 [44930.378107] kmem: usage 21928kB, limit 9007199254740988kB, failcnt 0 [44930.378107] Memory cgroup stats for /kubepods/burstable/podcb040dc7-7546-49e2-ae77-320dc45ed2ce: [44930.378117] anon 106221568 [44930.378117] file 6344704 [44930.378117] kernel_stack 552960 [44930.378117] slab 12148736 [44930.378117] sock 0 [44930.378117] shmem 6217728 [44930.378117] file_mapped 6217728 [44930.378117] file_dirty 0 [44930.378117] file_writeback 0 [44930.378117] anon_thp 18874368 [44930.378117] inactive_anon 
6217728 [44930.378117] active_anon 106737664 [44930.378117] inactive_file 40960 [44930.378117] active_file 0 [44930.378117] unevictable 0 [44930.378117] slab_reclaimable 4120576 [44930.378117] slab_unreclaimable 8028160 [44930.378117] pgfault 491865 [44930.378117] pgmajfault 0 [44930.378117] workingset_refault 0 [44930.378117] workingset_activate 0 [44930.378117] workingset_nodereclaim 0 [44930.378117] pgrefill 251 [44930.378117] pgscan 409 [44930.378117] pgsteal 52 [44930.378117] pgactivate 264 [44930.378117] pgdeactivate 251 [44930.378117] pglazyfree 0 [44930.378117] pglazyfreed 0 [44930.378118] Tasks state (memory values in pages): [44930.378118] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name [44930.378120] [ 7857] 65535 7857 241 1 28672 0 -998 pause [44930.378122] [ 9563] 0 9563 974 720 49152 0 999 bash [44930.378124] [ 57940] 0 57940 1060 896 40960 0 999 bash [44930.378125] [ 59165] 0 59165 630 146 45056 0 999 sleep [44930.378127] [ 59241] 0 59241 25458 2476 77824 0 999 mpirun [44930.378128] [ 59249] 0 59249 9743705 105957 1683456 0 999 tritonserver [44930.378129] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=a82e417e149bceb2f24a3c73f90fd97b033185a234846eff07c07c4e32ca59a0,mems_allowed=0-3,oom_memcg=/kubepods/burstable/podcb040dc7-7546-49e2-ae77-320dc45ed2ce,task_memcg=/kubepods/burstable/podcb040dc7-7546-49e2-ae77-320dc45ed2ce/a82e417e149bceb2f24a3c73f90fd97b033185a234846eff07c07c4e32ca59a0,task=tritonserver,pid=59249,uid=0 [44930.378157] Memory cgroup out of memory: Killed process 59249 (tritonserver) total-vm:38974820kB, anon-rss:98544kB, file-rss:319140kB, shmem-rss:6144kB, UID:0 pgtables:1644kB oom_score_adj:999 [44930.395289] oom_reaper: reaped process 59249 (tritonserver), now anon-rss:0kB, file-rss:73808kB, shmem-rss:6144kB [45019.521612] tritonserver invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=999 [45019.521617] CPU: 78 PID: 60997 Comm: tritonserver Tainted: P OE 5.4.0-1072-azure 
#75~18.04.1-Ubuntu [45019.521618] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 10/27/2020 [45019.521619] Call Trace: [45019.521629] dump_stack+0x57/0x6d [45019.521635] dump_header+0x4f/0x200 [45019.521636] oom_kill_process+0xe6/0x120 [45019.521637] out_of_memory+0x117/0x540 [45019.521641] mem_cgroup_out_of_memory+0xbb/0xd0 [45019.521642] try_charge+0x762/0x7c0 [45019.521646] ? __alloc_pages_nodemask+0x153/0x320 [45019.521647] mem_cgroup_try_charge+0x75/0x190 [45019.521647] mem_cgroup_try_charge_delay+0x22/0x50 [45019.521651] __handle_mm_fault+0x943/0x1330 [45019.521653] handle_mm_fault+0xb7/0x200 [45019.521656] __do_page_fault+0x29c/0x4c0 [45019.521657] do_page_fault+0x35/0x110 [45019.521659] page_fault+0x39/0x40 [45019.521662] RIP: 0033:0x7f0b8901c76c [45019.521663] Code: 28 48 8b 6c 24 20 48 39 d3 48 89 4b 60 0f 95 c2 48 83 c8 01 49 83 c0 10 0f b6 d2 48 c1 e2 02 4c 09 ea 48 83 ca 01 49 89 50 f8 <48> 89 41 08 8b 05 16 67 15 00 85 c0 0f 84 c4 f6 ff ff e9 a8 fc ff [45019.521664] RSP: 002b:00007ffd63dbae50 EFLAGS: 00010202 [45019.521665] RAX: 0000000000005511 RBX: 00007f0b89170b80 RCX: 00005555c0515af0 [45019.521666] RDX: 0000000000001601 RSI: 0000000000000000 RDI: 0000000000000003 [45019.521666] RBP: 00000000000015ef R08: 00005555c0514500 R09: 000000000000007c [45019.521667] R10: 00005555bcf95010 R11: 00007f0b89170be0 R12: ffffffffffffff90 [45019.521667] R13: 0000000000001600 R14: ffffffffffffff90 R15: 000000000000015e [45019.521669] memory: usage 131072kB, limit 131072kB, failcnt 194 [45019.521670] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0 [45019.521670] kmem: usage 29908kB, limit 9007199254740988kB, failcnt 0 [45019.521671] Memory cgroup stats for /kubepods/burstable/podcb040dc7-7546-49e2-ae77-320dc45ed2ce: [45019.521685] anon 105197568 [45019.521685] file 0 [45019.521685] kernel_stack 663552 [45019.521685] slab 18673664 [45019.521685] sock 0 [45019.521685] shmem 0 [45019.521685] file_mapped 
0 [45019.521685] file_dirty 0 [45019.521685] file_writeback 0 [45019.521685] anon_thp 20971520 [45019.521685] inactive_anon 0 [45019.521685] active_anon 105385984 [45019.521685] inactive_file 69632 [45019.521685] active_file 0 [45019.521685] unevictable 0 [45019.521685] slab_reclaimable 5844992 [45019.521685] slab_unreclaimable 12828672 [45019.521685] pgfault 525228 [45019.521685] pgmajfault 0 [45019.521685] workingset_refault 0 [45019.521685] workingset_activate 0 [45019.521685] workingset_nodereclaim 0 [45019.521685] pgrefill 465 [45019.521685] pgscan 778 [45019.521685] pgsteal 125 [45019.521685] pgactivate 495 [45019.521685] pgdeactivate 465 [45019.521685] pglazyfree 0 [45019.521685] pglazyfreed 0 [45019.521685] thp_fault_alloc 0 [45019.521686] Tasks state (memory values in pages): [45019.521686] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name [45019.521688] [ 7857] 65535 7857 241 1 28672 0 -998 pause [45019.521690] [ 9563] 0 9563 974 720 49152 0 999 bash [45019.521691] [ 60753] 0 60753 1060 887 45056 0 999 bash [45019.521692] [ 60919] 0 60919 630 147 40960 0 999 sleep [45019.521694] [ 60992] 0 60992 25458 2426 77824 0 999 mpirun [45019.521695] [ 60997] 0 60997 9695196 83217 1458176 0 999 tritonserver [45019.521695] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=a82e417e149bceb2f24a3c73f90fd97b033185a234846eff07c07c4e32ca59a0,mems_allowed=0-3,oom_memcg=/kubepods/burstable/podcb040dc7-7546-49e2-ae77-320dc45ed2ce,task_memcg=/kubepods/burstable/podcb040dc7-7546-49e2-ae77-320dc45ed2ce/a82e417e149bceb2f24a3c73f90fd97b033185a234846eff07c07c4e32ca59a0,task=tritonserver,pid=60997,uid=0 [45019.521714] Memory cgroup out of memory: Killed process 60997 (tritonserver) total-vm:38780784kB, anon-rss:96564kB, file-rss:236304kB, shmem-rss:0kB, UID:0 pgtables:1424kB oom_score_adj:999 [45019.542213] oom_reaper: reaped process 60997 (tritonserver), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB ```
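Reading the OOM records in the dump above: every `tritonserver` kill is tagged `constraint=CONSTRAINT_MEMCG` and reports `memory: usage 131072kB, limit 131072kB`. In other words, the pod's memory cgroup hit a limit of 131072 kB (128 MiB); the node itself did not run out of memory. This suggests checking the effective memory limit applied to the pod. The following is a minimal sketch for pulling those fields out of a pasted `dmesg` dump. The function name `summarize_oom` and the `SAMPLE` string are illustrative (the sample is shortened from the log above), and the regular expressions assume the kernel 5.4 message layout shown in this issue.

```python
import re

# Patterns assume the kernel 5.4 OOM message layout seen in this issue.
LIMIT_RE = re.compile(r"memory: usage (\d+)kB, limit (\d+)kB, failcnt (\d+)")
MEMCG_RE = re.compile(r"oom_memcg=([^,]+)")
KILL_RE = re.compile(
    r"Memory cgroup out of memory: Killed process (\d+) \(([^)]+)\) "
    r"total-vm:(\d+)kB, anon-rss:(\d+)kB, file-rss:(\d+)kB"
)


def summarize_oom(dmesg_text):
    """Return one dict per memcg OOM kill found in dmesg_text.

    Each OOM report contains, in order, a 'memory: usage ...' line, an
    'oom-kill:' line with oom_memcg=..., and a 'Killed process' line, so
    the three match lists are paired positionally.
    """
    limits = LIMIT_RE.findall(dmesg_text)
    memcgs = MEMCG_RE.findall(dmesg_text)
    kills = KILL_RE.findall(dmesg_text)
    events = []
    for (usage_kb, limit_kb, failcnt), memcg, kill in zip(limits, memcgs, kills):
        pid, name, total_vm_kb, anon_rss_kb, file_rss_kb = kill
        events.append({
            "process": name,
            "pid": int(pid),
            "memcg": memcg,
            "usage_kb": int(usage_kb),
            "limit_mib": int(limit_kb) / 1024,
            "failcnt": int(failcnt),
            "rss_kb": int(anon_rss_kb) + int(file_rss_kb),
        })
    return events


# Shortened excerpt of one OOM report from the log above (illustrative input).
SAMPLE = (
    "[44517.301476] memory: usage 131072kB, limit 131072kB, failcnt 15691 "
    "[44517.301477] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0 "
    "[44517.301502] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),"
    "oom_memcg=/kubepods/burstable/pod45dbec92-0481-4b70-b999-158b7cf2f909,"
    "task=tritonserver,pid=51660,uid=0 "
    "[44517.301562] Memory cgroup out of memory: Killed process 51660 "
    "(tritonserver) total-vm:39517688kB, anon-rss:102208kB, file-rss:324748kB, "
    "shmem-rss:4096kB, UID:0 pgtables:1676kB oom_score_adj:999"
)

for event in summarize_oom(SAMPLE):
    print(event)
```

To use it on a real node, save `dmesg` to a file and pass the file contents to `summarize_oom`. Because the matches are paired by position, a truncated report at the start or end of the dump may misalign the results.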

Triton 21.07

dmesg

```
root@sample-triton-only-747d5f564b-vqk6s:/opt/tritonserver# dmesg
[ 0.000000] Linux version 5.4.0-1072-azure (buildd@lcy02-amd64-106) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #75~18.04.1-Ubuntu SMP Wed Mar 2 14:41:08 UTC 2022 (Ubuntu 5.4.0-1072.75~18.04.1-azure 5.4.166)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1072-azure root=UUID=175da400-d3a9-4b28-bf4f-85f7ca8784f5 ro console=tty1 console=ttyS0 earlyprintk=ttyS0
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Hygon HygonGenuine
[ 0.000000] Centaur CentaurHauls
[ 0.000000] zhaoxin Shanghai
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.
[ 0.000000] BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable [ 0.000000] BIOS-e820: [mem 0x00000000000c0000-0x00000000000fffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000003ed49fff] usable [ 0.000000] BIOS-e820: [mem 0x000000003ed4a000-0x000000003ed4bfff] ACPI data [ 0.000000] BIOS-e820: [mem 0x000000003ed4c000-0x000000003ee7afff] usable [ 0.000000] BIOS-e820: [mem 0x000000003ee7b000-0x000000003ee99fff] ACPI data [ 0.000000] BIOS-e820: [mem 0x000000003ee9a000-0x000000003eef1fff] usable [ 0.000000] BIOS-e820: [mem 0x000000003eef2000-0x000000003ef1afff] reserved [ 0.000000] BIOS-e820: [mem 0x000000003ef1b000-0x000000003ff9afff] usable [ 0.000000] BIOS-e820: [mem 0x000000003ff9b000-0x000000003fff2fff] reserved [ 0.000000] BIOS-e820: [mem 0x000000003fff3000-0x000000003fffafff] ACPI data [ 0.000000] BIOS-e820: [mem 0x000000003fffb000-0x000000003fffefff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x000000003ffff000-0x000000003fffffff] usable [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x0000000fdfffffff] usable [ 0.000000] BIOS-e820: [mem 0x0000010fe0000000-0x000001e1bfffffff] usable [ 0.000000] printk: bootconsole [earlyser0] enabled [ 0.000000] NX (Execute Disable) protection: active [ 0.000000] efi: EFI v2.70 by Microsoft [ 0.000000] efi: ACPI=0x3fffa000 ACPI 2.0=0x3fffa014 SMBIOS=0x3ffd8000 SMBIOS 3.0=0x3ffd6000 MEMATTR=0x3f9da018 MOKvar=0x3f29c000 [ 0.000000] secureboot: Secure boot disabled [ 0.000000] SMBIOS 3.1.0 present. 
[ 0.000000] DMI: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 10/27/2020 [ 0.000000] Hypervisor detected: Microsoft Hyper-V [ 0.000000] Hyper-V: features 0xae7f, hints 0x40c2c, misc 0x40bed7b6 [ 0.000000] Hyper-V Host Build:19645-10.0-1-0.1258 [ 0.000000] Hyper-V: LAPIC Timer Frequency: 0xc3500 [ 0.000000] Hyper-V: Using hypercall for remote TLB flush [ 0.000000] clocksource: hyperv_clocksource_tsc_page: mask: 0xffffffffffffffff max_cycles: 0x24e6a1710, max_idle_ns: 440795202120 ns [ 0.000002] tsc: Detected 2445.409 MHz processor [ 0.002046] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved [ 0.002048] e820: remove [mem 0x000a0000-0x000fffff] usable [ 0.002051] last_pfn = 0x1e1c0000 max_arch_pfn = 0x400000000 [ 0.004551] MTRR default type: uncachable [ 0.004552] MTRR fixed ranges enabled: [ 0.004553] 00000-9FFFF write-back [ 0.004554] A0000-FFFFF uncachable [ 0.004554] MTRR variable ranges enabled: [ 0.004555] 0 base 000000000000 mask FFFFC0000000 write-back [ 0.004556] 1 base 000100000000 mask FFF000000000 write-back [ 0.004556] 2 base 010FE0000000 mask F80000000000 write-back [ 0.004557] 3 base 080000000000 mask 000000000000 write-back [ 0.004557] 4 disabled [ 0.004558] 5 disabled [ 0.004558] 6 disabled [ 0.004558] 7 disabled [ 0.004569] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT [ 0.007594] e820: update [mem 0x40000000-0xffffffff] usable ==> reserved [ 0.007595] e820: update [mem 0x1100000000-0x10fdfffffff] usable ==> reserved [ 0.009064] last_pfn = 0x40000 max_arch_pfn = 0x400000000 [ 0.015401] e820: update [mem 0x3f29c000-0x3f29cfff] usable ==> reserved [ 0.015412] check: Scanning 1 areas for low memory corruption [ 0.017967] Using GB pages for direct mapping [ 0.020186] secureboot: Secure boot disabled [ 0.022087] RAMDISK: [mem 0x2d2ff000-0x2efe1fff] [ 0.024131] ACPI: Early table checksum verification disabled [ 0.026645] ACPI: RSDP 0x000000003FFFA014 000024 (v02 VRTUAL) [ 0.029159] ACPI: XSDT 
0x000000003FFF90E8 000064 (v01 VRTUAL MICROSFT 00000001 MSFT 00000001) [ 0.032909] ACPI: FACP 0x000000003FFF8000 000114 (v06 VRTUAL MICROSFT 00000001 MSFT 00000001) [ 0.036734] ACPI: DSDT 0x000000003EE7B000 01E184 (v02 MSFTVM DSDT01 00000001 MSFT 05000000) [ 0.040533] ACPI: FACS 0x000000003FFFE000 000040 [ 0.042648] ACPI: OEM0 0x000000003FFF7000 000064 (v01 VRTUAL MICROSFT 00000001 MSFT 00000001) [ 0.046392] ACPI: WAET 0x000000003FFF6000 000028 (v01 VRTUAL MICROSFT 00000001 MSFT 00000001) [ 0.050176] ACPI: APIC 0x000000003FFF5000 000348 (v04 VRTUAL MICROSFT 00000001 MSFT 00000001) [ 0.053960] ACPI: SRAT 0x000000003FFF4000 000EA0 (v02 VRTUAL MICROSFT 00000001 MSFT 00000001) [ 0.057750] ACPI: SLIT 0x000000003FFF3000 00003C (v01 VRTUAL MICROSFT 00000001 MSFT 00000001) [ 0.061544] ACPI: BGRT 0x000000003ED4B000 000038 (v01 VRTUAL MICROSFT 00000001 MSFT 00000001) [ 0.065354] ACPI: FPDT 0x000000003ED4A000 000034 (v01 VRTUAL MICROSFT 00000001 MSFT 00000001) [ 0.069125] ACPI: Reserving FACP table memory at [mem 0x3fff8000-0x3fff8113] [ 0.072205] ACPI: Reserving DSDT table memory at [mem 0x3ee7b000-0x3ee99183] [ 0.075263] ACPI: Reserving FACS table memory at [mem 0x3fffe000-0x3fffe03f] [ 0.078363] ACPI: Reserving OEM0 table memory at [mem 0x3fff7000-0x3fff7063] [ 0.081434] ACPI: Reserving WAET table memory at [mem 0x3fff6000-0x3fff6027] [ 0.084524] ACPI: Reserving APIC table memory at [mem 0x3fff5000-0x3fff5347] [ 0.087594] ACPI: Reserving SRAT table memory at [mem 0x3fff4000-0x3fff4e9f] [ 0.090673] ACPI: Reserving SLIT table memory at [mem 0x3fff3000-0x3fff303b] [ 0.093758] ACPI: Reserving BGRT table memory at [mem 0x3ed4b000-0x3ed4b037] [ 0.096859] ACPI: Reserving FPDT table memory at [mem 0x3ed4a000-0x3ed4a033] [ 0.099972] ACPI: Local APIC address 0xfee00000 [ 0.100015] SRAT: PXM 0 -> APIC 0x00 -> Node 0 [ 0.101969] SRAT: PXM 0 -> APIC 0x01 -> Node 0 [ 0.103891] SRAT: PXM 0 -> APIC 0x02 -> Node 0 [ 0.105825] SRAT: PXM 0 -> APIC 0x03 -> Node 0 [ 0.107769] SRAT: PXM 0 -> 
APIC 0x04 -> Node 0 [ 0.109746] SRAT: PXM 0 -> APIC 0x05 -> Node 0 [ 0.111725] SRAT: PXM 0 -> APIC 0x06 -> Node 0 [ 0.113699] SRAT: PXM 0 -> APIC 0x07 -> Node 0 [ 0.115665] SRAT: PXM 0 -> APIC 0x08 -> Node 0 [ 0.117629] SRAT: PXM 0 -> APIC 0x09 -> Node 0 [ 0.119585] SRAT: PXM 0 -> APIC 0x0a -> Node 0 [ 0.121530] SRAT: PXM 0 -> APIC 0x0b -> Node 0 [ 0.123486] SRAT: PXM 0 -> APIC 0x0c -> Node 0 [ 0.125495] SRAT: PXM 0 -> APIC 0x0d -> Node 0 [ 0.127460] SRAT: PXM 0 -> APIC 0x0e -> Node 0 [ 0.129419] SRAT: PXM 0 -> APIC 0x0f -> Node 0 [ 0.131369] SRAT: PXM 0 -> APIC 0x10 -> Node 0 [ 0.133325] SRAT: PXM 0 -> APIC 0x11 -> Node 0 [ 0.135269] SRAT: PXM 0 -> APIC 0x12 -> Node 0 [ 0.137209] SRAT: PXM 0 -> APIC 0x13 -> Node 0 [ 0.139166] SRAT: PXM 0 -> APIC 0x14 -> Node 0 [ 0.141159] SRAT: PXM 0 -> APIC 0x15 -> Node 0 [ 0.143118] SRAT: PXM 0 -> APIC 0x16 -> Node 0 [ 0.145085] SRAT: PXM 0 -> APIC 0x17 -> Node 0 [ 0.147041] SRAT: PXM 1 -> APIC 0x18 -> Node 1 [ 0.149004] SRAT: PXM 1 -> APIC 0x19 -> Node 1 [ 0.150954] SRAT: PXM 1 -> APIC 0x1a -> Node 1 [ 0.152925] SRAT: PXM 1 -> APIC 0x1b -> Node 1 [ 0.154908] SRAT: PXM 1 -> APIC 0x1c -> Node 1 [ 0.156885] SRAT: PXM 1 -> APIC 0x1d -> Node 1 [ 0.158852] SRAT: PXM 1 -> APIC 0x1e -> Node 1 [ 0.160822] SRAT: PXM 1 -> APIC 0x1f -> Node 1 [ 0.162796] SRAT: PXM 1 -> APIC 0x20 -> Node 1 [ 0.164751] SRAT: PXM 1 -> APIC 0x21 -> Node 1 [ 0.166701] SRAT: PXM 1 -> APIC 0x22 -> Node 1 [ 0.168653] SRAT: PXM 1 -> APIC 0x23 -> Node 1 [ 0.170600] SRAT: PXM 1 -> APIC 0x24 -> Node 1 [ 0.172592] SRAT: PXM 1 -> APIC 0x25 -> Node 1 [ 0.174532] SRAT: PXM 1 -> APIC 0x26 -> Node 1 [ 0.176473] SRAT: PXM 1 -> APIC 0x27 -> Node 1 [ 0.178420] SRAT: PXM 1 -> APIC 0x28 -> Node 1 [ 0.180347] SRAT: PXM 1 -> APIC 0x29 -> Node 1 [ 0.182310] SRAT: PXM 1 -> APIC 0x2a -> Node 1 [ 0.184256] SRAT: PXM 1 -> APIC 0x2b -> Node 1 [ 0.186221] SRAT: PXM 1 -> APIC 0x2c -> Node 1 [ 0.188279] SRAT: PXM 1 -> APIC 0x2d -> Node 1 [ 0.190242] SRAT: PXM 1 -> APIC 0x2e -> Node 1 [ 
0.192228] SRAT: PXM 1 -> APIC 0x2f -> Node 1 [ 0.194375] SRAT: PXM 2 -> APIC 0x40 -> Node 2 [ 0.196316] SRAT: PXM 2 -> APIC 0x41 -> Node 2 [ 0.198186] SRAT: PXM 2 -> APIC 0x42 -> Node 2 [ 0.200053] SRAT: PXM 2 -> APIC 0x43 -> Node 2 [ 0.201989] SRAT: PXM 2 -> APIC 0x44 -> Node 2 [ 0.203948] SRAT: PXM 2 -> APIC 0x45 -> Node 2 [ 0.205911] SRAT: PXM 2 -> APIC 0x46 -> Node 2 [ 0.207837] SRAT: PXM 2 -> APIC 0x47 -> Node 2 [ 0.209641] SRAT: PXM 2 -> APIC 0x48 -> Node 2 [ 0.211518] SRAT: PXM 2 -> APIC 0x49 -> Node 2 [ 0.213309] SRAT: PXM 2 -> APIC 0x4a -> Node 2 [ 0.215245] SRAT: PXM 2 -> APIC 0x4b -> Node 2 [ 0.217161] SRAT: PXM 2 -> APIC 0x4c -> Node 2 [ 0.219157] SRAT: PXM 2 -> APIC 0x4d -> Node 2 [ 0.221102] SRAT: PXM 2 -> APIC 0x4e -> Node 2 [ 0.223048] SRAT: PXM 2 -> APIC 0x4f -> Node 2 [ 0.224998] SRAT: PXM 2 -> APIC 0x50 -> Node 2 [ 0.226960] SRAT: PXM 2 -> APIC 0x51 -> Node 2 [ 0.228940] SRAT: PXM 2 -> APIC 0x52 -> Node 2 [ 0.230882] SRAT: PXM 2 -> APIC 0x53 -> Node 2 [ 0.232813] SRAT: PXM 2 -> APIC 0x54 -> Node 2 [ 0.234823] SRAT: PXM 2 -> APIC 0x55 -> Node 2 [ 0.236794] SRAT: PXM 2 -> APIC 0x56 -> Node 2 [ 0.238700] SRAT: PXM 2 -> APIC 0x57 -> Node 2 [ 0.240667] SRAT: PXM 3 -> APIC 0x58 -> Node 3 [ 0.242655] SRAT: PXM 3 -> APIC 0x59 -> Node 3 [ 0.244589] SRAT: PXM 3 -> APIC 0x5a -> Node 3 [ 0.246510] SRAT: PXM 3 -> APIC 0x5b -> Node 3 [ 0.248469] SRAT: PXM 3 -> APIC 0x5c -> Node 3 [ 0.250379] SRAT: PXM 3 -> APIC 0x5d -> Node 3 [ 0.252271] SRAT: PXM 3 -> APIC 0x5e -> Node 3 [ 0.254139] SRAT: PXM 3 -> APIC 0x5f -> Node 3 [ 0.256000] SRAT: PXM 3 -> APIC 0x60 -> Node 3 [ 0.257842] SRAT: PXM 3 -> APIC 0x61 -> Node 3 [ 0.259686] SRAT: PXM 3 -> APIC 0x62 -> Node 3 [ 0.261524] SRAT: PXM 3 -> APIC 0x63 -> Node 3 [ 0.263365] SRAT: PXM 3 -> APIC 0x64 -> Node 3 [ 0.265259] SRAT: PXM 3 -> APIC 0x65 -> Node 3 [ 0.267135] SRAT: PXM 3 -> APIC 0x66 -> Node 3 [ 0.269012] SRAT: PXM 3 -> APIC 0x67 -> Node 3 [ 0.270877] SRAT: PXM 3 -> APIC 0x68 -> Node 3 [ 0.272723] SRAT: PXM 3 -> 
APIC 0x69 -> Node 3 [ 0.274571] SRAT: PXM 3 -> APIC 0x6a -> Node 3 [ 0.276414] SRAT: PXM 3 -> APIC 0x6b -> Node 3 [ 0.278254] SRAT: PXM 3 -> APIC 0x6c -> Node 3 [ 0.280090] SRAT: PXM 3 -> APIC 0x6d -> Node 3 [ 0.282062] SRAT: PXM 3 -> APIC 0x6e -> Node 3 [ 0.283925] SRAT: PXM 3 -> APIC 0x6f -> Node 3 [ 0.285789] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x3fffffff] hotplug [ 0.288562] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0xfdfffffff] hotplug [ 0.291399] ACPI: SRAT: Node 0 PXM 0 [mem 0x10fe0000000-0x138ffffffff] hotplug [ 0.294376] ACPI: SRAT: Node 1 PXM 1 [mem 0x13900000000-0x1713fffffff] hotplug [ 0.297394] ACPI: SRAT: Node 2 PXM 2 [mem 0x17140000000-0x1a97fffffff] hotplug [ 0.300474] ACPI: SRAT: Node 3 PXM 3 [mem 0x1a980000000-0x1e1bfffffff] hotplug [ 0.303609] ACPI: SRAT: Node 0 PXM 0 [mem 0x1e1c0000000-0x1e94fffffff] hotplug [ 0.306662] ACPI: SRAT: Node 0 PXM 0 [mem 0x20000000000-0x27fffffffff] hotplug [ 0.309765] ACPI: SRAT: Node 0 PXM 0 [mem 0x40000000000-0x4ffffffffff] hotplug [ 0.312862] ACPI: SRAT: Node 0 PXM 0 [mem 0x80000000000-0x9ffffffffff] hotplug [ 0.315847] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000000-0x13ffffffffff] hotplug [ 0.319037] ACPI: SRAT: Node 0 PXM 0 [mem 0x200000000000-0x27ffffffffff] hotplug [ 0.322229] ACPI: SRAT: Node 0 PXM 0 [mem 0x400000000000-0x4fffffffffff] hotplug [ 0.325280] ACPI: SRAT: Node 0 PXM 0 [mem 0x800000000000-0x9fffffffffff] hotplug [ 0.328467] ACPI: SRAT: Node 0 PXM 0 [mem 0x1000000000000-0x13fffffffffff] hotplug [ 0.331625] ACPI: SRAT: Node 0 PXM 0 [mem 0x2000000000000-0x27fffffffffff] hotplug [ 0.334785] ACPI: SRAT: Node 0 PXM 0 [mem 0x4000000000000-0x4ffffffffffff] hotplug [ 0.337944] ACPI: SRAT: Node 0 PXM 0 [mem 0x8000000000000-0x9ffffffffffff] hotplug [ 0.341084] ACPI: SRAT: Node 1 PXM 1 [mem 0x1e950000000-0x1f0dfffffff] hotplug [ 0.344162] ACPI: SRAT: Node 1 PXM 1 [mem 0x28000000000-0x2ffffffffff] hotplug [ 0.347245] ACPI: SRAT: Node 1 PXM 1 [mem 0x50000000000-0x5ffffffffff] hotplug [ 0.350340] ACPI: SRAT: 
Node 1 PXM 1 [mem 0xa0000000000-0xbffffffffff] hotplug [ 0.353418] ACPI: SRAT: Node 1 PXM 1 [mem 0x140000000000-0x17ffffffffff] hotplug [ 0.356581] ACPI: SRAT: Node 1 PXM 1 [mem 0x280000000000-0x2fffffffffff] hotplug [ 0.359815] ACPI: SRAT: Node 1 PXM 1 [mem 0x500000000000-0x5fffffffffff] hotplug [ 0.363000] ACPI: SRAT: Node 1 PXM 1 [mem 0xa00000000000-0xbfffffffffff] hotplug [ 0.366172] ACPI: SRAT: Node 1 PXM 1 [mem 0x1400000000000-0x17fffffffffff] hotplug [ 0.369455] ACPI: SRAT: Node 1 PXM 1 [mem 0x2800000000000-0x2ffffffffffff] hotplug [ 0.372688] ACPI: SRAT: Node 1 PXM 1 [mem 0x5000000000000-0x5ffffffffffff] hotplug [ 0.376476] ACPI: SRAT: Node 1 PXM 1 [mem 0xa000000000000-0xbffffffffffff] hotplug [ 0.379785] ACPI: SRAT: Node 2 PXM 2 [mem 0x1f0e0000000-0x1f86fffffff] hotplug [ 0.382962] ACPI: SRAT: Node 2 PXM 2 [mem 0x30000000000-0x37fffffffff] hotplug [ 0.386103] ACPI: SRAT: Node 2 PXM 2 [mem 0x60000000000-0x6ffffffffff] hotplug [ 0.389271] ACPI: SRAT: Node 2 PXM 2 [mem 0xc0000000000-0xdffffffffff] hotplug [ 0.393043] ACPI: SRAT: Node 2 PXM 2 [mem 0x180000000000-0x1bffffffffff] hotplug [ 0.396343] ACPI: SRAT: Node 2 PXM 2 [mem 0x300000000000-0x37ffffffffff] hotplug [ 0.399586] ACPI: SRAT: Node 2 PXM 2 [mem 0x600000000000-0x6fffffffffff] hotplug [ 0.402774] ACPI: SRAT: Node 2 PXM 2 [mem 0xc00000000000-0xdfffffffffff] hotplug [ 0.405822] ACPI: SRAT: Node 2 PXM 2 [mem 0x1800000000000-0x1bfffffffffff] hotplug [ 0.409103] ACPI: SRAT: Node 2 PXM 2 [mem 0x3000000000000-0x37fffffffffff] hotplug [ 0.412327] ACPI: SRAT: Node 2 PXM 2 [mem 0x6000000000000-0x6ffffffffffff] hotplug [ 0.415579] ACPI: SRAT: Node 2 PXM 2 [mem 0xc000000000000-0xdffffffffffff] hotplug [ 0.418864] ACPI: SRAT: Node 3 PXM 3 [mem 0x1f870000000-0x1ffffffffff] hotplug [ 0.422020] ACPI: SRAT: Node 3 PXM 3 [mem 0x38000000000-0x3ffffffffff] hotplug [ 0.425217] ACPI: SRAT: Node 3 PXM 3 [mem 0x70000000000-0x7ffffffffff] hotplug [ 0.428400] ACPI: SRAT: Node 3 PXM 3 [mem 0xe0000000000-0xfffffffffff] hotplug 
[ 0.431575] ACPI: SRAT: Node 3 PXM 3 [mem 0x1c0000000000-0x1fffffffffff] hotplug [ 0.434822] ACPI: SRAT: Node 3 PXM 3 [mem 0x380000000000-0x3fffffffffff] hotplug [ 0.438185] ACPI: SRAT: Node 3 PXM 3 [mem 0x700000000000-0x7fffffffffff] hotplug [ 0.441431] ACPI: SRAT: Node 3 PXM 3 [mem 0xe00000000000-0xffffffffffff] hotplug [ 0.444680] ACPI: SRAT: Node 3 PXM 3 [mem 0x1c00000000000-0x1ffffffffffff] hotplug [ 0.447982] ACPI: SRAT: Node 3 PXM 3 [mem 0x3800000000000-0x3ffffffffffff] hotplug [ 0.451305] ACPI: SRAT: Node 3 PXM 3 [mem 0x7000000000000-0x7ffffffffffff] hotplug [ 0.454701] ACPI: SRAT: Node 3 PXM 3 [mem 0xe000000000000-0xfffffffffffff] hotplug [ 0.458067] NUMA: Initialized distance table, cnt=4 [ 0.458071] NUMA: Node 0 [mem 0x00000000-0x3fffffff] + [mem 0x100000000-0xfdfffffff] -> [mem 0x00000000-0xfdfffffff] [ 0.462670] NUMA: Node 0 [mem 0x00000000-0xfdfffffff] + [mem 0x10fe0000000-0x138ffffffff] -> [mem 0x00000000-0x138ffffffff] [ 0.467504] NODE_DATA(0) allocated [mem 0x138fffd5000-0x138ffffffff] [ 0.470396] NODE_DATA(1) allocated [mem 0x1713ffd5000-0x1713fffffff] [ 0.473247] NODE_DATA(2) allocated [mem 0x1a97ffd5000-0x1a97fffffff] [ 0.476093] NODE_DATA(3) allocated [mem 0x1e1bffd2000-0x1e1bfffcfff] [ 0.479933] Zone ranges: [ 0.481102] DMA [mem 0x0000000000001000-0x0000000000ffffff] [ 0.483894] DMA32 [mem 0x0000000001000000-0x00000000ffffffff] [ 0.486837] Normal [mem 0x0000000100000000-0x000001e1bfffffff] [ 0.489595] Device empty [ 0.490924] Movable zone start for each node [ 0.492825] Early memory node ranges [ 0.494428] node 0: [mem 0x0000000000001000-0x000000000009ffff] [ 0.497196] node 0: [mem 0x0000000000100000-0x000000003ed49fff] [ 0.500018] node 0: [mem 0x000000003ed4c000-0x000000003ee7afff] [ 0.502815] node 0: [mem 0x000000003ee9a000-0x000000003eef1fff] [ 0.505612] node 0: [mem 0x000000003ef1b000-0x000000003ff9afff] [ 0.508422] node 0: [mem 0x000000003ffff000-0x000000003fffffff] [ 0.511224] node 0: [mem 0x0000000100000000-0x0000000fdfffffff] [ 
0.514015] node 0: [mem 0x0000010fe0000000-0x00000138ffffffff] [ 0.516851] node 1: [mem 0x0000013900000000-0x000001713fffffff] [ 0.519462] node 2: [mem 0x0000017140000000-0x000001a97fffffff] [ 0.522123] node 3: [mem 0x000001a980000000-0x000001e1bfffffff] [ 0.526518] Zeroed struct page in unavailable ranges: 271 pages [ 0.526520] Initmem setup node 0 [mem 0x0000000000001000-0x00000138ffffffff] [ 0.532041] On node 0 totalpages: 58982129 [ 0.532042] DMA zone: 64 pages used for memmap [ 0.532043] DMA zone: 1346 pages reserved [ 0.532043] DMA zone: 3999 pages, LIFO batch:0 [ 0.532087] DMA32 zone: 4030 pages used for memmap [ 0.532087] DMA32 zone: 257874 pages, LIFO batch:63 [ 0.537684] Normal zone: 917504 pages used for memmap [ 0.537685] Normal zone: 58720256 pages, LIFO batch:63 [ 1.988201] Initmem setup node 1 [mem 0x0000013900000000-0x000001713fffffff] [ 1.991592] On node 1 totalpages: 58982400 [ 1.991593] Normal zone: 921600 pages used for memmap [ 1.991593] Normal zone: 58982400 pages, LIFO batch:63 [ 2.618734] Initmem setup node 2 [mem 0x0000017140000000-0x000001a97fffffff] [ 2.621872] On node 2 totalpages: 58982400 [ 2.621873] Normal zone: 921600 pages used for memmap [ 2.621873] Normal zone: 58982400 pages, LIFO batch:63 [ 3.285564] Initmem setup node 3 [mem 0x000001a980000000-0x000001e1bfffffff] [ 3.288921] On node 3 totalpages: 58982400 [ 3.288922] Normal zone: 921600 pages used for memmap [ 3.288923] Normal zone: 58982400 pages, LIFO batch:63 [ 3.951148] ACPI: PM-Timer IO Port: 0x408 [ 3.953012] ACPI: Local APIC address 0xfee00000 [ 3.953024] ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1]) [ 3.955816] IOAPIC[0]: apic_id 96, version 17, address 0xfec00000, GSI 0-23 [ 3.958564] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) [ 3.961452] ACPI: IRQ9 used by override. 
[ 3.961454] Using ACPI (MADT) for SMP configuration information [ 3.963823] smpboot: Allowing 96 CPUs, 0 hotplug CPUs [ 3.965884] PM: Registered nosave memory: [mem 0x00000000-0x00000fff] [ 3.968676] PM: Registered nosave memory: [mem 0x000a0000-0x000bffff] [ 3.971414] PM: Registered nosave memory: [mem 0x000c0000-0x000fffff] [ 3.974257] PM: Registered nosave memory: [mem 0x3ed4a000-0x3ed4bfff] [ 3.977061] PM: Registered nosave memory: [mem 0x3ee7b000-0x3ee99fff] [ 3.979722] PM: Registered nosave memory: [mem 0x3eef2000-0x3ef1afff] [ 3.982251] PM: Registered nosave memory: [mem 0x3f29c000-0x3f29cfff] [ 3.984833] PM: Registered nosave memory: [mem 0x3ff9b000-0x3fff2fff] [ 3.987393] PM: Registered nosave memory: [mem 0x3fff3000-0x3fffafff] [ 3.989949] PM: Registered nosave memory: [mem 0x3fffb000-0x3fffefff] [ 3.992518] PM: Registered nosave memory: [mem 0x40000000-0xffffffff] [ 3.995082] PM: Registered nosave memory: [mem 0xfe0000000-0x10fdfffffff] [ 3.997849] [mem 0x40000000-0xffffffff] available for PCI devices [ 4.000422] Booting paravirtualized kernel on Hyper-V [ 4.002622] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns [ 4.006920] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:96 nr_cpu_ids:96 nr_node_ids:4 [ 4.015274] percpu: Embedded 58 pages/cpu s200704 r8192 d28672 u262144 [ 4.018151] pcpu-alloc: s200704 r8192 d28672 u262144 alloc=1*2097152 [ 4.018152] pcpu-alloc: [0] 00 01 02 03 04 05 06 07 [0] 08 09 10 11 12 13 14 15 [ 4.018155] pcpu-alloc: [0] 16 17 18 19 20 21 22 23 [1] 24 25 26 27 28 29 30 31 [ 4.018158] pcpu-alloc: [1] 32 33 34 35 36 37 38 39 [1] 40 41 42 43 44 45 46 47 [ 4.018160] pcpu-alloc: [2] 48 49 50 51 52 53 54 55 [2] 56 57 58 59 60 61 62 63 [ 4.018163] pcpu-alloc: [2] 64 65 66 67 68 69 70 71 [3] 72 73 74 75 76 77 78 79 [ 4.018165] pcpu-alloc: [3] 80 81 82 83 84 85 86 87 [3] 88 89 90 91 92 93 94 95 [ 4.018205] Hyper-V: PV spinlocks enabled [ 4.019968] PV qspinlock hash table entries: 256 
(order: 0, 4096 bytes, linear) [ 4.023141] Built 4 zonelists, mobility grouping on. Total pages: 232241585 [ 4.026191] Policy zone: Normal [ 4.027599] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1072-azure root=UUID=175da400-d3a9-4b28-bf4f-85f7ca8784f5 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 [ 4.034007] printk: log_buf_len individual max cpu contribution: 4096 bytes [ 4.036970] printk: log_buf_len total cpu_extra contributions: 389120 bytes [ 4.039980] printk: log_buf_len min size: 262144 bytes [ 4.042363] printk: log_buf_len: 1048576 bytes [ 4.044302] printk: early log buf free: 239720(91%) [ 4.046916] mem auto-init: stack:off, heap alloc:on, heap free:off [ 4.054768] Calgary: detecting Calgary via BIOS EBDA area [ 4.054769] Calgary: Unable to locate Rio Grande table in EBDA - bailing! [ 5.723208] Memory: 928801648K/943717316K available (14340K kernel code, 2313K rwdata, 8832K rodata, 2496K init, 5296K bss, 14915668K reserved, 0K cma-reserved) [ 5.729548] random: get_random_u64 called from __kmem_cache_create+0x41/0x560 with crng_init=0 [ 5.730093] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=96, Nodes=4 [ 5.736600] ftrace: allocating 42078 entries in 165 pages [ 5.752494] rcu: Hierarchical RCU implementation. [ 5.754510] rcu: RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=96. [ 5.757396] Tasks RCU enabled. [ 5.758731] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies. [ 5.761917] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=96 [ 5.767060] Using NULL legacy PIC [ 5.768573] NR_IRQS: 524544, nr_irqs: 1192, preallocated irqs: 0 [ 5.771699] random: crng done (trusting CPU's manufacturer) [ 5.774178] Console: colour dummy device 80x25 [ 5.776428] printk: console [tty1] enabled [ 5.778333] printk: console [ttyS0] enabled [ 5.781929] printk: bootconsole [earlyser0] disabled [ 5.786385] mempolicy: Enabling automatic NUMA balancing. 
Configure with numa_balancing= or the kernel.numa_balancing sysctl [ 5.791352] ACPI: Core revision 20190816 [ 5.793391] Failed to register legacy timer interrupt [ 5.795636] APIC: Switch to symmetric I/O mode setup [ 5.797993] Switched APIC routing to physical flat. [ 5.800363] Hyper-V: Using IPI hypercalls [ 5.802025] Hyper-V: Using enlightened APIC (xapic mode) [ 5.802258] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x233fc7f3f21, max_idle_ns: 440795317890 ns [ 5.808815] Calibrating delay loop (skipped), value calculated using timer frequency.. 4890.81 BogoMIPS (lpj=9781636) [ 5.812815] pid_max: default: 98304 minimum: 768 [ 5.815589] LSM: Security Framework initializing [ 5.816832] Yama: becoming mindful. [ 5.818712] AppArmor: AppArmor initialized [ 5.853559] Dentry cache hash table entries: 33554432 (order: 16, 268435456 bytes, vmalloc) [ 5.873230] Inode-cache hash table entries: 16777216 (order: 15, 134217728 bytes, vmalloc) [ 5.877111] Mount-cache hash table entries: 524288 (order: 10, 4194304 bytes, vmalloc) [ 5.880846] Mountpoint-cache hash table entries: 524288 (order: 10, 4194304 bytes, vmalloc) [ 5.884294] *** VALIDATE tmpfs *** [ 5.885046] *** VALIDATE proc *** [ 5.886645] *** VALIDATE cgroup1 *** [ 5.888265] *** VALIDATE cgroup2 *** [ 5.890438] x86/cpu: User Mode Instruction Prevention (UMIP) activated [ 5.892843] Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 512 [ 5.895475] Last level dTLB entries: 4KB 2048, 2MB 2048, 4MB 1024, 1GB 0 [ 5.896819] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization [ 5.900282] Spectre V2 : Mitigation: LFENCE [ 5.900815] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch [ 5.904101] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl and seccomp [ 5.905044] Freeing SMP alternatives memory: 40K [ 5.907883] smpboot: CPU0: AMD EPYC 7V12 64-Core Processor (family: 0x17, model: 0x31, stepping: 0x0) [ 5.909073] 
Performance Events: Fam17h+ core perfctr, AMD PMU driver. [ 5.911837] ... version: 0 [ 5.912816] ... bit width: 48 [ 5.914703] ... generic registers: 6 [ 5.916458] ... value mask: 0000ffffffffffff [ 5.916815] ... max period: 00007fffffffffff [ 5.918974] ... fixed-purpose events: 0 [ 5.920722] ... event mask: 000000000000003f [ 5.920931] rcu: Hierarchical SRCU implementation. [ 5.923354] Decoding supported only on Scalable MCA processors. [ 5.924840] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter. [ 5.928259] smp: Bringing up secondary CPUs ... [ 5.928905] x86: Booting SMP configuration: [ 5.930627] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23 [ 5.944817] .... node #1, CPUs: #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35 #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47 [ 5.960817] .... node #2, CPUs: #48 #49 #50 #51 #52 #53 #54 #55 #56 #57 #58 #59 #60 #61 #62 #63 #64 #65 #66 #67 #68 #69 #70 #71 [ 6.060819] .... node #3, CPUs: #72 #73 #74 #75 #76 #77 #78 #79 #80 #81 #82 #83 #84 #85 #86 #87 #88 #89 #90 #91 #92 #93 #94 #95 [ 6.077322] smp: Brought up 4 nodes, 96 CPUs [ 6.082595] smpboot: Max logical packages: 2 [ 6.084378] smpboot: Total of 96 processors activated (469764.55 BogoMIPS) [ 6.131954] devtmpfs: initialized [ 6.132860] x86/mm: Memory block size: 1024MB [ 6.150595] PM: Registering ACPI NVS region [mem 0x3fffb000-0x3fffefff] (16384 bytes) [ 6.152919] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns [ 6.157052] futex hash table entries: 32768 (order: 9, 2097152 bytes, vmalloc) [ 6.160307] pinctrl core: initialized pinctrl subsystem [ 6.161118] PM: RTC time: 16:53:01, date: 2022-04-05 [ 6.163979] NET: Registered protocol family 16 [ 6.164889] audit: initializing netlink subsys (disabled) [ 6.167305] audit: type=2000 audit(1649177581.364:1): state=initialized audit_enabled=0 res=1 [ 6.167305] EISA bus registered [ 6.170199] cpuidle: using 
governor ladder [ 6.170529] cpuidle: using governor menu [ 6.172946] ACPI: bus type PCI registered [ 6.176817] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5 [ 6.218145] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages [ 6.220823] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages [ 6.224976] ACPI: Added _OSI(Module Device) [ 6.226846] ACPI: Added _OSI(Processor Device) [ 6.228817] ACPI: Added _OSI(3.0 _SCP Extensions) [ 6.230877] ACPI: Added _OSI(Processor Aggregator Device) [ 6.232816] ACPI: Added _OSI(Linux-Dell-Video) [ 6.234609] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio) [ 6.236790] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics) [ 6.242177] ACPI: 1 ACPI AML tables successfully acquired and loaded [ 6.246528] ACPI: Interpreter enabled [ 6.248010] ACPI: (supports S0 S5) [ 6.248817] ACPI: Using IOAPIC for interrupt routing [ 6.250847] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug [ 6.253343] ACPI: Enabled 1 GPEs in block 00 to 0F [ 6.258502] iommu: Default domain type: Translated [ 6.258913] SCSI subsystem initialized [ 6.260885] libata version 3.00 loaded. [ 6.260885] vgaarb: loaded [ 6.262037] ACPI: bus type USB registered [ 6.263718] usbcore: registered new interface driver usbfs [ 6.264823] usbcore: registered new interface driver hub [ 6.267054] usbcore: registered new device driver usb [ 6.268828] pps_core: LinuxPPS API ver. 1 registered [ 6.270930] pps_core: Software ver. 
5.3.6 - Copyright 2005-2007 Rodolfo Giometti [ 6.272819] PTP clock support registered [ 6.274528] EDAC MC: Ver: 3.0.0 [ 6.274528] Registered efivars operations [ 6.281038] hv_vmbus: Vmbus version:5.0 [ 6.284864] PCI: Using ACPI for IRQ routing [ 6.285164] PCI: System does not support PCI [ 6.286852] NetLabel: Initializing [ 6.288291] NetLabel: domain hash size = 128 [ 6.288817] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO [ 6.291320] NetLabel: unlabeled traffic allowed by default [ 6.297288] clocksource: Switched to clocksource tsc-early [ 6.310449] *** VALIDATE bpf *** [ 6.312077] VFS: Disk quotas dquot_6.6.0 [ 6.313914] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes) [ 6.316767] *** VALIDATE ramfs *** [ 6.318479] *** VALIDATE hugetlbfs *** [ 6.320189] AppArmor: AppArmor Filesystem Enabled [ 6.322361] pnp: PnP ACPI init [ 6.323667] pnp 00:00: Plug and Play ACPI device, IDs PNP0501 (active) [ 6.323679] pnp 00:01: Plug and Play ACPI device, IDs PNP0501 (active) [ 6.323692] pnp 00:02: Plug and Play ACPI device, IDs PNP0b00 (active) [ 6.323966] pnp: PnP ACPI: found 3 devices [ 6.328274] thermal_sys: Registered thermal governor 'fair_share' [ 6.328275] thermal_sys: Registered thermal governor 'bang_bang' [ 6.330939] thermal_sys: Registered thermal governor 'step_wise' [ 6.333511] thermal_sys: Registered thermal governor 'user_space' [ 6.335933] thermal_sys: Registered thermal governor 'power_allocator' [ 6.343061] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns [ 6.349903] NET: Registered protocol family 2 [ 6.351929] IP idents hash table entries: 262144 (order: 9, 2097152 bytes, vmalloc) [ 6.356734] tcp_listen_portaddr_hash hash table entries: 65536 (order: 8, 1048576 bytes, vmalloc) [ 6.361197] TCP established hash table entries: 524288 (order: 10, 4194304 bytes, vmalloc) [ 6.365162] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes, vmalloc) [ 6.368326] TCP: Hash tables configured (established 524288 
bind 65536) [ 6.371596] UDP hash table entries: 65536 (order: 9, 2097152 bytes, vmalloc) [ 6.375552] UDP-Lite hash table entries: 65536 (order: 9, 2097152 bytes, vmalloc) [ 6.379198] NET: Registered protocol family 1 [ 6.381070] NET: Registered protocol family 44 [ 6.382871] PCI: CLS 0 bytes, default 64 [ 6.384519] Trying to unpack rootfs image as initramfs... [ 6.765778] Freeing initrd memory: 29580K [ 6.767568] PCI-DMA: Using software bounce buffering for IO (SWIOTLB) [ 6.770433] software IO TLB: mapped [mem 0x3a822000-0x3e822000] (64MB) [ 6.778430] check: Scanning for low memory corruption every 60 seconds [ 6.782403] Initialise system trusted keyrings [ 6.784220] Key type blacklist registered [ 6.786248] workingset: timestamp_bits=36 max_order=28 bucket_order=0 [ 6.789746] zbud: loaded [ 6.791295] squashfs: version 4.0 (2009/01/31) Phillip Lougher [ 6.793942] fuse: init (API version 7.31) [ 6.795654] *** VALIDATE fuse *** [ 6.797045] *** VALIDATE fuse *** [ 6.798550] Platform Keyring initialized [ 6.801624] Key type asymmetric registered [ 6.803310] Asymmetric key parser 'x509' registered [ 6.805362] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 244) [ 6.808495] io scheduler mq-deadline registered [ 6.813259] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 [ 6.816027] hv_vmbus: registering driver hv_pci [ 6.818308] hv_pci 0000000b-0101-0001-3135-423331303142: PCI VMBus probing: Using version 0x10003 [ 6.824920] hv_pci 0000000b-0101-0001-3135-423331303142: PCI host bridge to bus 0101:00 [ 6.828190] pci_bus 0101:00: root bus resource [mem 0xfe0000000-0xfe1ffffff window] [ 6.832341] pci 0101:00:00.0: [15b3:101c] type 00 class 0x020700 [ 6.953999] pci 0101:00:00.0: reg 0x10: [mem 0xfe0000000-0xfe1ffffff 64bit pref] [ 7.500540] pci 0101:00:00.0: enabling Extended Tags [ 7.509239] pci 0101:00:00.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 0101:00:00.0 (capable of 252.048 Gb/s with 16 GT/s x16 link) [ 
7.521519] pci 0101:00:00.0: BAR 0: assigned [mem 0xfe0000000-0xfe1ffffff 64bit pref] [ 7.641680] hv_pci 00000015-0102-0001-3135-423331303142: PCI VMBus probing: Using version 0x10003 [ 7.648301] hv_pci 00000015-0102-0001-3135-423331303142: PCI host bridge to bus 0102:00 [ 7.651678] pci_bus 0102:00: root bus resource [mem 0xfe2000000-0xfe3ffffff window] [ 7.655729] pci 0102:00:00.0: [15b3:101c] type 00 class 0x020700 [ 7.766421] pci 0102:00:00.0: reg 0x10: [mem 0xfe2000000-0xfe3ffffff 64bit pref] [ 7.780841] tsc: Refined TSC clocksource calibration: 2445.408 MHz [ 7.783696] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x233fc6bd520, max_idle_ns: 440795230742 ns [ 8.312996] pci 0102:00:00.0: enabling Extended Tags [ 8.317725] clocksource: Switched to clocksource tsc [ 8.321825] pci 0102:00:00.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 0102:00:00.0 (capable of 252.048 Gb/s with 16 GT/s x16 link) [ 8.334046] pci 0102:00:00.0: BAR 0: assigned [mem 0xfe2000000-0xfe3ffffff 64bit pref] [ 8.453993] hv_pci 0000004b-0103-0000-3135-423331303142: PCI VMBus probing: Using version 0x10003 [ 8.460846] hv_pci 0000004b-0103-0000-3135-423331303142: PCI host bridge to bus 0103:00 [ 8.464190] pci_bus 0103:00: root bus resource [mem 0xfe4000000-0xfe5ffffff window] [ 8.468589] pci 0103:00:00.0: [15b3:101c] type 00 class 0x020700 [ 8.579233] pci 0103:00:00.0: reg 0x10: [mem 0xfe4000000-0xfe5ffffff 64bit pref] [ 9.126044] pci 0103:00:00.0: enabling Extended Tags [ 9.135695] pci 0103:00:00.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 0103:00:00.0 (capable of 252.048 Gb/s with 16 GT/s x16 link) [ 9.147868] pci 0103:00:00.0: BAR 0: assigned [mem 0xfe4000000-0xfe5ffffff 64bit pref] [ 9.266922] hv_pci 00000055-0104-0000-3135-423331303142: PCI VMBus probing: Using version 0x10003 [ 9.273509] hv_pci 00000055-0104-0000-3135-423331303142: PCI host bridge to bus 0104:00 [ 9.276660] pci_bus 0104:00: root bus resource [mem 
0xfe6000000-0xfe7ffffff window] [ 9.280890] pci 0104:00:00.0: [15b3:101c] type 00 class 0x020700 [ 9.392421] pci 0104:00:00.0: reg 0x10: [mem 0xfe6000000-0xfe7ffffff 64bit pref] [ 9.938247] pci 0104:00:00.0: enabling Extended Tags [ 9.948012] pci 0104:00:00.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 0104:00:00.0 (capable of 252.048 Gb/s with 16 GT/s x16 link) [ 9.959854] pci 0104:00:00.0: BAR 0: assigned [mem 0xfe6000000-0xfe7ffffff 64bit pref] [ 10.079083] hv_pci 0000008b-0105-0003-3135-423331303142: PCI VMBus probing: Using version 0x10003 [ 10.085734] hv_pci 0000008b-0105-0003-3135-423331303142: PCI host bridge to bus 0105:00 [ 10.088969] pci_bus 0105:00: root bus resource [mem 0xfe8000000-0xfe9ffffff window] [ 10.092822] pci 0105:00:00.0: [15b3:101c] type 00 class 0x020700 [ 10.204066] pci 0105:00:00.0: reg 0x10: [mem 0xfe8000000-0xfe9ffffff 64bit pref] [ 10.750765] pci 0105:00:00.0: enabling Extended Tags [ 10.760198] pci 0105:00:00.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 0105:00:00.0 (capable of 252.048 Gb/s with 16 GT/s x16 link) [ 10.772332] pci 0105:00:00.0: BAR 0: assigned [mem 0xfe8000000-0xfe9ffffff 64bit pref] [ 10.891857] hv_pci 00000095-0106-0003-3135-423331303142: PCI VMBus probing: Using version 0x10003 [ 10.898312] hv_pci 00000095-0106-0003-3135-423331303142: PCI host bridge to bus 0106:00 [ 10.901685] pci_bus 0106:00: root bus resource [mem 0xfea000000-0xfebffffff window] [ 10.905866] pci 0106:00:00.0: [15b3:101c] type 00 class 0x020700 [ 11.016559] pci 0106:00:00.0: reg 0x10: [mem 0xfea000000-0xfebffffff 64bit pref] [ 11.563545] pci 0106:00:00.0: enabling Extended Tags [ 11.573683] pci 0106:00:00.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 0106:00:00.0 (capable of 252.048 Gb/s with 16 GT/s x16 link) [ 11.585665] pci 0106:00:00.0: BAR 0: assigned [mem 0xfea000000-0xfebffffff 64bit pref] [ 11.704250] hv_pci 000000cb-0107-0002-3135-423331303142: PCI 
VMBus probing: Using version 0x10003 [ 11.711040] hv_pci 000000cb-0107-0002-3135-423331303142: PCI host bridge to bus 0107:00 [ 11.714483] pci_bus 0107:00: root bus resource [mem 0xfec000000-0xfedffffff window] [ 11.718725] pci 0107:00:00.0: [15b3:101c] type 00 class 0x020700 [ 11.829201] pci 0107:00:00.0: reg 0x10: [mem 0xfec000000-0xfedffffff 64bit pref] [ 12.375649] pci 0107:00:00.0: enabling Extended Tags [ 12.384991] pci 0107:00:00.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 0107:00:00.0 (capable of 252.048 Gb/s with 16 GT/s x16 link) [ 12.396898] pci 0107:00:00.0: BAR 0: assigned [mem 0xfec000000-0xfedffffff 64bit pref] [ 12.516732] hv_pci 000000d5-0108-0002-3135-423331303142: PCI VMBus probing: Using version 0x10003 [ 12.523541] hv_pci 000000d5-0108-0002-3135-423331303142: PCI host bridge to bus 0108:00 [ 12.526997] pci_bus 0108:00: root bus resource [mem 0xfee000000-0xfefffffff window] [ 12.531047] pci 0108:00:00.0: [15b3:101c] type 00 class 0x020700 [ 12.641637] pci 0108:00:00.0: reg 0x10: [mem 0xfee000000-0xfefffffff 64bit pref] [ 13.188206] pci 0108:00:00.0: enabling Extended Tags [ 13.197610] pci 0108:00:00.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 0108:00:00.0 (capable of 252.048 Gb/s with 16 GT/s x16 link) [ 13.209731] pci 0108:00:00.0: BAR 0: assigned [mem 0xfee000000-0xfefffffff 64bit pref] [ 13.329055] hv_pci 0000000e-0001-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 14.197031] hv_pci 0000000e-0001-0000-3130-444500000000: PCI host bridge to bus 0001:00 [ 14.200494] pci_bus 0001:00: root bus resource [mem 0x41000000-0x41ffffff window] [ 14.203883] pci_bus 0001:00: root bus resource [mem 0x1000000000-0x2001ffffff window] [ 14.208081] pci 0001:00:00.0: [10de:20b0] type 00 class 0x030200 [ 15.129620] pci 0001:00:00.0: reg 0x10: [mem 0x41000000-0x41ffffff] [ 16.050999] pci 0001:00:00.0: reg 0x14: [mem 0x1000000000-0x1fffffffff 64bit pref] [ 16.973496] pci 
0001:00:00.0: reg 0x1c: [mem 0x2000000000-0x2001ffffff 64bit pref] [ 18.812165] pci 0001:00:00.0: Enabling HDA controller [ 18.820657] pci 0001:00:00.0: PME# supported from D0 D3hot [ 18.832475] pci 0001:00:00.0: BAR 1: assigned [mem 0x1000000000-0x1fffffffff 64bit pref] [ 19.772652] pci 0001:00:00.0: BAR 3: assigned [mem 0x2000000000-0x2001ffffff 64bit pref] [ 20.696595] pci 0001:00:00.0: BAR 0: assigned [mem 0x41000000-0x41ffffff] [ 20.700598] hv_pci 00000013-0002-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 21.549853] hv_pci 00000013-0002-0000-3130-444500000000: PCI host bridge to bus 0002:00 [ 21.553343] pci_bus 0002:00: root bus resource [mem 0x42000000-0x42ffffff window] [ 21.556362] pci_bus 0002:00: root bus resource [mem 0x3000000000-0x4001ffffff window] [ 21.560483] pci 0002:00:00.0: [10de:20b0] type 00 class 0x030200 [ 22.489689] pci 0002:00:00.0: reg 0x10: [mem 0x42000000-0x42ffffff] [ 23.412661] pci 0002:00:00.0: reg 0x14: [mem 0x3000000000-0x3fffffffff 64bit pref] [ 24.326129] pci 0002:00:00.0: reg 0x1c: [mem 0x4000000000-0x4001ffffff 64bit pref] [ 26.166067] pci 0002:00:00.0: Enabling HDA controller [ 26.174329] pci 0002:00:00.0: PME# supported from D0 D3hot [ 26.185904] pci 0002:00:00.0: BAR 1: assigned [mem 0x3000000000-0x3fffffffff 64bit pref] [ 27.109569] pci 0002:00:00.0: BAR 3: assigned [mem 0x4000000000-0x4001ffffff 64bit pref] [ 28.038306] pci 0002:00:00.0: BAR 0: assigned [mem 0x42000000-0x42ffffff] [ 28.042304] hv_pci 0000004e-0003-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 28.906099] hv_pci 0000004e-0003-0000-3130-444500000000: PCI host bridge to bus 0003:00 [ 28.909634] pci_bus 0003:00: root bus resource [mem 0x43000000-0x43ffffff window] [ 28.912806] pci_bus 0003:00: root bus resource [mem 0x5000000000-0x6001ffffff window] [ 28.916998] pci 0003:00:00.0: [10de:20b0] type 00 class 0x030200 [ 29.845993] pci 0003:00:00.0: reg 0x10: [mem 0x43000000-0x43ffffff] [ 30.773094] pci 0003:00:00.0: reg 
0x14: [mem 0x5000000000-0x5fffffffff 64bit pref] [ 31.692941] pci 0003:00:00.0: reg 0x1c: [mem 0x6000000000-0x6001ffffff 64bit pref] [ 33.544490] pci 0003:00:00.0: Enabling HDA controller [ 33.553152] pci 0003:00:00.0: PME# supported from D0 D3hot [ 33.564873] pci 0003:00:00.0: BAR 1: assigned [mem 0x5000000000-0x5fffffffff 64bit pref] [ 34.492553] pci 0003:00:00.0: BAR 3: assigned [mem 0x6000000000-0x6001ffffff 64bit pref] [ 35.419402] pci 0003:00:00.0: BAR 0: assigned [mem 0x43000000-0x43ffffff] [ 35.423776] hv_pci 00000053-0004-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 36.292066] hv_pci 00000053-0004-0000-3130-444500000000: PCI host bridge to bus 0004:00 [ 36.295717] pci_bus 0004:00: root bus resource [mem 0x44000000-0x44ffffff window] [ 36.298950] pci_bus 0004:00: root bus resource [mem 0x7000000000-0x8001ffffff window] [ 36.303072] pci 0004:00:00.0: [10de:20b0] type 00 class 0x030200 [ 37.236192] pci 0004:00:00.0: reg 0x10: [mem 0x44000000-0x44ffffff] [ 38.168025] pci 0004:00:00.0: reg 0x14: [mem 0x7000000000-0x7fffffffff 64bit pref] [ 39.093449] pci 0004:00:00.0: reg 0x1c: [mem 0x8000000000-0x8001ffffff 64bit pref] [ 40.955520] pci 0004:00:00.0: Enabling HDA controller [ 40.963876] pci 0004:00:00.0: PME# supported from D0 D3hot [ 40.975790] pci 0004:00:00.0: BAR 1: assigned [mem 0x7000000000-0x7fffffffff 64bit pref] [ 41.909341] pci 0004:00:00.0: BAR 3: assigned [mem 0x8000000000-0x8001ffffff 64bit pref] [ 42.842594] pci 0004:00:00.0: BAR 0: assigned [mem 0x44000000-0x44ffffff] [ 42.846883] hv_pci 00000065-0005-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 42.893903] hv_pci 00000065-0005-0000-3130-444500000000: PCI host bridge to bus 0005:00 [ 42.897358] pci_bus 0005:00: root bus resource [mem 0x46000000-0x47ffffff window] [ 42.901342] pci 0005:00:00.0: [10de:1af1] type 00 class 0x068000 [ 42.910917] pci 0005:00:00.0: reg 0x10: [mem 0x46000000-0x47ffffff] [ 42.952279] pci 0005:00:00.0: PME# supported from D0 D3hot 
[ 42.963853] pci 0005:00:00.0: BAR 0: assigned [mem 0x46000000-0x47ffffff] [ 42.967784] hv_pci 00000066-0006-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 43.018925] hv_pci 00000066-0006-0000-3130-444500000000: PCI host bridge to bus 0006:00 [ 43.022440] pci_bus 0006:00: root bus resource [mem 0x48000000-0x49ffffff window] [ 43.026381] pci 0006:00:00.0: [10de:1af1] type 00 class 0x068000 [ 43.035971] pci 0006:00:00.0: reg 0x10: [mem 0x48000000-0x49ffffff] [ 43.078569] pci 0006:00:00.0: PME# supported from D0 D3hot [ 43.090009] pci 0006:00:00.0: BAR 0: assigned [mem 0x48000000-0x49ffffff] [ 43.093752] hv_pci 00000067-0007-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 43.143617] hv_pci 00000067-0007-0000-3130-444500000000: PCI host bridge to bus 0007:00 [ 43.147260] pci_bus 0007:00: root bus resource [mem 0x4a000000-0x4bffffff window] [ 43.151249] pci 0007:00:00.0: [10de:1af1] type 00 class 0x068000 [ 43.160969] pci 0007:00:00.0: reg 0x10: [mem 0x4a000000-0x4bffffff] [ 43.202375] pci 0007:00:00.0: PME# supported from D0 D3hot [ 43.213839] pci 0007:00:00.0: BAR 0: assigned [mem 0x4a000000-0x4bffffff] [ 43.217493] hv_pci 00000068-0008-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 43.253026] hv_pci 00000068-0008-0000-3130-444500000000: PCI host bridge to bus 0008:00 [ 43.256449] pci_bus 0008:00: root bus resource [mem 0x4c000000-0x4dffffff window] [ 43.260474] pci 0008:00:00.0: [10de:1af1] type 00 class 0x068000 [ 43.270209] pci 0008:00:00.0: reg 0x10: [mem 0x4c000000-0x4dffffff] [ 43.312970] pci 0008:00:00.0: PME# supported from D0 D3hot [ 43.324524] pci 0008:00:00.0: BAR 0: assigned [mem 0x4c000000-0x4dffffff] [ 43.328290] hv_pci 00000069-0009-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 43.378104] hv_pci 00000069-0009-0000-3130-444500000000: PCI host bridge to bus 0009:00 [ 43.381550] pci_bus 0009:00: root bus resource [mem 0x4e000000-0x4fffffff window] [ 43.385366] pci 0009:00:00.0: 
[10de:1af1] type 00 class 0x068000 [ 43.394916] pci 0009:00:00.0: reg 0x10: [mem 0x4e000000-0x4fffffff] [ 43.436901] pci 0009:00:00.0: PME# supported from D0 D3hot [ 43.448429] pci 0009:00:00.0: BAR 0: assigned [mem 0x4e000000-0x4fffffff] [ 43.452233] hv_pci 0000006a-000a-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 43.503436] hv_pci 0000006a-000a-0000-3130-444500000000: PCI host bridge to bus 000a:00 [ 43.507176] pci_bus 000a:00: root bus resource [mem 0x50000000-0x51ffffff window] [ 43.511262] pci 000a:00:00.0: [10de:1af1] type 00 class 0x068000 [ 43.521038] pci 000a:00:00.0: reg 0x10: [mem 0x50000000-0x51ffffff] [ 43.563848] pci 000a:00:00.0: PME# supported from D0 D3hot [ 43.576721] pci 000a:00:00.0: BAR 0: assigned [mem 0x50000000-0x51ffffff] [ 43.580604] hv_pci 0000008e-000b-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 44.462360] hv_pci 0000008e-000b-0000-3130-444500000000: PCI host bridge to bus 000b:00 [ 44.465901] pci_bus 000b:00: root bus resource [mem 0x45000000-0x45ffffff window] [ 44.469223] pci_bus 000b:00: root bus resource [mem 0x9000000000-0xa001ffffff window] [ 44.473425] pci 000b:00:00.0: [10de:20b0] type 00 class 0x030200 [ 45.419967] pci 000b:00:00.0: reg 0x10: [mem 0x45000000-0x45ffffff] [ 46.367545] pci 000b:00:00.0: reg 0x14: [mem 0x9000000000-0x9fffffffff 64bit pref] [ 47.312322] pci 000b:00:00.0: reg 0x1c: [mem 0xa000000000-0xa001ffffff 64bit pref] [ 49.196732] pci 000b:00:00.0: Enabling HDA controller [ 49.205599] pci 000b:00:00.0: PME# supported from D0 D3hot [ 49.217375] pci 000b:00:00.0: BAR 1: assigned [mem 0x9000000000-0x9fffffffff 64bit pref] [ 50.149394] pci 000b:00:00.0: BAR 3: assigned [mem 0xa000000000-0xa001ffffff 64bit pref] [ 51.096847] pci 000b:00:00.0: BAR 0: assigned [mem 0x45000000-0x45ffffff] [ 51.101015] hv_pci 00000094-000c-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 51.990512] hv_pci 00000094-000c-0000-3130-444500000000: PCI host bridge to bus 000c:00 
[ 51.994094] pci_bus 000c:00: root bus resource [mem 0x52000000-0x52ffffff window] [ 51.997178] pci_bus 000c:00: root bus resource [mem 0xb000000000-0xc001ffffff window] [ 52.001229] pci 000c:00:00.0: [10de:20b0] type 00 class 0x030200 [ 52.953983] pci 000c:00:00.0: reg 0x10: [mem 0x52000000-0x52ffffff] [ 53.906087] pci 000c:00:00.0: reg 0x14: [mem 0xb000000000-0xbfffffffff 64bit pref] [ 54.855949] pci 000c:00:00.0: reg 0x1c: [mem 0xc000000000-0xc001ffffff 64bit pref] [ 56.758497] pci 000c:00:00.0: Enabling HDA controller [ 56.767165] pci 000c:00:00.0: PME# supported from D0 D3hot [ 56.779136] pci 000c:00:00.0: BAR 1: assigned [mem 0xb000000000-0xbfffffffff 64bit pref] [ 57.732483] pci 000c:00:00.0: BAR 3: assigned [mem 0xc000000000-0xc001ffffff 64bit pref] [ 58.687983] pci 000c:00:00.0: BAR 0: assigned [mem 0x52000000-0x52ffffff] [ 58.692599] hv_pci 000000ce-000d-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 59.587232] hv_pci 000000ce-000d-0000-3130-444500000000: PCI host bridge to bus 000d:00 [ 59.591183] pci_bus 000d:00: root bus resource [mem 0x53000000-0x53ffffff window] [ 59.594576] pci_bus 000d:00: root bus resource [mem 0xd000000000-0xe001ffffff window] [ 59.598751] pci 000d:00:00.0: [10de:20b0] type 00 class 0x030200 [ 60.559439] pci 000d:00:00.0: reg 0x10: [mem 0x53000000-0x53ffffff] [ 61.514193] pci 000d:00:00.0: reg 0x14: [mem 0xd000000000-0xdfffffffff 64bit pref] [ 62.469625] pci 000d:00:00.0: reg 0x1c: [mem 0xe000000000-0xe001ffffff 64bit pref] [ 64.378165] pci 000d:00:00.0: Enabling HDA controller [ 64.386831] pci 000d:00:00.0: PME# supported from D0 D3hot [ 64.398640] pci 000d:00:00.0: BAR 1: assigned [mem 0xd000000000-0xdfffffffff 64bit pref] [ 65.354341] pci 000d:00:00.0: BAR 3: assigned [mem 0xe000000000-0xe001ffffff 64bit pref] [ 66.306967] pci 000d:00:00.0: BAR 0: assigned [mem 0x53000000-0x53ffffff] [ 66.311228] hv_pci 000000d4-000e-0000-3130-444500000000: PCI VMBus probing: Using version 0x10003 [ 67.274318] hv_pci 
000000d4-000e-0000-3130-444500000000: PCI host bridge to bus 000e:00 [ 67.277960] pci_bus 000e:00: root bus resource [mem 0x54000000-0x54ffffff window] [ 67.281240] pci_bus 000e:00: root bus resource [mem 0xf000000000-0x10001ffffff window] [ 67.285397] pci 000e:00:00.0: [10de:20b0] type 00 class 0x030200 [ 68.252369] pci 000e:00:00.0: reg 0x10: [mem 0x54000000-0x54ffffff] [ 69.213292] pci 000e:00:00.0: reg 0x14: [mem 0xf000000000-0xffffffffff 64bit pref] [ 70.173122] pci 000e:00:00.0: reg 0x1c: [mem 0x10000000000-0x10001ffffff 64bit pref] [ 72.081049] pci 000e:00:00.0: Enabling HDA controller [ 72.089499] pci 000e:00:00.0: PME# supported from D0 D3hot [ 72.101427] pci 000e:00:00.0: BAR 1: assigned [mem 0xf000000000-0xffffffffff 64bit pref] [ 73.063373] pci 000e:00:00.0: BAR 3: assigned [mem 0x10000000000-0x10001ffffff 64bit pref] [ 74.025597] pci 000e:00:00.0: BAR 0: assigned [mem 0x54000000-0x54ffffff] [ 74.029989] hv_pci 981908b7-f4ab-466c-b739-a4b84b3c7b34: PCI VMBus probing: Using version 0x10003 [ 74.046788] hv_pci 981908b7-f4ab-466c-b739-a4b84b3c7b34: PCI host bridge to bus f4ab:00 [ 74.050318] pci_bus f4ab:00: root bus resource [mem 0xff0000000-0xff000bfff window] [ 74.054286] pci f4ab:00:00.0: [1414:b111] type 00 class 0x010802 [ 74.070982] pci f4ab:00:00.0: reg 0x10: [mem 0xff0008000-0xff000bfff 64bit] [ 74.112150] pci f4ab:00:00.0: reg 0x20: [mem 0xff0000000-0xff0007fff 64bit] [ 74.140974] pci f4ab:00:00.0: BAR 4: assigned [mem 0xff0000000-0xff0007fff 64bit] [ 74.157208] pci f4ab:00:00.0: BAR 0: assigned [mem 0xff0008000-0xff000bfff 64bit] [ 74.173391] hv_pci 0495eec3-daa3-464f-ad1d-668fa17619af: PCI VMBus probing: Using version 0x10003 [ 74.190584] hv_pci 0495eec3-daa3-464f-ad1d-668fa17619af: PCI host bridge to bus daa3:00 [ 74.194264] pci_bus daa3:00: root bus resource [mem 0xff0010000-0xff001bfff window] [ 74.198229] pci daa3:00:00.0: [1414:b111] type 00 class 0x010802 [ 74.214879] pci daa3:00:00.0: reg 0x10: [mem 0xff0018000-0xff001bfff 64bit] [ 
74.256065] pci daa3:00:00.0: reg 0x20: [mem 0xff0010000-0xff0017fff 64bit] [ 74.284850] pci daa3:00:00.0: BAR 4: assigned [mem 0xff0010000-0xff0017fff 64bit] [ 74.300681] pci daa3:00:00.0: BAR 0: assigned [mem 0xff0018000-0xff001bfff 64bit] [ 74.316848] hv_pci 7622ea16-858c-414e-983a-a8a11dca103b: PCI VMBus probing: Using version 0x10003 [ 74.334401] hv_pci 7622ea16-858c-414e-983a-a8a11dca103b: PCI host bridge to bus 858c:00 [ 74.338055] pci_bus 858c:00: root bus resource [mem 0xff0020000-0xff002bfff window] [ 74.342210] pci 858c:00:00.0: [1414:b111] type 00 class 0x010802 [ 74.358579] pci 858c:00:00.0: reg 0x10: [mem 0xff0028000-0xff002bfff 64bit] [ 74.399335] pci 858c:00:00.0: reg 0x20: [mem 0xff0020000-0xff0027fff 64bit] [ 74.428068] pci 858c:00:00.0: BAR 4: assigned [mem 0xff0020000-0xff0027fff 64bit] [ 74.443925] pci 858c:00:00.0: BAR 0: assigned [mem 0xff0028000-0xff002bfff 64bit] [ 74.460045] hv_pci cbb784cf-39f2-49c1-be36-27dd433a35d0: PCI VMBus probing: Using version 0x10003 [ 74.477742] hv_pci cbb784cf-39f2-49c1-be36-27dd433a35d0: PCI host bridge to bus 39f2:00 [ 74.481300] pci_bus 39f2:00: root bus resource [mem 0xff0030000-0xff003bfff window] [ 74.485423] pci 39f2:00:00.0: [1414:b111] type 00 class 0x010802 [ 74.501880] pci 39f2:00:00.0: reg 0x10: [mem 0xff0038000-0xff003bfff 64bit] [ 74.542750] pci 39f2:00:00.0: reg 0x20: [mem 0xff0030000-0xff0037fff 64bit] [ 74.571414] pci 39f2:00:00.0: BAR 4: assigned [mem 0xff0030000-0xff0037fff 64bit] [ 74.587117] pci 39f2:00:00.0: BAR 0: assigned [mem 0xff0038000-0xff003bfff 64bit] [ 74.603661] hv_pci f4e02db3-c4c2-4a61-81b1-df1dc9d6d290: PCI VMBus probing: Using version 0x10003 [ 74.621338] hv_pci f4e02db3-c4c2-4a61-81b1-df1dc9d6d290: PCI host bridge to bus c4c2:00 [ 74.625042] pci_bus c4c2:00: root bus resource [mem 0xff0040000-0xff004bfff window] [ 74.629312] pci c4c2:00:00.0: [1414:b111] type 00 class 0x010802 [ 74.645979] pci c4c2:00:00.0: reg 0x10: [mem 0xff0048000-0xff004bfff 64bit] [ 74.687230] pci 
c4c2:00:00.0: reg 0x20: [mem 0xff0040000-0xff0047fff 64bit] [ 74.715822] pci c4c2:00:00.0: BAR 4: assigned [mem 0xff0040000-0xff0047fff 64bit] [ 74.731506] pci c4c2:00:00.0: BAR 0: assigned [mem 0xff0048000-0xff004bfff 64bit] [ 74.747562] hv_pci e48ab07f-b170-46d7-8d18-a89b73055c2a: PCI VMBus probing: Using version 0x10003 [ 74.764978] hv_pci e48ab07f-b170-46d7-8d18-a89b73055c2a: PCI host bridge to bus b170:00 [ 74.768356] pci_bus b170:00: root bus resource [mem 0xff0050000-0xff005bfff window] [ 74.772369] pci b170:00:00.0: [1414:b111] type 00 class 0x010802 [ 74.788717] pci b170:00:00.0: reg 0x10: [mem 0xff0058000-0xff005bfff 64bit] [ 74.831186] pci b170:00:00.0: reg 0x20: [mem 0xff0050000-0xff0057fff 64bit] [ 74.860029] pci b170:00:00.0: BAR 4: assigned [mem 0xff0050000-0xff0057fff 64bit] [ 74.876141] pci b170:00:00.0: BAR 0: assigned [mem 0xff0058000-0xff005bfff 64bit] [ 74.892362] hv_pci 3aad94de-c3f8-4d91-8623-c1390ed39e24: PCI VMBus probing: Using version 0x10003 [ 74.909485] hv_pci 3aad94de-c3f8-4d91-8623-c1390ed39e24: PCI host bridge to bus c3f8:00 [ 74.913202] pci_bus c3f8:00: root bus resource [mem 0xff0060000-0xff006bfff window] [ 74.917271] pci c3f8:00:00.0: [1414:b111] type 00 class 0x010802 [ 74.934235] pci c3f8:00:00.0: reg 0x10: [mem 0xff0068000-0xff006bfff 64bit] [ 74.975782] pci c3f8:00:00.0: reg 0x20: [mem 0xff0060000-0xff0067fff 64bit] [ 75.005055] pci c3f8:00:00.0: BAR 4: assigned [mem 0xff0060000-0xff0067fff 64bit] [ 75.020973] pci c3f8:00:00.0: BAR 0: assigned [mem 0xff0068000-0xff006bfff 64bit] [ 75.037193] hv_pci 649d3fa9-0d7f-4705-838f-f166ef3f8708: PCI VMBus probing: Using version 0x10003 [ 75.054231] hv_pci 649d3fa9-0d7f-4705-838f-f166ef3f8708: PCI host bridge to bus 0d7f:00 [ 75.057826] pci_bus 0d7f:00: root bus resource [mem 0xff0070000-0xff007bfff window] [ 75.062072] pci 0d7f:00:00.0: [1414:b111] type 00 class 0x010802 [ 75.078797] pci 0d7f:00:00.0: reg 0x10: [mem 0xff0078000-0xff007bfff 64bit] [ 75.119618] pci 0d7f:00:00.0: reg 
0x20: [mem 0xff0070000-0xff0077fff 64bit] [ 75.148413] pci 0d7f:00:00.0: BAR 4: assigned [mem 0xff0070000-0xff0077fff 64bit] [ 75.164259] pci 0d7f:00:00.0: BAR 0: assigned [mem 0xff0078000-0xff007bfff 64bit] [ 75.180159] efifb: probing for efifb [ 75.181944] efifb: framebuffer at 0x40000000, using 3072k, total 3072k [ 75.184737] efifb: mode is 1024x768x32, linelength=4096, pages=1 [ 75.187444] efifb: scrolling: redraw [ 75.189080] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0 [ 75.194247] Console: switching to colour frame buffer device 128x48 [ 75.198287] fb0: EFI VGA frame buffer device [ 75.202659] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled [ 75.230084] 00:00: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A [ 75.257879] 00:01: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A [ 75.262580] Linux agpgart interface v0.103 [ 75.412845] loop: module loaded [ 75.414371] hv_vmbus: registering driver hv_storvsc [ 75.417064] nvme nvme0: pci function f4ab:00:00.0 [ 75.418450] scsi host0: storvsc_host_t [ 75.419587] nvme nvme1: pci function daa3:00:00.0 [ 75.419754] nvme f4ab:00:00.0: can't derive routing for PCI INT A [ 75.419755] nvme f4ab:00:00.0: PCI INT A: no GSI [ 75.420784] scsi host1: storvsc_host_t [ 75.431133] nvme nvme2: pci function 858c:00:00.0 [ 75.431242] nvme daa3:00:00.0: can't derive routing for PCI INT A [ 75.433383] nvme nvme3: pci function 39f2:00:00.0 [ 75.433574] nvme 858c:00:00.0: can't derive routing for PCI INT A [ 75.433576] nvme 858c:00:00.0: PCI INT A: no GSI [ 75.436235] nvme daa3:00:00.0: PCI INT A: no GSI [ 75.447310] nvme nvme4: pci function c4c2:00:00.0 [ 75.447516] nvme 39f2:00:00.0: can't derive routing for PCI INT A [ 75.449741] nvme nvme5: pci function b170:00:00.0 [ 75.449972] nvme c4c2:00:00.0: can't derive routing for PCI INT A [ 75.449973] nvme c4c2:00:00.0: PCI INT A: no GSI [ 75.452705] nvme 39f2:00:00.0: PCI INT A: no GSI [ 75.466056] nvme nvme6: pci function c3f8:00:00.0 [ 75.466207] nvme 
b170:00:00.0: can't derive routing for PCI INT A [ 75.469044] nvme nvme7: pci function 0d7f:00:00.0 [ 75.469285] nvme c3f8:00:00.0: can't derive routing for PCI INT A [ 75.469287] nvme c3f8:00:00.0: PCI INT A: no GSI [ 75.472419] nvme b170:00:00.0: PCI INT A: no GSI [ 75.482955] nvme nvme0: Shutdown timeout set to 8 seconds [ 75.486538] libphy: Fixed MDIO Bus: probed [ 75.486745] nvme 0d7f:00:00.0: can't derive routing for PCI INT A [ 75.486746] nvme 0d7f:00:00.0: PCI INT A: no GSI [ 75.498669] nvme nvme1: Shutdown timeout set to 8 seconds [ 75.498686] nvme nvme2: Shutdown timeout set to 8 seconds [ 75.499659] tun: Universal TUN/TAP device driver, 1.6 [ 75.509281] PPP generic driver version 2.4.2 [ 75.511898] VFIO - User Level meta-driver version: 0.3 [ 75.514183] nvme nvme3: Shutdown timeout set to 8 seconds [ 75.514240] nvme nvme4: Shutdown timeout set to 8 seconds [ 75.515314] i8042: PNP: No PS/2 controller found. [ 75.524293] mousedev: PS/2 mouse device common for all mice [ 75.527951] rtc_cmos 00:02: RTC can wake from S4 [ 75.529841] nvme nvme6: Shutdown timeout set to 8 seconds [ 75.529938] nvme nvme5: Shutdown timeout set to 8 seconds [ 75.545449] nvme nvme7: Shutdown timeout set to 8 seconds [ 75.554022] rtc_cmos 00:02: registered as rtc0 [ 75.556632] rtc_cmos 00:02: alarms up to one month, 114 bytes nvram [ 75.560473] device-mapper: uevent: version 1.0.3 [ 75.563440] device-mapper: ioctl: 4.41.0-ioctl (2019-09-16) initialised: dm-devel@redhat.com [ 75.568476] platform eisa.0: Probing EISA bus 0 [ 75.571474] platform eisa.0: EISA: Detected 0 cards [ 75.574345] EFI Variables Facility v0.08 2004-May-17 [ 75.583788] scsi 0:0:0:0: Direct-Access Msft Virtual Disk 1.0 PQ: 0 ANSI: 5 [ 75.585549] drop_monitor: Initializing network drop monitor service [ 75.589416] scsi 0:0:0:1: Direct-Access Msft Virtual Disk 1.0 PQ: 0 ANSI: 5 [ 75.592372] NET: Registered protocol family 10 [ 75.597869] scsi 0:0:0:2: CD-ROM Msft Virtual DVD-ROM 1.0 PQ: 0 ANSI: 0 [ 75.603553] 
Segment Routing with IPv6 [ 75.605864] NET: Registered protocol family 17 [ 75.608391] Key type dns_resolver registered [ 75.616090] RAS: Correctable Errors collector initialized. [ 75.619121] IPI shorthand broadcast: enabled [ 75.621523] sched_clock: Marking stable (75592246121, 29266100)->(75684668700, -63156479) [ 75.622879] sd 0:0:0:0: Attached scsi generic sg0 type 0 [ 75.627575] sd 0:0:0:0: [sda] 268435456 512-byte logical blocks: (137 GB/128 GiB) [ 75.630182] scsi 0:0:0:1: Attached scsi generic sg1 type 0 [ 75.633445] sd 0:0:0:1: [sdb] 6081740800 512-byte logical blocks: (3.11 TB/2.83 TiB) [ 75.633447] sd 0:0:0:1: [sdb] 4096-byte physical blocks [ 75.633485] sd 0:0:0:1: [sdb] Write Protect is off [ 75.633486] sd 0:0:0:1: [sdb] Mode Sense: 0f 00 10 00 [ 75.633559] sd 0:0:0:1: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA [ 75.633949] sd 0:0:0:0: [sda] 4096-byte physical blocks [ 75.636878] registered taskstats version 1 [ 75.636985] sr 0:0:0:2: [sr0] scsi-1 drive [ 75.636987] cdrom: Uniform CD-ROM driver Revision: 3.20 [ 75.637145] sdb: sdb1 [ 75.643724] sd 0:0:0:0: [sda] Write Protect is off [ 75.646419] Loading compiled-in X.509 certificates [ 75.647055] Loaded X.509 cert 'Build time autogenerated kernel key: d4cf42da249c8ddd4ece66c768885338dca13669' [ 75.650742] sd 0:0:0:0: [sda] Mode Sense: 0f 00 10 00 [ 75.653970] Loaded X.509 cert 'Canonical Ltd. Live Patch Signing: 14df34d1a87cf37625abec039ef2bf521249b969' [ 75.661103] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA [ 75.661304] sd 0:0:0:1: [sdb] Attached SCSI disk [ 75.663161] Loaded X.509 cert 'Canonical Ltd. Kernel Module Signing: 88f752e560a1e0737e31163a466ad7b70a850c19' [ 75.690900] blacklist: Loading compiled-in revocation X.509 certificates [ 75.694300] Loaded X.509 cert 'Canonical Ltd. 
Secure Boot Signing: 61482aa2830d0ab2ad5af10b7250da9033ddcef0' [ 75.699447] zswap: loaded using pool lzo/zbud [ 75.702264] GPT:Primary header thinks Alt. header is not at the end of the disk. [ 75.706166] GPT:62916607 != 268435455 [ 75.708513] GPT:Alternate GPT header not at the end of the disk. [ 75.708940] Key type ._fscrypt registered [ 75.711775] GPT:62916607 != 268435455 [ 75.712879] sr 0:0:0:2: Attached scsi CD-ROM sr0 [ 75.712944] sr 0:0:0:2: Attached scsi generic sg2 type 5 [ 75.714175] Key type .fscrypt registered [ 75.716393] GPT: Use GNU Parted to correct GPT errors. [ 75.716399] sda: sda1 sda14 sda15 [ 75.724033] Key type big_key registered [ 75.732128] Key type encrypted registered [ 75.734826] AppArmor: AppArmor sha1 policy hashing enabled [ 75.738252] integrity: Loading X.509 certificate: UEFI:db [ 75.741161] integrity: Loaded X.509 cert 'Microsoft Windows Production PCA 2011: a92902398e16c49778cd90f99e4f9ae17c55af53' [ 75.746957] integrity: Loading X.509 certificate: UEFI:MokListRT (MOKvar table) [ 75.751188] integrity: Loaded X.509 cert 'Canonical Ltd. Master Certificate Authority: ad91990bc22ab1f517048c23b6655a268e345a63' [ 75.757418] ima: No TPM chip found, activating TPM-bypass! 
[ 75.760412] ima: Allocated hash algorithm: sha1 [ 75.763133] ima: No architecture policies found [ 75.767228] evm: Initialising EVM extended attributes: [ 75.770068] evm: security.selinux [ 75.772114] evm: security.SMACK64 [ 75.774224] evm: security.SMACK64EXEC [ 75.776412] evm: security.SMACK64TRANSMUTE [ 75.778886] evm: security.SMACK64MMAP [ 75.781294] evm: security.apparmor [ 75.783380] evm: security.ima [ 75.785276] evm: security.capability [ 75.787376] evm: HMAC attrs: 0x1 [ 75.787386] sd 0:0:0:0: [sda] Attached SCSI disk [ 75.789733] PM: Magic number: 2:399:893 [ 75.794464] pci_bus 0004:00: hash matches [ 75.797026] processor cpu45: hash matches [ 75.799534] memory memory1119: hash matches [ 75.802419] rtc_cmos 00:02: setting system clock to 2022-04-05T16:54:11 UTC (1649177651) [ 76.530760] nvme nvme0: 32/0/0 default/read/poll queues [ 76.546336] nvme nvme2: 32/0/0 default/read/poll queues [ 76.546529] nvme nvme1: 32/0/0 default/read/poll queues [ 76.561861] nvme nvme4: 32/0/0 default/read/poll queues [ 76.561868] nvme nvme3: 32/0/0 default/read/poll queues [ 76.577426] nvme nvme5: 32/0/0 default/read/poll queues [ 76.577681] nvme nvme6: 32/0/0 default/read/poll queues [ 76.593174] nvme nvme7: 32/0/0 default/read/poll queues [ 76.674283] Freeing unused decrypted memory: 2040K [ 76.678780] Freeing unused kernel image memory: 2496K [ 76.681395] Write protecting the kernel read-only data: 26624k [ 76.685609] Freeing unused kernel image memory: 2000K [ 76.688863] Freeing unused kernel image memory: 1408K [ 76.702057] x86/mm: Checked W+X mappings: passed, no W+X pages found. 
[ 76.705251] Run /init as init process [ 76.809976] hv_utils: Registering HyperV Utility Driver [ 76.816253] hv_vmbus: registering driver hv_utils [ 76.820038] hv_utils: Shutdown IC version 3.2 [ 76.823035] hv_utils: Heartbeat IC version 3.0 [ 76.825519] hv_utils: TimeSync IC version 4.0 [ 76.829602] hv_vmbus: registering driver hyperv_fb [ 76.829633] hidraw: raw HID events driver (C) Jiri Kosina [ 76.832767] checking generic (40000000 300000) vs hw (40000000 300000) [ 76.835794] fb0: switching to hyperv_fb from EFI VGA [ 76.838620] Console: switching to colour dummy device 80x25 [ 76.841326] hyperv_fb: Screen resolution: 1152x864, Color depth: 32 [ 76.845179] Console: switching to colour frame buffer device 144x54 [ 76.845335] hv_vmbus: registering driver hv_netvsc [ 76.852425] hv_vmbus: registering driver hyperv_keyboard [ 76.855257] input: AT Translated Set 2 keyboard as /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0004:00/VMBUS:00/d34b2567-b9b6-42b9-8778-0a4ec0b955bf/serio0/input/input0 [ 76.863033] hv_vmbus: registering driver hid_hyperv [ 76.865800] input: Microsoft Vmbus HID-compliant Mouse as /devices/0006:045E:0621.0001/input/input1 [ 76.869933] hid 0006:045E:0621.0001: input: VIRTUAL HID v0.01 Mouse [Microsoft Vmbus HID-compliant Mouse] on [ 76.874780] cryptd: max_cpu_qlen set to 1000 [ 76.884248] AVX2 version of gcm_enc/dec engaged. 
[ 76.885704] mlx5_core 0101:00:00.0: firmware version: 20.28.4000 [ 76.886825] AES CTR mode by8 optimization enabled [ 77.194847] mlx5_core 0102:00:00.0: firmware version: 20.28.4000 [ 77.490984] mlx5_core 0103:00:00.0: firmware version: 20.28.4000 [ 77.788648] mlx5_core 0104:00:00.0: firmware version: 20.28.4000 [ 78.086568] mlx5_core 0105:00:00.0: firmware version: 20.28.4000 [ 78.210268] hv_netvsc 0022487a-2f9c-0022-487a-2f9c0022487a eth0: VF slot 1 added [ 78.214189] hv_pci 4bcd3f84-7657-4dfc-a455-e8d9c0291426: PCI VMBus probing: Using version 0x10003 [ 78.223776] hv_pci 4bcd3f84-7657-4dfc-a455-e8d9c0291426: PCI host bridge to bus 7657:00 [ 78.227547] pci_bus 7657:00: root bus resource [mem 0xff0100000-0xff01fffff window] [ 78.232736] pci 7657:00:02.0: [15b3:1016] type 00 class 0x020000 [ 78.242811] pci 7657:00:02.0: reg 0x10: [mem 0xff0100000-0xff01fffff 64bit pref] [ 78.272826] pci 7657:00:02.0: enabling Extended Tags [ 78.281713] pci 7657:00:02.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown speed x0 link at 7657:00:02.0 (capable of 63.008 Gb/s with 8 GT/s x8 link) [ 78.293599] pci 7657:00:02.0: BAR 0: assigned [mem 0xff0100000-0xff01fffff 64bit pref] [ 78.303844] mlx5_core 7657:00:02.0: firmware version: 14.30.1210 [ 78.313237] mlx5_core 7657:00:02.0: handle_hca_cap:551:(pid 686): log_max_qp value in current profile is 18, changing it to HCA capability limit (12) [ 78.384490] mlx5_core 0106:00:00.0: firmware version: 20.28.4000 [ 78.701673] mlx5_core 0107:00:00.0: firmware version: 20.28.4000 [ 78.998568] mlx5_core 0108:00:00.0: firmware version: 20.28.4000 [ 79.297202] mlx5_core 7657:00:02.0: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0) [ 79.476240] hv_netvsc 0022487a-2f9c-0022-487a-2f9c0022487a eth0: VF registering: eth1 [ 79.480481] mlx5_core 7657:00:02.0 eth1: joined to eth0 [ 79.483461] mlx5_core 7657:00:02.0 eth1: Disabling LRO, not supported in legacy RQ [ 79.495484] mlx5_core 7657:00:02.0 eth1: Disabling LRO, not supported in 
legacy RQ [ 79.500087] mlx5_core 7657:00:02.0 enP30295s1: renamed from eth1 [ 79.504322] mlx5_ib: Mellanox Connect-IB Infiniband driver v5.0-0 [ 80.708821] raid6: avx2x4 gen() 29177 MB/s [ 80.756819] raid6: avx2x4 xor() 14440 MB/s [ 80.804819] raid6: avx2x2 gen() 30067 MB/s [ 80.852818] raid6: avx2x2 xor() 18394 MB/s [ 80.900823] raid6: avx2x1 gen() 17277 MB/s [ 80.948821] raid6: avx2x1 xor() 15411 MB/s [ 80.996820] raid6: sse2x4 gen() 15107 MB/s [ 81.044820] raid6: sse2x4 xor() 8727 MB/s [ 81.092821] raid6: sse2x2 gen() 14898 MB/s [ 81.140819] raid6: sse2x2 xor() 9430 MB/s [ 81.188819] raid6: sse2x1 gen() 7167 MB/s [ 81.236818] raid6: sse2x1 xor() 7736 MB/s [ 81.239306] raid6: using algorithm avx2x2 gen() 30067 MB/s [ 81.242263] raid6: .... xor() 18394 MB/s, rmw enabled [ 81.245025] raid6: using avx2x2 recovery algorithm [ 81.249288] xor: automatically using best checksumming function avx [ 81.254615] async_tx: api initialized (async) [ 81.304657] Btrfs loaded, crc32c=crc32c-intel [ 81.428125] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null) [ 83.721777] systemd[1]: systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid) [ 83.732357] systemd[1]: Detected virtualization microsoft. [ 83.735669] systemd[1]: Detected architecture x86-64. [ 83.796607] systemd[1]: Set hostname to . [ 83.845887] systemd[1]: Initializing machine ID from random generator. [ 83.849317] systemd[1]: Installed transient /etc/machine-id file. [ 85.187369] systemd[1]: Unnecessary job for sys-devices-virtual-misc-vmbus\x21hv_fcopy.device was removed. [ 85.192403] systemd[1]: Unnecessary job for sys-devices-virtual-misc-vmbus\x21hv_vss.device was removed. [ 85.197544] systemd[1]: Reached target Swap. [ 85.201989] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point. 
[ 85.269489] EXT4-fs (sda1): re-mounted. Opts: discard [ 85.307801] Loading iSCSI transport class v2.0-870. [ 85.354016] iscsi: registered transport (tcp) [ 85.387981] RPC: Registered named UNIX socket transport module. [ 85.391347] RPC: Registered udp transport module. [ 85.394141] RPC: Registered tcp transport module. [ 85.396785] RPC: Registered tcp NFSv4.1 backchannel transport module. [ 85.444778] systemd-journald[1391]: Received request to flush runtime journal from PID 1 [ 85.449955] iscsi: registered transport (iser) [ 86.064179] hv_vmbus: registering driver hv_balloon [ 86.064613] hv_balloon: Using Dynamic Memory protocol version 2.0 [ 86.114685] mlx5_core 7657:00:02.0 enP30295s1: Disabling LRO, not supported in legacy RQ [ 86.592473] hv_utils: KVP IC version 4.0 [ 87.298286] audit: type=1400 audit(1649177662.927:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-start" pid=1876 comm="apparmor_parser" [ 87.310375] audit: type=1400 audit(1649177662.939:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default" pid=1874 comm="apparmor_parser" [ 87.310379] audit: type=1400 audit(1649177662.939:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-cgns" pid=1874 comm="apparmor_parser" [ 87.310381] audit: type=1400 audit(1649177662.939:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-with-mounting" pid=1874 comm="apparmor_parser" [ 87.310383] audit: type=1400 audit(1649177662.939:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-with-nesting" pid=1874 comm="apparmor_parser" [ 87.310563] audit: type=1400 audit(1649177662.939:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=1877 comm="apparmor_parser" [ 87.310566] audit: type=1400 audit(1649177662.939:8): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="man_filter" pid=1877 comm="apparmor_parser" [ 87.310568] audit: type=1400 audit(1649177662.939:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=1877 comm="apparmor_parser" [ 87.340696] audit: type=1400 audit(1649177662.967:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/chronyd" pid=1878 comm="apparmor_parser" [ 87.362943] audit: type=1400 audit(1649177662.991:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/sbin/dhclient" pid=1875 comm="apparmor_parser" [ 88.421758] nvidia: loading out-of-tree module taints kernel. [ 88.421769] nvidia: module license 'NVIDIA' taints kernel. [ 88.421770] Disabling lock debugging due to kernel taint [ 88.464882] nvidia: module verification failed: signature and/or required key missing - tainting kernel [ 88.475267] nvidia-nvlink: Nvlink Core is being initialized, major device number 238 [ 88.475336] nvidia-nvswitch: Probing device 0005:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 88.475660] nvidia-nvswitch 0005:00:00.0: can't derive routing for PCI INT A [ 88.475661] nvidia-nvswitch 0005:00:00.0: PCI INT A: no GSI [ 89.758915] nvidia-nvswitch0: using MSI [ 90.069151] nvidia-nvswitch: Probing device 0006:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 90.069351] nvidia-nvswitch 0006:00:00.0: can't derive routing for PCI INT A [ 90.069352] nvidia-nvswitch 0006:00:00.0: PCI INT A: no GSI [ 91.279989] nvidia-nvswitch1: using MSI [ 91.584500] nvidia-nvswitch: Probing device 0007:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 91.584706] nvidia-nvswitch 0007:00:00.0: can't derive routing for PCI INT A [ 91.584707] nvidia-nvswitch 0007:00:00.0: PCI INT A: no GSI [ 92.889301] nvidia-nvswitch2: using MSI [ 93.204792] nvidia-nvswitch: Probing device 0008:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 93.205006] nvidia-nvswitch 
0008:00:00.0: can't derive routing for PCI INT A [ 93.205007] nvidia-nvswitch 0008:00:00.0: PCI INT A: no GSI [ 94.541322] nvidia-nvswitch3: using MSI [ 94.852930] nvidia-nvswitch: Probing device 0009:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 94.853136] nvidia-nvswitch 0009:00:00.0: can't derive routing for PCI INT A [ 94.853137] nvidia-nvswitch 0009:00:00.0: PCI INT A: no GSI [ 96.027073] UDF-fs: INFO Mounting volume 'UDF Volume', timestamp 2022/04/06 00:00 (1000) [ 96.213666] nvidia-nvswitch4: using MSI [ 96.527839] nvidia-nvswitch: Probing device 000a:00:00.0, Vendor Id = 0x10de, Device Id = 0x1af1, Class = 0x68000 [ 96.528032] nvidia-nvswitch 000a:00:00.0: can't derive routing for PCI INT A [ 96.528033] nvidia-nvswitch 000a:00:00.0: PCI INT A: no GSI [ 97.845519] nvidia-nvswitch5: using MSI [ 98.162696] nvidia 0001:00:00.0: can't derive routing for PCI INT A [ 98.162698] nvidia 0001:00:00.0: PCI INT A: no GSI [ 98.214230] nvidia 0002:00:00.0: can't derive routing for PCI INT A [ 98.214232] nvidia 0002:00:00.0: PCI INT A: no GSI [ 98.260614] nvidia 0003:00:00.0: can't derive routing for PCI INT A [ 98.260616] nvidia 0003:00:00.0: PCI INT A: no GSI [ 98.312934] nvidia 0004:00:00.0: can't derive routing for PCI INT A [ 98.312935] nvidia 0004:00:00.0: PCI INT A: no GSI [ 98.361160] nvidia 000b:00:00.0: can't derive routing for PCI INT A [ 98.361162] nvidia 000b:00:00.0: PCI INT A: no GSI [ 98.407849] nvidia 000c:00:00.0: can't derive routing for PCI INT A [ 98.407850] nvidia 000c:00:00.0: PCI INT A: no GSI [ 98.453516] nvidia 000d:00:00.0: can't derive routing for PCI INT A [ 98.453519] nvidia 000d:00:00.0: PCI INT A: no GSI [ 98.502160] nvidia 000e:00:00.0: can't derive routing for PCI INT A [ 98.502161] nvidia 000e:00:00.0: PCI INT A: no GSI [ 98.548922] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.57.02 Tue Jul 13 16:14:05 UTC 2021 [ 100.720369] mlx5_core 7657:00:02.0 enP30295s1: Link up [ 100.723317] hv_netvsc 
0022487a-2f9c-0022-487a-2f9c0022487a eth0: Data path switched to VF: enP30295s1 [ 100.725358] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 100.998620] IPv4: martian source 255.255.255.255 from 168.63.129.16, on dev eth0 [ 100.998645] ll header: 00000000: 00 22 48 7a 2f 9c 12 34 56 78 9a bc 08 00 [ 101.014145] IPv4: martian source 255.255.255.255 from 168.63.129.16, on dev eth0 [ 101.014158] ll header: 00000000: 00 22 48 7a 2f 9c 12 34 56 78 9a bc 08 00 [ 101.200683] hv_netvsc 0022487a-2f9c-0022-487a-2f9c0022487a eth0: Data path switched from VF: enP30295s1 [ 101.563292] mlx5_core 7657:00:02.0 enP30295s1: Disabling LRO, not supported in legacy RQ [ 102.208621] mlx5_core 7657:00:02.0 enP30295s1: Link up [ 102.209932] hv_netvsc 0022487a-2f9c-0022-487a-2f9c0022487a eth0: Data path switched to VF: enP30295s1 [ 104.107899] IPv4: martian source 255.255.255.255 from 168.63.129.16, on dev eth0 [ 104.107922] ll header: 00000000: 00 22 48 7a 2f 9c 12 34 56 78 9a bc 08 00 [ 104.123433] IPv4: martian source 255.255.255.255 from 168.63.129.16, on dev eth0 [ 104.123448] ll header: 00000000: 00 22 48 7a 2f 9c 12 34 56 78 9a bc 08 00 [ 106.339018] EXT4-fs (sda1): resizing filesystem from 7836155 to 33526011 blocks [ 111.610942] EXT4-fs (sda1): resized filesystem to 33526011 [ 111.906443] sdb: sdb1 [ 113.006932] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null) [ 114.321971] bpfilter: Loaded bpfilter_umh pid 2703 [ 114.322204] Started bpfilter [ 114.330695] new mount options do not match the existing superblock, will be ignored [ 114.596669] nvidia-uvm: Loaded the UVM driver, major device number 236. 
[ 114.813570] nvidia-nvswitch0: open (major=237) [ 114.814395] nvidia-nvswitch1: open (major=237) [ 114.815041] nvidia-nvswitch2: open (major=237) [ 114.815678] nvidia-nvswitch3: open (major=237) [ 114.816312] nvidia-nvswitch4: open (major=237) [ 114.816987] nvidia-nvswitch5: open (major=237) [ 118.497718] aufs 5.4.3-20200302 [ 127.337669] nvidia-nvlink: nvlink driver open [ 127.337673] nvidia-nvlink: nvlink driver close [ 127.337674] nvidia-nvlink: nvlink driver open [ 134.170252] hv_balloon: Max. dynamic memory size: 921600 MB [ 137.522299] nvidia-nvswitch0: open (major=237) [ 137.522307] nvidia-nvswitch0: open (major=237) [ 137.522310] nvidia-nvswitch1: open (major=237) [ 137.553546] nvidia-nvswitch1: open (major=237) [ 137.553551] nvidia-nvswitch2: open (major=237) [ 137.553554] nvidia-nvswitch2: open (major=237) [ 137.553557] nvidia-nvswitch3: open (major=237) [ 137.553559] nvidia-nvswitch3: open (major=237) [ 137.553562] nvidia-nvswitch4: open (major=237) [ 137.553565] nvidia-nvswitch4: open (major=237) [ 137.553567] nvidia-nvswitch5: open (major=237) [ 137.553570] nvidia-nvswitch5: open (major=237) [ 178.856173] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this. [ 178.858986] Bridge firewalling registered [ 180.034442] systemd[1]: Stopping Journal Service... [ 180.034497] systemd-journald[1391]: Received SIGTERM from PID 1 (systemd). [ 180.043291] systemd[1]: Stopped Journal Service. [ 180.046229] systemd[1]: Starting Journal Service... [ 180.063114] systemd[1]: Started Journal Service. 
[ 199.922850] cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation
[ 200.147592] IPv6: ADDRCONF(NETDEV_CHANGE): azved375880b3e2: link becomes ready
[ 200.147876] IPv6: ADDRCONF(NETDEV_CHANGE): azved375880b3e: link becomes ready
[ 200.148623] eth0: renamed from azved375880b3e2
[ 200.513690] kauditd_printk_skb: 4 callbacks suppressed
[ 200.513691] audit: type=1400 audit(1649177776.139:16): apparmor="STATUS" operation="profile_load" profile="unconfined" name="cri-containerd.apparmor.d" pid=6852 comm="apparmor_parser"
[ 200.514225] audit: type=1400 audit(1649177776.143:17): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="cri-containerd.apparmor.d" pid=6853 comm="apparmor_parser"
[ 207.658221] IPv6: ADDRCONF(NETDEV_CHANGE): azv509f35fb0bc2: link becomes ready
[ 207.658285] IPv6: ADDRCONF(NETDEV_CHANGE): azv509f35fb0bc: link becomes ready
[ 207.659187] eth0: renamed from azv509f35fb0bc2
[ 207.946216] IPv6: ADDRCONF(NETDEV_CHANGE): azv42a008a89bc: link becomes ready
[ 207.947113] eth0: renamed from azv42a008a89bc2
[ 208.078715] IPv6: ADDRCONF(NETDEV_CHANGE): azva99aa9e7048: link becomes ready
[ 208.079546] eth0: renamed from azva99aa9e70482
[ 280.419599] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 470.57.02 Tue Jul 13 16:06:24 UTC 2021
[34749.365356] perf: interrupt took too long (2608 > 2500), lowering kernel.perf_event_max_sample_rate to 76500
[36507.207709] python3 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[36507.207712] CPU: 0 PID: 97420 Comm: python3 Tainted: P OE 5.4.0-1072-azure #75~18.04.1-Ubuntu
[36507.207713] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 10/27/2020
[36507.207714] Call Trace:
[36507.207720] dump_stack+0x57/0x6d
[36507.207724] dump_header+0x4f/0x200
[36507.207725] oom_kill_process+0xe6/0x120
[36507.207727] out_of_memory+0x117/0x540
[36507.207729] mem_cgroup_out_of_memory+0xbb/0xd0 [36507.207731] try_charge+0x762/0x7c0 [36507.207733] ? __alloc_pages_nodemask+0x153/0x320 [36507.207734] mem_cgroup_try_charge+0x75/0x190 [36507.207735] mem_cgroup_try_charge_delay+0x22/0x50 [36507.207738] __handle_mm_fault+0x943/0x1330 [36507.207739] handle_mm_fault+0xb7/0x200 [36507.207742] __do_page_fault+0x29c/0x4c0 [36507.207743] do_page_fault+0x35/0x110 [36507.207745] page_fault+0x39/0x40 [36507.207747] RIP: 0033:0x4b9692 [36507.207748] Code: 8d 50 ff 49 89 c8 4c 2b 05 9b 55 5b 00 48 8b 41 08 49 bd ab aa aa aa aa aa aa aa 49 c1 f8 04 4d 0f af c5 48 8d b8 00 10 00 00 40 24 ff ff 00 00 44 89 40 20 48 89 79 08 89 51 10 85 d2 0f 84 [36507.207749] RSP: 002b:00007ffebb0e0f50 EFLAGS: 00010a17 [36507.207751] RAX: 00007f6e42791000 RBX: 0000000000000002 RCX: 0000000002717610 [36507.207752] RDX: 0000000000000009 RSI: 0000000000000001 RDI: 00007f6e42792000 [36507.207752] RBP: 0000000000000142 R08: 0000000000000025 R09: 0000000000000008 [36507.207753] R10: 0000000000000001 R11: 0000000000000017 R12: 00007f6e42790fd0 [36507.207753] R13: aaaaaaaaaaaaaaab R14: 0000000000000004 R15: 00000000027a4110 [36507.207755] memory: usage 30720kB, limit 30720kB, failcnt 463 [36507.207755] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0 [36507.207756] kmem: usage 15696kB, limit 9007199254740988kB, failcnt 0 [36507.207756] Memory cgroup stats for /azure.slice/azure-walinuxagent.slice/azure-walinuxagent-logcollector.slice: [36507.207766] anon 15114240 [36507.207766] file 0 [36507.207766] kernel_stack 0 [36507.207766] slab 10932224 [36507.207766] sock 0 [36507.207766] shmem 0 [36507.207766] file_mapped 0 [36507.207766] file_dirty 270336 [36507.207766] file_writeback 0 [36507.207766] anon_thp 0 [36507.207766] inactive_anon 0 [36507.207766] active_anon 15679488 [36507.207766] inactive_file 0 [36507.207766] active_file 159744 [36507.207766] unevictable 0 [36507.207766] slab_reclaimable 991232 [36507.207766] slab_unreclaimable 
9940992 [36507.207766] pgfault 139623 [36507.207766] pgmajfault 0 [36507.207766] workingset_refault 132 [36507.207766] workingset_activate 0 [36507.207766] workingset_nodereclaim 0 [36507.207766] pgrefill 127 [36507.207766] pgscan 860 [36507.207766] pgsteal 835 [36507.207766] pgactivate 0 [36507.207766] pgdeactivate 127 [36507.207766] pglazyfree 0 [36507.207766] pglazyfreed 0 [36507.207766] thp_fault_alloc 0 [36507.207766] Tasks state (memory values in pages): [36507.207767] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name [36507.207768] [ 97420] 0 97420 20527 6056 208896 0 0 python3 [36507.207769] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0-3,oom_memcg=/azure.slice/azure-walinuxagent.slice/azure-walinuxagent-logcollector.slice,task_memcg=/azure.slice/azure-walinuxagent.slice/azure-walinuxagent-logcollector.slice,task=python3,pid=97420,uid=0 [36507.207776] Memory cgroup out of memory: Killed process 97420 (python3) total-vm:82108kB, anon-rss:14956kB, file-rss:9268kB, shmem-rss:0kB, UID:0 pgtables:204kB oom_score_adj:0 [36507.216397] oom_reaper: reaped process 97420 (python3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB [44517.301404] tritonserver invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=999 [44517.301407] CPU: 4 PID: 51660 Comm: tritonserver Tainted: P OE 5.4.0-1072-azure #75~18.04.1-Ubuntu [44517.301408] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 10/27/2020 [44517.301409] Call Trace: [44517.301415] dump_stack+0x57/0x6d [44517.301420] dump_header+0x4f/0x200 [44517.301422] oom_kill_process+0xe6/0x120 [44517.301423] out_of_memory+0x117/0x540 [44517.301426] mem_cgroup_out_of_memory+0xbb/0xd0 [44517.301427] try_charge+0x762/0x7c0 [44517.301431] ? blk_flush_plug_list+0xd1/0x100 [44517.301433] mem_cgroup_try_charge+0x75/0x190 [44517.301435] __add_to_page_cache_locked+0x21a/0x3d0 [44517.301437] ? 
scan_shadow_nodes+0x30/0x30 [44517.301438] add_to_page_cache_lru+0x4f/0xd0 [44517.301440] pagecache_get_page+0xea/0x2c0 [44517.301441] filemap_fault+0x669/0xb60 [44517.301442] ? unlock_page_memcg+0x12/0x20 [44517.301444] ? page_add_file_rmap+0x13a/0x180 [44517.301446] ? xas_load+0xc/0x80 [44517.301447] ? xas_find+0x16f/0x1b0 [44517.301448] ? filemap_map_pages+0x17d/0x3b0 [44517.301451] ext4_filemap_fault+0x31/0x50 [44517.301453] __do_fault+0x57/0x110 [44517.301455] __handle_mm_fault+0xdf1/0x1330 [44517.301457] handle_mm_fault+0xb7/0x200 [44517.301460] __do_page_fault+0x29c/0x4c0 [44517.301461] do_page_fault+0x35/0x110 [44517.301463] page_fault+0x39/0x40 [44517.301465] RIP: 0033:0x7f11b3d37c88 [44517.301469] Code: Bad RIP value. [44517.301470] RSP: 002b:00007ffcaf6fae90 EFLAGS: 00010216 [44517.301471] RAX: 0000000000000000 RBX: 00005560fd5324f0 RCX: 00005560fd531510 [44517.301472] RDX: 0000000000000018 RSI: 0000000000000084 RDI: 00005560fd5324f0 [44517.301473] RBP: 00005560fd5325f8 R08: 00005560fd531520 R09: 00005560fd2d12c0 [44517.301473] R10: ffffffffffffffff R11: 00007f11ea0b4be0 R12: 0000000000000008 [44517.301474] R13: 0000000000200000 R14: 00005560fd2db440 R15: 0000000000000000 [44517.301476] memory: usage 131072kB, limit 131072kB, failcnt 15691 [44517.301477] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0 [44517.301477] kmem: usage 19932kB, limit 9007199254740988kB, failcnt 0 [44517.301478] Memory cgroup stats for /kubepods/burstable/pod45dbec92-0481-4b70-b999-158b7cf2f909: [44517.301490] anon 110723072 [44517.301490] file 4108288 [44517.301490] kernel_stack 737280 [44517.301490] slab 11878400 [44517.301490] sock 0 [44517.301490] shmem 4190208 [44517.301490] file_mapped 4190208 [44517.301490] file_dirty 0 [44517.301490] file_writeback 0 [44517.301490] anon_thp 18874368 [44517.301490] inactive_anon 4190208 [44517.301490] active_anon 111054848 [44517.301490] inactive_file 0 [44517.301490] active_file 172032 [44517.301490] unevictable 0 
[44517.301490] slab_reclaimable 2527232 [44517.301490] slab_unreclaimable 9351168 [44517.301490] pgfault 487938 [44517.301490] pgmajfault 429 [44517.301490] workingset_refault 14388 [44517.301490] workingset_activate 198 [44517.301490] workingset_nodereclaim 0 [44517.301490] pgrefill 14559 [44517.301490] pgscan 103484 [44517.301490] pgsteal 15359 [44517.301490] pgactivate 14190 [44517.301490] pgdeactivate 14376 [44517.301490] pglazyfree 0 [44517.301491] Tasks state (memory values in pages): [44517.301492] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name [44517.301494] [ 8132] 65535 8132 241 1 28672 0 -998 pause [44517.301496] [ 11938] 0 11938 994 756 45056 0 999 bash [44517.301497] [ 50276] 0 50276 1060 892 53248 0 999 bash [44517.301499] [ 51500] 0 51500 627 148 40960 0 999 sleep [44517.301500] [ 51653] 0 51653 25429 2395 77824 0 999 mpirun [44517.301501] [ 51660] 0 51660 9879422 107763 1716224 0 999 tritonserver [44517.301502] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=92205d550d71282b3f6d25fd1844c3d252b0ceee832ed33832d1e8a4ad0c8a76,mems_allowed=0-3,oom_memcg=/kubepods/burstable/pod45dbec92-0481-4b70-b999-158b7cf2f909,task_memcg=/kubepods/burstable/pod45dbec92-0481-4b70-b999-158b7cf2f909/92205d550d71282b3f6d25fd1844c3d252b0ceee832ed33832d1e8a4ad0c8a76,task=tritonserver,pid=51660,uid=0 [44517.301562] Memory cgroup out of memory: Killed process 51660 (tritonserver) total-vm:39517688kB, anon-rss:102208kB, file-rss:324748kB, shmem-rss:4096kB, UID:0 pgtables:1676kB oom_score_adj:999 [44517.317788] oom_reaper: reaped process 51660 (tritonserver), now anon-rss:0kB, file-rss:71760kB, shmem-rss:4096kB ```

tanmayv25 commented 2 years ago

It looks like your system is running out of memory.

[45019.521686] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[45019.521688] [   7857] 65535  7857      241        1    28672        0          -998 pause
[45019.521690] [   9563]     0  9563      974      720    49152        0           999 bash
[45019.521691] [  60753]     0 60753     1060      887    45056        0           999 bash
[45019.521692] [  60919]     0 60919      630      147    40960        0           999 sleep
[45019.521694] [  60992]     0 60992    25458     2426    77824        0           999 mpirun
[45019.521695] [  60997]     0 60997  9695196    83217  1458176        0           999 tritonserver
[45019.521695] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=a82e417e149bceb2f24a3c73f90fd97b033185a234846eff07c07c4e32ca59a0,mems_allowed=0-3,oom_memcg=/kubepods/burstable/podcb040dc7-7546-49e2-ae77-320dc45ed2ce,task_memcg=/kubepods/burstable/podcb040dc7-7546-49e2-ae77-320dc45ed2ce/a82e417e149bceb2f24a3c73f90fd97b033185a234846eff07c07c4e32ca59a0,task=tritonserver,pid=60997,uid=0
[45019.521714] Memory cgroup out of memory: Killed process 60997 (tritonserver) total-vm:38780784kB, anon-rss:96564kB, file-rss:236304kB, shmem-rss:0kB, UID:0 pgtables:1424kB oom_score_adj:999
[45019.542213] oom_reaper: reaped process 60997 (tritonserver), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Try specifying a lower --pinned-memory-pool-byte-size. With an empty model repository, no models are actually being loaded, so I wonder where the extra memory is being spent. @GuanLuo, do you have any idea?
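The "Tasks state" table reports memory in pages, which makes it hard to compare with the kB figures elsewhere in the log. The following is a minimal sketch that converts the rss column to MiB, assuming 4 KiB pages (the usual x86-64 default, not confirmed for this node). The function name and the regular expression are illustrative only; the two rows are copied from the excerpt above.

```python
import re

PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages (x86-64 default)

# Two rows copied from the "Tasks state" excerpt above.
TASK_ROWS = """\
[45019.521694] [  60992]     0 60992    25458     2426    77824        0           999 mpirun
[45019.521695] [  60997]     0 60997  9695196    83217  1458176        0           999 tritonserver
"""

# [pid] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<pgtables>\d+)\s+(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+(?P<name>\S+)"
)


def rss_mib(table):
    """Map each process name in the table to its resident set size in MiB."""
    result = {}
    for line in table.splitlines():
        match = ROW.search(line)
        if match:
            result[match["name"]] = int(match["rss"]) * PAGE_SIZE_KIB / 1024
    return result


for name, mib in rss_mib(TASK_ROWS).items():
    print(f"{name}: {mib:.1f} MiB resident")
```

Under that page-size assumption, the tritonserver row works out to 332868 kB, which equals the sum of anon-rss (96564 kB) and file-rss (236304 kB) on the "Killed process" line, so the two views of the log agree.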

GuanLuo commented 2 years ago

I don't know why it would cause an OOM even when no model is loaded. Would it also be worth trying to remove the backend shared libraries, so that Triton starts without loading any framework libraries?

shimoshida commented 2 years ago

@tanmayv25 Is this command correct? The result is the same.

root@sample-triton-only-747d5f564b-vqk6s:/opt/tritonserver# mpirun -n 1 --allow-run-as-root tritonserver --model-repository=/workspace --pinned-memory-pool-byte-size=20000000000
I0411 03:03:57.509330 46959 metrics.cc:290] Collecting metrics for GPU 0: NVIDIA A100-SXM4-40GB
I0411 03:03:57.509551 46959 metrics.cc:290] Collecting metrics for GPU 1: NVIDIA A100-SXM4-40GB
I0411 03:03:57.509568 46959 metrics.cc:290] Collecting metrics for GPU 2: NVIDIA A100-SXM4-40GB
I0411 03:03:57.509576 46959 metrics.cc:290] Collecting metrics for GPU 3: NVIDIA A100-SXM4-40GB
I0411 03:03:57.509585 46959 metrics.cc:290] Collecting metrics for GPU 4: NVIDIA A100-SXM4-40GB
I0411 03:03:57.509593 46959 metrics.cc:290] Collecting metrics for GPU 5: NVIDIA A100-SXM4-40GB
I0411 03:03:57.509602 46959 metrics.cc:290] Collecting metrics for GPU 6: NVIDIA A100-SXM4-40GB
I0411 03:03:57.509610 46959 metrics.cc:290] Collecting metrics for GPU 7: NVIDIA A100-SXM4-40GB
I0411 03:03:57.993612 46959 libtorch.cc:998] TRITONBACKEND_Initialize: pytorch
I0411 03:03:57.993646 46959 libtorch.cc:1008] Triton TRITONBACKEND API version: 1.4
I0411 03:03:57.993650 46959 libtorch.cc:1014] 'pytorch' TRITONBACKEND API version: 1.4
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node sample-triton-only-747d5f564b-vqk6s exited on signal 9 (Killed).
--------------------------------------------------------------------------

dmesg

[468744.708634] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=92205d550d71282b3f6d25fd1844c3d252b0ceee832ed33832d1e8a4ad0c8a76,mems_allowed=0-3,oom_memcg=/kubepods/burstable/pod45dbec92-0481-4b70-b999-158b7cf2f909,task_memcg=/kubepods/burstable/pod45dbec92-0481-4b70-b999-158b7cf2f909/92205d550d71282b3f6d25fd1844c3d252b0ceee832ed33832d1e8a4ad0c8a76,task=tritonserver,pid=59345,uid=0
[468744.708698] Memory cgroup out of memory: Killed process 59345 (tritonserver) total-vm:39297024kB, anon-rss:85992kB, file-rss:227752kB, shmem-rss:0kB, UID:0 pgtables:1364kB oom_score_adj:999
[468744.724455] oom_reaper: reaped process 59345 (tritonserver), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

"Would it also be worth a try to remove the backend shared libraries so Triton will start without loading any framework libraries?"

I'm sorry, I don't understand this. What should I do to investigate?

GuanLuo commented 2 years ago

Triton attempts to load some backend (framework) libraries when it starts (e.g. I0404 11:55:52.570841 69 tensorflow.cc:2169] TRITONBACKEND_Initialize: tensorflow). If you remove all the backends shipped in /opt/tritonserver/backends/ (rm -r /opt/tritonserver/backends/*), Triton will have to start without any backends:

I0408 23:50:11.814732 817 server.cc:576] 
+---------+------+--------+
| Backend | Path | Config |
+---------+------+--------+
+---------+------+--------+

We can narrow down the scope depending on whether Triton starts successfully.

shimoshida commented 2 years ago

@GuanLuo Thank you for the help! I tried deleting all the backend libraries, but the OOM still occurs (Triton version is 21.07).

root@sample-triton-only-747d5f564b-vqk6s:/opt/tritonserver# rm -r /opt/tritonserver/backends/*
root@sample-triton-only-747d5f564b-vqk6s:/opt/tritonserver# ls /opt/tritonserver/backends/
root@sample-triton-only-747d5f564b-vqk6s:/opt/tritonserver# mpirun -n 1 --allow-run-as-root tritonserver --model-repository=/
I0412 08:09:13.073400 57484 metrics.cc:290] Collecting metrics for GPU 0: NVIDIA A100-SXM4-40GB
I0412 08:09:13.073633 57484 metrics.cc:290] Collecting metrics for GPU 1: NVIDIA A100-SXM4-40GB
I0412 08:09:13.073650 57484 metrics.cc:290] Collecting metrics for GPU 2: NVIDIA A100-SXM4-40GB
I0412 08:09:13.073658 57484 metrics.cc:290] Collecting metrics for GPU 3: NVIDIA A100-SXM4-40GB
I0412 08:09:13.073666 57484 metrics.cc:290] Collecting metrics for GPU 4: NVIDIA A100-SXM4-40GB
I0412 08:09:13.073674 57484 metrics.cc:290] Collecting metrics for GPU 5: NVIDIA A100-SXM4-40GB
I0412 08:09:13.073687 57484 metrics.cc:290] Collecting metrics for GPU 6: NVIDIA A100-SXM4-40GB
I0412 08:09:13.073695 57484 metrics.cc:290] Collecting metrics for GPU 7: NVIDIA A100-SXM4-40GB
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node sample-triton-only-747d5f564b-vqk6s exited on signal 9 (Killed).
--------------------------------------------------------------------------
root@sample-triton-only-747d5f564b-vqk6s:/opt/tritonserver# tritonserver --model-repository=/
I0412 08:09:30.787453 57495 metrics.cc:290] Collecting metrics for GPU 0: NVIDIA A100-SXM4-40GB
I0412 08:09:30.787684 57495 metrics.cc:290] Collecting metrics for GPU 1: NVIDIA A100-SXM4-40GB
I0412 08:09:30.787702 57495 metrics.cc:290] Collecting metrics for GPU 2: NVIDIA A100-SXM4-40GB
I0412 08:09:30.787715 57495 metrics.cc:290] Collecting metrics for GPU 3: NVIDIA A100-SXM4-40GB
I0412 08:09:30.787729 57495 metrics.cc:290] Collecting metrics for GPU 4: NVIDIA A100-SXM4-40GB
I0412 08:09:30.787742 57495 metrics.cc:290] Collecting metrics for GPU 5: NVIDIA A100-SXM4-40GB
I0412 08:09:30.787757 57495 metrics.cc:290] Collecting metrics for GPU 6: NVIDIA A100-SXM4-40GB
I0412 08:09:30.787771 57495 metrics.cc:290] Collecting metrics for GPU 7: NVIDIA A100-SXM4-40GB
Killed

dmesg

[573399.692350] Tasks state (memory values in pages):
[573399.692350] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[573399.692353] [   8132] 65535  8132      241        1    28672        0          -998 pause
[573399.692354] [  11938]     0 11938      994      756    45056        0           999 bash
[573399.692356] [  55564]     0 55564     1060      927    45056        0           999 bash
[573399.692357] [  56603]     0 56603      627      130    45056        0           999 sleep
[573399.692359] [  56667]     0 56667  8896554    61104   897024        0           999 tritonserver
[573399.692360] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=92205d550d71282b3f6d25fd1844c3d252b0ceee832ed33832d1e8a4ad0c8a76,mems_allowed=0-3,oom_memcg=/kubepods/burstable/pod45dbec92-0481-4b70-b999-158b7cf2f909,task_memcg=/kubepods/burstable/pod45dbec92-0481-4b70-b999-158b7cf2f909/92205d550d71282b3f6d25fd1844c3d252b0ceee832ed33832d1e8a4ad0c8a76,task=tritonserver,pid=56667,uid=0
[573399.692432] Memory cgroup out of memory: Killed process 56667 (tritonserver) total-vm:35586216kB, anon-rss:77612kB, file-rss:158612kB, shmem-rss:8192kB, UID:0 pgtables:876kB oom_score_adj:999
[573399.704863] oom_reaper: reaped process 56667 (tritonserver), now anon-rss:0kB, file-rss:77084kB, shmem-rss:8192kB

shimoshida commented 2 years ago

I get the same result with 22.03.

root@sample-triton-latest-74fccc696d-58tgs:/opt/tritonserver# rm -r /opt/tritonserver/backends/*
root@sample-triton-latest-74fccc696d-58tgs:/opt/tritonserver# ls /opt/tritonserver/backends/
root@sample-triton-latest-74fccc696d-58tgs:/opt/tritonserver# mpirun -n 1 --allow-run-as-root tritonserver --model-repository=/
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node sample-triton-latest-74fccc696d-58tgs exited on signal 9 (Killed).
--------------------------------------------------------------------------
root@sample-triton-latest-74fccc696d-58tgs:/opt/tritonserver# tritonserver --model-repository=/
Killed

dmesg

[573621.401513] Tasks state (memory values in pages):
[573621.401514] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[573621.401521] [   7857] 65535  7857      241        1    28672        0          -998 pause
[573621.401527] [   9563]     0  9563      974      720    49152        0           999 bash
[573621.401534] [  60239]     0 60239     1060      887    49152        0           999 bash
[573621.401537] [  60753]     0 60753      630      130    45056        0           999 sleep
[573621.401539] [  60859]     0 60859  8735457    49696   569344        0           999 tritonserver
[573621.401548] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=a82e417e149bceb2f24a3c73f90fd97b033185a234846eff07c07c4e32ca59a0,mems_allowed=0-3,oom_memcg=/kubepods/burstable/podcb040dc7-7546-49e2-ae77-320dc45ed2ce,task_memcg=/kubepods/burstable/podcb040dc7-7546-49e2-ae77-320dc45ed2ce/a82e417e149bceb2f24a3c73f90fd97b033185a234846eff07c07c4e32ca59a0,task=tritonserver,pid=60859,uid=0
[573621.401589] Memory cgroup out of memory: Killed process 60859 (tritonserver) total-vm:34941828kB, anon-rss:11588kB, file-rss:104700kB, shmem-rss:82496kB, UID:0 pgtables:556kB oom_score_adj:999
[573621.411454] oom_reaper: reaped process 60859 (tritonserver), now anon-rss:0kB, file-rss:77084kB, shmem-rss:82496kB
[573621.453318] Cannot map memory with base addr 0x7f13be000000 and size of 0x10000 pages
[573621.453339] NVRM: failed to copy out ioctl data

tanmayv25 commented 2 years ago

Can you try these experiments and share your findings?

tritonserver --model-repository=/workspace --pinned-memory-pool-byte-size=N

Does it work for the following runs? Run 1: N = 0, Run 2: N = 1000, Run 3: N = 1000000.

What is the total memory available in the system?
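The three runs above can be scripted so that each result is recorded the same way. This is only a sketch: the helper names and the timeout are hypothetical, and it assumes tritonserver is on the PATH inside the container. It relies on the fact that Python's subprocess module reports a child killed by signal N as return code -N, so an OOM kill (SIGKILL) shows up as -9.

```python
import subprocess


def build_command(pinned_bytes, model_repository="/workspace"):
    """Build the tritonserver command line for one experiment."""
    return [
        "tritonserver",
        f"--model-repository={model_repository}",
        f"--pinned-memory-pool-byte-size={pinned_bytes}",
    ]


def describe_exit(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode < 0:
        # Negative values mean the child was terminated by that signal.
        return f"killed by signal {-returncode}"
    return f"exited with code {returncode}"


def run_experiments(sizes=(0, 1000, 1000000), timeout_s=60):
    """Run tritonserver once per pool size and collect the outcomes."""
    results = {}
    for size in sizes:
        try:
            proc = subprocess.run(build_command(size), timeout=timeout_s)
            results[size] = describe_exit(proc.returncode)
        except subprocess.TimeoutExpired:
            # The server stayed up for the whole window, i.e. it was not killed.
            results[size] = f"still running after {timeout_s}s"
    return results
```

Calling run_experiments() inside the pod returns a dictionary keyed by N; a value of "killed by signal 9" corresponds to the "Killed" message printed by the shell.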

shimoshida commented 2 years ago

@tanmayv25

settings:

Triton version: 21.07

Machine spec:

memory: 900 GiB https://docs.microsoft.com/en-us/azure/virtual-machines/nda100-v4-series

top - 04:41:13 up 12 min,  0 users,  load average: 0.20, 0.81, 0.69
Tasks:   4 total,   1 running,   3 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 99.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 907082.2 total, 855742.9 free,   5092.4 used,  46246.9 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used. 897293.8 avail Mem
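Note that top inside a container typically reports the node's memory rather than the memory limit of the pod's cgroup, and the dmesg output names a cgroup constraint (CONSTRAINT_MEMCG). A minimal sketch for reading the cgroup limit from inside the container follows. The two paths are the standard cgroup v1 and v2 locations; which one exists in this image is an assumption, and the function name is hypothetical.

```python
from pathlib import Path

# Standard locations of the memory limit as seen from inside a container.
CGROUP_LIMIT_FILES = (
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),  # cgroup v1
    Path("/sys/fs/cgroup/memory.max"),                    # cgroup v2
)


def read_memory_limit(candidates=CGROUP_LIMIT_FILES):
    """Return the cgroup memory limit in bytes, or None if unlimited or unknown.

    On cgroup v1 an unlimited cgroup reports a very large integer instead of
    "max"; that value is returned unchanged.
    """
    for path in candidates:
        try:
            raw = path.read_text().strip()
        except OSError:
            continue  # file missing or unreadable, try the next candidate
        if raw == "max":
            return None
        return int(raw)
    return None


limit = read_memory_limit()
if limit is None:
    print("no cgroup memory limit found")
else:
    print(f"cgroup memory limit: {limit / 1024 / 1024:.0f} MiB")
```

The limit in the dmesg output is reported for the pod-level cgroup, so the container-level value read here may differ if the pod has several containers.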

Delete Libraries:

root@sample-triton-only-747d5f564b-b8vdl:/opt/tritonserver# rm -r /opt/tritonserver/backends/*
root@sample-triton-only-747d5f564b-b8vdl:/opt/tritonserver# ls /opt/tritonserver/backends/
root@sample-triton-only-747d5f564b-b8vdl:/opt/tritonserver# ls /workspace
ls: cannot access '/workspace': No such file or directory

Logs

N=0

root@sample-triton-only-747d5f564b-b8vdl:/opt/tritonserver# tritonserver --model-repository=/workspace --pinned-memory-pool-byte-size=0
I0414 04:38:54.287572 69 metrics.cc:290] Collecting metrics for GPU 0: NVIDIA A100-SXM4-40GB
I0414 04:38:54.287879 69 metrics.cc:290] Collecting metrics for GPU 1: NVIDIA A100-SXM4-40GB
I0414 04:38:54.287896 69 metrics.cc:290] Collecting metrics for GPU 2: NVIDIA A100-SXM4-40GB
I0414 04:38:54.287909 69 metrics.cc:290] Collecting metrics for GPU 3: NVIDIA A100-SXM4-40GB
I0414 04:38:54.287918 69 metrics.cc:290] Collecting metrics for GPU 4: NVIDIA A100-SXM4-40GB
I0414 04:38:54.287937 69 metrics.cc:290] Collecting metrics for GPU 5: NVIDIA A100-SXM4-40GB
I0414 04:38:54.287950 69 metrics.cc:290] Collecting metrics for GPU 6: NVIDIA A100-SXM4-40GB
I0414 04:38:54.287964 69 metrics.cc:290] Collecting metrics for GPU 7: NVIDIA A100-SXM4-40GB
I0414 04:38:56.477714 69 pinned_memory_manager.cc:244] Pinned memory pool disabled
I0414 04:38:56.495776 69 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0414 04:38:56.495790 69 cuda_memory_manager.cc:105] CUDA memory pool is created on device 1 with size 67108864
I0414 04:38:56.495794 69 cuda_memory_manager.cc:105] CUDA memory pool is created on device 2 with size 67108864
I0414 04:38:56.495797 69 cuda_memory_manager.cc:105] CUDA memory pool is created on device 3 with size 67108864
I0414 04:38:56.495809 69 cuda_memory_manager.cc:105] CUDA memory pool is created on device 4 with size 67108864
I0414 04:38:56.495817 69 cuda_memory_manager.cc:105] CUDA memory pool is created on device 5 with size 67108864
I0414 04:38:56.495822 69 cuda_memory_manager.cc:105] CUDA memory pool is created on device 6 with size 67108864
I0414 04:38:56.495827 69 cuda_memory_manager.cc:105] CUDA memory pool is created on device 7 with size 67108864
Killed

N=1000

root@sample-triton-only-747d5f564b-b8vdl:/opt/tritonserver# tritonserver --model-repository=/workspace --pinned-memory-pool-byte-size=1000
I0414 04:40:07.772539 87 metrics.cc:290] Collecting metrics for GPU 0: NVIDIA A100-SXM4-40GB
I0414 04:40:07.772764 87 metrics.cc:290] Collecting metrics for GPU 1: NVIDIA A100-SXM4-40GB
I0414 04:40:07.772782 87 metrics.cc:290] Collecting metrics for GPU 2: NVIDIA A100-SXM4-40GB
I0414 04:40:07.772800 87 metrics.cc:290] Collecting metrics for GPU 3: NVIDIA A100-SXM4-40GB
I0414 04:40:07.772812 87 metrics.cc:290] Collecting metrics for GPU 4: NVIDIA A100-SXM4-40GB
I0414 04:40:07.772824 87 metrics.cc:290] Collecting metrics for GPU 5: NVIDIA A100-SXM4-40GB
I0414 04:40:07.772837 87 metrics.cc:290] Collecting metrics for GPU 6: NVIDIA A100-SXM4-40GB
I0414 04:40:07.772846 87 metrics.cc:290] Collecting metrics for GPU 7: NVIDIA A100-SXM4-40GB
I0414 04:40:08.074938 87 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f642b200000' with size 1000
I0414 04:40:08.090917 87 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0414 04:40:08.090930 87 cuda_memory_manager.cc:105] CUDA memory pool is created on device 1 with size 67108864
I0414 04:40:08.090933 87 cuda_memory_manager.cc:105] CUDA memory pool is created on device 2 with size 67108864
I0414 04:40:08.090936 87 cuda_memory_manager.cc:105] CUDA memory pool is created on device 3 with size 67108864
I0414 04:40:08.090940 87 cuda_memory_manager.cc:105] CUDA memory pool is created on device 4 with size 67108864
I0414 04:40:08.090954 87 cuda_memory_manager.cc:105] CUDA memory pool is created on device 5 with size 67108864
I0414 04:40:08.090957 87 cuda_memory_manager.cc:105] CUDA memory pool is created on device 6 with size 67108864
I0414 04:40:08.090963 87 cuda_memory_manager.cc:105] CUDA memory pool is created on device 7 with size 67108864
Killed

N= 1000000

root@sample-triton-only-747d5f564b-b8vdl:/opt/tritonserver# tritonserver --model-repository=/workspace --pinned-memory-pool-byte-size=1000000
I0414 04:40:46.878367 101 metrics.cc:290] Collecting metrics for GPU 0: NVIDIA A100-SXM4-40GB
I0414 04:40:46.878591 101 metrics.cc:290] Collecting metrics for GPU 1: NVIDIA A100-SXM4-40GB
I0414 04:40:46.878608 101 metrics.cc:290] Collecting metrics for GPU 2: NVIDIA A100-SXM4-40GB
I0414 04:40:46.878622 101 metrics.cc:290] Collecting metrics for GPU 3: NVIDIA A100-SXM4-40GB
I0414 04:40:46.878635 101 metrics.cc:290] Collecting metrics for GPU 4: NVIDIA A100-SXM4-40GB
I0414 04:40:46.878651 101 metrics.cc:290] Collecting metrics for GPU 5: NVIDIA A100-SXM4-40GB
I0414 04:40:46.878665 101 metrics.cc:290] Collecting metrics for GPU 6: NVIDIA A100-SXM4-40GB
I0414 04:40:46.878678 101 metrics.cc:290] Collecting metrics for GPU 7: NVIDIA A100-SXM4-40GB
I0414 04:40:47.172029 101 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fdde9200000' with size 1000000
I0414 04:40:47.188096 101 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0414 04:40:47.188110 101 cuda_memory_manager.cc:105] CUDA memory pool is created on device 1 with size 67108864
I0414 04:40:47.188114 101 cuda_memory_manager.cc:105] CUDA memory pool is created on device 2 with size 67108864
I0414 04:40:47.188126 101 cuda_memory_manager.cc:105] CUDA memory pool is created on device 3 with size 67108864
I0414 04:40:47.188129 101 cuda_memory_manager.cc:105] CUDA memory pool is created on device 4 with size 67108864
I0414 04:40:47.188136 101 cuda_memory_manager.cc:105] CUDA memory pool is created on device 5 with size 67108864
I0414 04:40:47.188140 101 cuda_memory_manager.cc:105] CUDA memory pool is created on device 6 with size 67108864
I0414 04:40:47.188143 101 cuda_memory_manager.cc:105] CUDA memory pool is created on device 7 with size 67108864
Killed

tanmayv25 commented 2 years ago

I see. One more experiment. Can you try this command?

tritonserver --model-repository=/workspace --pinned-memory-pool-byte-size=0 --cuda-memory-pool-byte-size=0:0 --cuda-memory-pool-byte-size=1:0 --cuda-memory-pool-byte-size=2:0 --cuda-memory-pool-byte-size=3:0 --cuda-memory-pool-byte-size=4:0 --cuda-memory-pool-byte-size=5:0 --cuda-memory-pool-byte-size=6:0 --cuda-memory-pool-byte-size=7:0

Do you still see the failure? You can read more about --cuda-memory-pool-byte-size here: https://github.com/triton-inference-server/server/blob/main/src/main.cc#L555

A 64 MB pool should not be significant for 40 GB GPUs and a 900 GB machine, so most likely this is an issue with your environment. I am trying to narrow it down with these experiments.
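The command above repeats --cuda-memory-pool-byte-size once per GPU in device:bytes form. As a small sketch, the flags can be generated instead of typed by hand; the helper name is hypothetical, and the pool size constant is the value printed in the "CUDA memory pool is created" log lines earlier in this thread.

```python
def cuda_pool_flags(num_gpus, pool_bytes=0):
    """Return one --cuda-memory-pool-byte-size flag per GPU index."""
    return [
        f"--cuda-memory-pool-byte-size={gpu}:{pool_bytes}"
        for gpu in range(num_gpus)
    ]


# Pool size reported in the logs above, converted to MiB.
LOGGED_POOL_BYTES = 67108864
LOGGED_POOL_MIB = LOGGED_POOL_BYTES // (1024 * 1024)

command = [
    "tritonserver",
    "--model-repository=/workspace",
    "--pinned-memory-pool-byte-size=0",
] + cuda_pool_flags(8)

print(" ".join(command))
print(f"logged pool size per GPU: {LOGGED_POOL_MIB} MiB")
```

The logged pool size converts to 64 MiB per device, which is where the 64 MB figure comes from.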

shimoshida commented 2 years ago

@tanmayv25 Thank you for the explanation! The result is as follows:

root@sample-triton-only-747d5f564b-h2wvt:/opt/tritonserver# rm -r /opt/tritonserver/backends/*
root@sample-triton-only-747d5f564b-h2wvt:/opt/tritonserver# ls /opt/tritonserver/backends/
root@sample-triton-only-747d5f564b-h2wvt:/opt/tritonserver# ls /workspace
ls: cannot access '/workspace': No such file or directory
root@sample-triton-only-747d5f564b-h2wvt:/opt/tritonserver# tritonserver --model-repository=/workspace --pinned-memory-pool-byte-size=0 --cuda-memory-pool-byte-size=0:0 --cuda-memory-pool-byte-size=1:0 --cuda-memory-pool-byte-size=2:0 --cuda-memory-pool-byte-size=3:0 --cuda-memory-pool-byte-size=4:0 --cuda-memory-pool-byte-size=5:0 --cuda-memory-pool-byte-size=6:0 --cuda-memory-pool-byte-size=7:0
I0415 04:00:35.583063 106 metrics.cc:290] Collecting metrics for GPU 0: NVIDIA A100-SXM4-40GB
I0415 04:00:35.583346 106 metrics.cc:290] Collecting metrics for GPU 1: NVIDIA A100-SXM4-40GB
I0415 04:00:35.583363 106 metrics.cc:290] Collecting metrics for GPU 2: NVIDIA A100-SXM4-40GB
I0415 04:00:35.583376 106 metrics.cc:290] Collecting metrics for GPU 3: NVIDIA A100-SXM4-40GB
I0415 04:00:35.583385 106 metrics.cc:290] Collecting metrics for GPU 4: NVIDIA A100-SXM4-40GB
I0415 04:00:35.583404 106 metrics.cc:290] Collecting metrics for GPU 5: NVIDIA A100-SXM4-40GB
I0415 04:00:35.583417 106 metrics.cc:290] Collecting metrics for GPU 6: NVIDIA A100-SXM4-40GB
I0415 04:00:35.583431 106 metrics.cc:290] Collecting metrics for GPU 7: NVIDIA A100-SXM4-40GB
I0415 04:00:36.749200 106 pinned_memory_manager.cc:244] Pinned memory pool disabled
I0415 04:00:36.765320 106 cuda_memory_manager.cc:115] CUDA memory pool disabled
Killed

tanmayv25 commented 2 years ago

Marking this as a bug. We will investigate further why Triton is going OOM.

wenjianhn commented 1 year ago

[44517.301476] memory: usage 131072kB, limit 131072kB, failcnt 15691

Try adding more memory to your K8s Pod via deployment.yaml to avoid OOM.
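To make the numbers concrete, here is a rough sketch that parses the quoted dmesg line and builds a patch that raises the container's memory limit. The container name "sample" is taken from the deployment.yaml in the issue description, and the "8Gi" value is purely a placeholder and not a recommendation from this thread. The patch layout follows the Kubernetes Deployment spec (spec.template.spec.containers[].resources.limits.memory); the function names are hypothetical.

```python
import json
import re

MEMORY_LINE = "[44517.301476] memory: usage 131072kB, limit 131072kB, failcnt 15691"


def parse_memcg_line(line):
    """Extract usage and limit (in MiB) and failcnt from a memcg OOM line."""
    match = re.search(r"memory: usage (\d+)kB, limit (\d+)kB, failcnt (\d+)", line)
    if match is None:
        raise ValueError("not a memcg usage line")
    usage_kb, limit_kb, failcnt = map(int, match.groups())
    return {
        "usage_mib": usage_kb / 1024,
        "limit_mib": limit_kb / 1024,
        "failcnt": failcnt,
    }


def memory_patch(container_name, memory_limit):
    """Build a Deployment patch that sets one container's memory limit."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": container_name,
                            "resources": {"limits": {"memory": memory_limit}},
                        }
                    ]
                }
            }
        }
    }


print(parse_memcg_line(MEMORY_LINE))
# Placeholder values: adjust the container name and size for your deployment.
print(json.dumps(memory_patch("sample", "8Gi"), indent=2))
```

The parsed line shows the pod cgroup at its 128 MiB limit when tritonserver was killed. The printed JSON mirrors what a resources.limits.memory entry under the container in deployment.yaml would express.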

dyastremsky commented 1 year ago

Thanks for the suggestion, Jian!

Closing due to inactivity. If you would like this issue reopened for follow-up, please let us know.
