skrashevich / double-take

Unified UI and API for processing and training images for facial recognition.
https://hub.docker.com/r/skrashevich/double-take
MIT License
533 stars, 26 forks

[BUG] JavaScript heap out of memory #599

Closed YBonline closed 2 months ago

YBonline commented 3 months ago

Describe the bug

I'm new to Double Take. After I set it up and it had run for about a day, it silently stopped working. The logs show a JavaScript heap out of memory error.

Version of Double Take

v1.13.12.0rc3

Hardware

Configuration

Double Take

Learn more at https://github.com/jakowenko/double-take/#configuration

mqtt:
  host: 10.10.22.11
  username: doubletake
  password: doubletakePASS

frigate:
  url: http://10.10.22.11:5000
  update_sub_labels: true

detectors:
  aiserver:
    url: http://10.10.22.11:32168
    timeout: 15

Additional context

[19:57:28] info: ai.server found no face in image
[19:57:28] info: ai.server found no face in image
[19:57:28] warn: unexpected ai.server data {
  success: false,
  error: 'An Error occurred during processing',
  err_trace: 'Traceback (most recent call last):\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/face.py", line 591, in _recognise_face\n' +
    '    det = self.detector.predictFromImage(pil_image, threshold)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./process.py", line 83, in predictFromImage\n' +
    '    pred = self.model(img, augment=False)[0]\n' +
    '  File "/app/modules/FaceProcessing/bin/linux/python38/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl\n' +
    '    return forward_call(*input, **kwargs)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 136, in forward\n' +
    '    return self._forward_once(x, profile, visualize) # single-scale inference, train\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 159, in _forward_once\n' +
    '    x = m(x) # run\n' +
    '  File "/app/modules/FaceProcessing/bin/linux/python38/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl\n' +
    '    return forward_call(*input, **kwargs)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 68, in forward\n' +
    '    y[..., 0:2] = (y[..., 0:2] * 2 + self.grid[i]) * self.stride[i] # xy\n' +
    'RuntimeError: The size of tensor a (6) must match the size of tensor b (3) at non-singleton dimension 2\n',
  moduleId: 'FaceProcessing',
  moduleName: 'Face Processing',
  code: 500,
  command: 'recognize',
  requestId: '6e45d22a-ea86-4f97-8313-5876d251ff58',
  inferenceDevice: 'GPU',
  analysisRoundTripMs: 4746,
  processedBy: 'localhost',
  timestampUTC: 'Sun, 11 Aug 2024 19:57:28 GMT'
}
[19:57:28] info: ai.server found no face in image
[19:57:28] info: ai.server found no face in image
[19:57:28] info: ai.server found no face in image
[19:57:28] info: ai.server found no face in image
[19:57:28] info: ai.server found no face in image
[19:57:28] info: ai.server found no face in image
[19:57:28] info: ai.server found no face in image
[19:57:28] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] warn: unexpected ai.server data {
  success: false,
  error: 'An Error occurred during processing',
  err_trace: 'Traceback (most recent call last):\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/face.py", line 591, in _recognise_face\n' +
    '    det = self.detector.predictFromImage(pil_image, threshold)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./process.py", line 83, in predictFromImage\n' +
    '    pred = self.model(img, augment=False)[0]\n' +
    '  File "/app/modules/FaceProcessing/bin/linux/python38/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl\n' +
    '    return forward_call(*input, **kwargs)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 136, in forward\n' +
    '    return self._forward_once(x, profile, visualize) # single-scale inference, train\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 159, in _forward_once\n' +
    '    x = m(x) # run\n' +
    '  File "/app/modules/FaceProcessing/bin/linux/python38/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl\n' +
    '    return forward_call(*input, **kwargs)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 68, in forward\n' +
    '    y[..., 0:2] = (y[..., 0:2] * 2 + self.grid[i]) * self.stride[i] # xy\n' +
    'RuntimeError: The size of tensor a (32) must match the size of tensor b (24) at non-singleton dimension 2\n',
  moduleId: 'FaceProcessing',
  moduleName: 'Face Processing',
  code: 500,
  command: 'recognize',
  requestId: '2bc57990-d483-41c1-b5c4-08bc3d968252',
  inferenceDevice: 'GPU',
  analysisRoundTripMs: 4898,
  processedBy: 'localhost',
  timestampUTC: 'Sun, 11 Aug 2024 19:57:29 GMT'
}
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] warn: unexpected ai.server data {
  success: false,
  error: 'An Error occurred during processing',
  err_trace: 'Traceback (most recent call last):\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/face.py", line 591, in _recognise_face\n' +
    '    det = self.detector.predictFromImage(pil_image, threshold)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./process.py", line 83, in predictFromImage\n' +
    '    pred = self.model(img, augment=False)[0]\n' +
    '  File "/app/modules/FaceProcessing/bin/linux/python38/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl\n' +
    '    return forward_call(*input, **kwargs)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 136, in forward\n' +
    '    return self._forward_once(x, profile, visualize) # single-scale inference, train\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 159, in _forward_once\n' +
    '    x = m(x) # run\n' +
    '  File "/app/modules/FaceProcessing/bin/linux/python38/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl\n' +
    '    return forward_call(*input, **kwargs)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 68, in forward\n' +
    '    y[..., 0:2] = (y[..., 0:2] * 2 + self.grid[i]) * self.stride[i] # xy\n' +
    'RuntimeError: The size of tensor a (6) must match the size of tensor b (3) at non-singleton dimension 2\n',
  moduleId: 'FaceProcessing',
  moduleName: 'Face Processing',
  code: 500,
  command: 'recognize',
  requestId: '4f672769-ce8d-46fe-bd7f-2b5ba790e8ff',
  inferenceDevice: 'GPU',
  analysisRoundTripMs: 4863,
  processedBy: 'localhost',
  timestampUTC: 'Sun, 11 Aug 2024 19:57:29 GMT'
}
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:29] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] warn: unexpected ai.server data {
  success: false,
  error: 'An Error occurred during processing',
  err_trace: 'Traceback (most recent call last):\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/face.py", line 591, in _recognise_face\n' +
    '    det = self.detector.predictFromImage(pil_image, threshold)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./process.py", line 83, in predictFromImage\n' +
    '    pred = self.model(img, augment=False)[0]\n' +
    '  File "/app/modules/FaceProcessing/bin/linux/python38/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl\n' +
    '    return forward_call(*input, **kwargs)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 136, in forward\n' +
    '    return self._forward_once(x, profile, visualize) # single-scale inference, train\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 159, in _forward_once\n' +
    '    x = m(x) # run\n' +
    '  File "/app/modules/FaceProcessing/bin/linux/python38/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl\n' +
    '    return forward_call(*input, **kwargs)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 68, in forward\n' +
    '    y[..., 0:2] = (y[..., 0:2] * 2 + self.grid[i]) * self.stride[i] # xy\n' +
    'RuntimeError: The size of tensor a (8) must match the size of tensor b (3) at non-singleton dimension 2\n',
  moduleId: 'FaceProcessing',
  moduleName: 'Face Processing',
  code: 500,
  command: 'recognize',
  requestId: '242f678b-238f-44e4-98b8-c8a0d14d64ab',
  inferenceDevice: 'GPU',
  analysisRoundTripMs: 5246,
  processedBy: 'localhost',
  timestampUTC: 'Sun, 11 Aug 2024 19:57:30 GMT'
}
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:30] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] warn: unexpected ai.server data {
  success: false,
  error: 'An Error occurred during processing',
  err_trace: 'Traceback (most recent call last):\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/face.py", line 591, in _recognise_face\n' +
    '    det = self.detector.predictFromImage(pil_image, threshold)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./process.py", line 83, in predictFromImage\n' +
    '    pred = self.model(img, augment=False)[0]\n' +
    '  File "/app/modules/FaceProcessing/bin/linux/python38/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl\n' +
    '    return forward_call(*input, **kwargs)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 136, in forward\n' +
    '    return self._forward_once(x, profile, visualize) # single-scale inference, train\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 159, in _forward_once\n' +
    '    x = m(x) # run\n' +
    '  File "/app/modules/FaceProcessing/bin/linux/python38/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl\n' +
    '    return forward_call(*input, **kwargs)\n' +
    '  File "/app/modules/FaceProcessing/intelligencelayer/./models/yolo.py", line 68, in forward\n' +
    '    y[..., 0:2] = (y[..., 0:2] * 2 + self.grid[i]) * self.stride[i] # xy\n' +
    'RuntimeError: The size of tensor a (16) must match the size of tensor b (12) at non-singleton dimension 2\n',
  moduleId: 'FaceProcessing',
  moduleName: 'Face Processing',
  code: 500,
  command: 'recognize',
  requestId: '2e67a940-542b-424f-ba9d-acac468e8796',
  inferenceDevice: 'GPU',
  analysisRoundTripMs: 4794,
  processedBy: 'localhost',
  timestampUTC: 'Sun, 11 Aug 2024 19:57:31 GMT'
}
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] info: ai.server found no face in image
[19:57:31] error: stream error: Request failed with status code 404
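
The excerpt above mixes routine `info` lines with five `warn` entries whose tracebacks all end in a tensor size mismatch. A rough sketch for tallying an excerpt like this by log level and pulling out the mismatched sizes is shown here. `summarizeLog` is a hypothetical helper, not part of Double Take, and the regular expressions assume the `[HH:MM:SS] level:` prefix seen in this log.

```javascript
// Hypothetical helper (not part of Double Take): count log entries per level
// and collect the (a, b) sizes from tensor size-mismatch errors.
function summarizeLog(text) {
  const levels = {};
  // Assumes every entry starts with "[HH:MM:SS] level:" as in the excerpt.
  for (const m of text.matchAll(/\[\d{2}:\d{2}:\d{2}\] (\w+):/g)) {
    levels[m[1]] = (levels[m[1]] || 0) + 1;
  }
  const mismatches = [];
  const re = /size of tensor a \((\d+)\) must match the size of tensor b \((\d+)\)/g;
  for (const m of text.matchAll(re)) {
    mismatches.push([Number(m[1]), Number(m[2])]);
  }
  return { levels, mismatches };
}

// Shortened sample built from lines in the log above.
const sample =
  '[19:57:28] info: ai.server found no face in image ' +
  "[19:57:28] warn: unexpected ai.server data { err_trace: 'RuntimeError: The size of tensor a (6) " +
  "must match the size of tensor b (3) at non-singleton dimension 2' } " +
  '[19:57:31] error: stream error: Request failed with status code 404';

console.log(summarizeLog(sample));
```

Run against the full log, this would make it easier to see how often the detector fails and with which sizes, which may help when reporting the traceback to the AI server side.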

<--- Last few GCs --->

[60:0x53e6850] 48822162 ms: Mark-sweep 4042.5 (4133.7) -> 4034.5 (4143.8) MB, 2760.0 / 0.0 ms (average mu = 0.223, current mu = 0.166) allocation failure; scavenge might not succeed
[60:0x53e6850] 48826987 ms: Mark-sweep 4041.0 (4144.3) -> 4038.3 (4144.8) MB, 4813.8 / 0.0 ms (average mu = 0.134, current mu = 0.002) allocation failure; scavenge might not succeed

<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
 1: 0xb9a330 node::Abort() [/usr/local/bin/node]
 2: 0xaa07ee [/usr/local/bin/node]
 3: 0xd71ed0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 4: 0xd72277 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 5: 0xf4f635 [/usr/local/bin/node]
 6: 0xf61b0d v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
 7: 0xf3c1fe v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
 8: 0xf3d5c7 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
 9: 0xf1db40 v8::internal::Factory::AllocateRaw(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [/usr/local/bin/node]
10: 0xf1510c v8::internal::FactoryBase::AllocateRawArray(int, v8::internal::AllocationType) [/usr/local/bin/node]
11: 0xf15285 v8::internal::FactoryBase::NewFixedArrayWithFiller(v8::internal::Handle, int, v8::internal::Handle, v8::internal::AllocationType) [/usr/local/bin/node]
12: 0x10c4fe2 [/usr/local/bin/node]
13: 0x10ce774 [/usr/local/bin/node]
14: 0x10fad08 [/usr/local/bin/node]
15: 0x11b3a8d v8::internal::JSArray::SetLength(v8::internal::Handle, unsigned int) [/usr/local/bin/node]
16: 0x10fc4c5 v8::internal::ArrayConstructInitializeElements(v8::internal::Handle, v8::internal::Arguments<(v8::internal::ArgumentsType)1>) [/usr/local/bin/node]
17: 0x12d3a5a v8::internal::Runtime_NewArray(int, unsigned long, v8::internal::Isolate*) [/usr/local/bin/node]
18: 0x1710739 [/usr/local/bin/node]
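
The GC lines above show the heap sitting at roughly 4,040 MB right before the crash, and the fatal error says the heap limit was reached, so the process seems to have hit V8's own heap ceiling rather than exhausting system RAM. A minimal sketch for checking that ceiling from inside a Node process, using the built-in `v8` module, follows. The `heapReport` helper name is hypothetical and not part of Double Take.

```javascript
const v8 = require('v8');

// Hypothetical helper (not part of Double Take): report V8 heap figures in MB.
function heapReport() {
  const stats = v8.getHeapStatistics();
  const toMB = (bytes) => Math.round(bytes / 1024 / 1024);
  return {
    limitMB: toMB(stats.heap_size_limit), // ceiling V8 will not grow past
    totalMB: toMB(stats.total_heap_size), // heap currently reserved
    usedMB: toMB(stats.used_heap_size),   // heap currently in use
  };
}

console.log(heapReport());
```

Node's `--max-old-space-size` flag (for example via `NODE_OPTIONS=--max-old-space-size=6144`) raises this limit. Whether that helps here is untested: if memory grows without bound, a higher limit would only postpone the crash.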

github-actions[bot] commented 2 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 2 months ago

This issue was closed because it has been stalled for 5 days with no activity.