umami-software / umami

Umami is a simple, fast, privacy-focused alternative to Google Analytics.
https://umami.is
MIT License
20.77k stars 3.93k forks source link

Umami frequently stops running #1561

Closed craigconstantine closed 1 year ago

craigconstantine commented 1 year ago

Hi all, hoping to get some hints about where I can start troubleshooting…

umami 1.38.0 and for a few weeks now, when I go to the web site (the nginx web server that rev proxies to Umami itself) I get a 500-series error. 'pm2' on the box reports it is running. A pm2 start umami --watch then brings it back to life.

craigconstantine commented 1 year ago

down again…

Craigs-iMac:~ craig$ curl -I https://__REDACTED__
HTTP/1.1 502 Bad Gateway
Server: nginx/1.18.0 (Ubuntu)
Date: Mon, 10 Oct 2022 18:40:06 GMT
Content-Type: text/html
Content-Length: 166
Connection: keep-alive

while pm2 shows…

Screen Shot 2022-10-10 at 14 41 06

some recent syslog entries (the last is about 20 minutes ago)…

Oct 10 15:26:23 ns postgres_ns[451392]: [7-1] 2022-10-10 15:26:23.215 UTC [451392] umami@umami ERROR:  duplicate key value violates unique constraint "session_session_uuid_key"
Oct 10 15:26:23 ns postgres_ns[451392]: [7-2] 2022-10-10 15:26:23.215 UTC [451392] umami@umami DETAIL:  Key (session_uuid)=(62d94d1d-a5ef-56a6-a40a-023498430e49) already exists.
Oct 10 15:26:23 ns postgres_ns[451392]: [7-3] 2022-10-10 15:26:23.215 UTC [451392] umami@umami STATEMENT:  INSERT INTO "public"."session" ("session_uuid","website_id","created_at","hostname","browser","os","device","screen","language","country") VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10) RETURNING "public"."session"."session_id" /* traceparent=00-00-00-00 */
Oct 10 16:41:52 ns postgres_ns[452312]: [7-1] 2022-10-10 16:41:52.009 UTC [452312] umami@umami ERROR:  duplicate key value violates unique constraint "session_session_uuid_key"
Oct 10 16:41:52 ns postgres_ns[452312]: [7-2] 2022-10-10 16:41:52.009 UTC [452312] umami@umami DETAIL:  Key (session_uuid)=(9ec1e580-fdd8-5b5a-996d-86e5080efa55) already exists.
Oct 10 16:41:52 ns postgres_ns[452312]: [7-3] 2022-10-10 16:41:52.009 UTC [452312] umami@umami STATEMENT:  INSERT INTO "public"."session" ("session_uuid","website_id","created_at","hostname","browser","os","device","screen","language","country") VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10) RETURNING "public"."session"."session_id" /* traceparent=00-00-00-00 */
Oct 10 16:49:40 ns postgres_ns[452311]: [7-1] 2022-10-10 16:49:40.604 UTC [452311] umami@umami ERROR:  duplicate key value violates unique constraint "session_session_uuid_key"
Oct 10 16:49:40 ns postgres_ns[452311]: [7-2] 2022-10-10 16:49:40.604 UTC [452311] umami@umami DETAIL:  Key (session_uuid)=(867c5904-75e7-52a5-9d3f-a7fd992802cc) already exists.
Oct 10 16:49:40 ns postgres_ns[452311]: [7-3] 2022-10-10 16:49:40.604 UTC [452311] umami@umami STATEMENT:  INSERT INTO "public"."session" ("session_uuid","website_id","created_at","hostname","browser","os","device","screen","language","country") VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10) RETURNING "public"."session"."session_id" /* traceparent=00-00-00-00 */
Oct 10 18:23:59 ns postgres_ns[455395]: [7-1] 2022-10-10 18:23:59.443 UTC [455395] umami@umami LOG:  could not receive data from client: Connection reset by peer
Oct 10 18:23:59 ns postgres_ns[455155]: [7-1] 2022-10-10 18:23:59.443 UTC [455155] umami@umami LOG:  could not receive data from client: Connection reset by peer
Oct 10 18:23:59 ns postgres_ns[455156]: [7-1] 2022-10-10 18:23:59.443 UTC [455156] umami@umami LOG:  could not receive data from client: Connection reset by peer
Oct 10 18:23:59 ns postgres_ns[455356]: [7-1] 2022-10-10 18:23:59.444 UTC [455356] umami@umami LOG:  could not receive data from client: Connection reset by peer
Oct 10 18:23:59 ns postgres_ns[455357]: [7-1] 2022-10-10 18:23:59.444 UTC [455357] umami@umami LOG:  could not receive data from client: Connection reset by peer
Oct 10 18:23:59 ns postgres_ns[455355]: [7-1] 2022-10-10 18:23:59.446 UTC [455355] umami@umami LOG:  could not receive data from client: Connection reset by peer
Oct 10 18:23:59 ns postgres_ns[452498]: [7-1] 2022-10-10 18:23:59.447 UTC [452498] umami@umami LOG:  could not receive data from client: Connection reset by peer
Oct 10 18:23:59 ns postgres_ns[453940]: [7-1] 2022-10-10 18:23:59.449 UTC [453940] umami@umami LOG:  could not receive data from client: Connection reset by peer
Oct 10 18:23:59 ns postgres_ns[455312]: [7-1] 2022-10-10 18:23:59.456 UTC [455312] umami@umami LOG:  could not receive data from client: Connection reset by peer

pm2 start umami --watch gets it going again.

Craigs-iMac:~ craig$ curl -I https://__REDACTED__
HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
Date: Mon, 10 Oct 2022 18:44:50 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 1871
Connection: keep-alive
X-DNS-Prefetch-Control: on
X-Frame-Options: SAMEORIGIN
Content-Security-Policy: default-src 'self'; img-src *; script-src 'self' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; connect-src 'self' api.umami.is; frame-ancestors 'self';
X-Powered-By: Next.js
ETag: "4dl7tejdty1fz"
Vary: Accept-Encoding

Troubleshooting thoughts?

Where does umami log?

craigconstantine commented 1 year ago

nginx reporting 502 again, pm2 is… confused?

Screen Shot 2022-10-11 at 08 12 46
12:14UTC:root@ns:~# pm2 start umami --watch
[PM2] Applying action restartProcessId on app [umami](ids: [ 0 ])
[PM2] [umami](0) ✓
[PM2] Process successfully started

And then I have a Node process for umami again…

12:14UTC:root@ns:~# ps auxw | grep umami
root      469572 12.0  1.5 11174568 125380 ?     Sl   12:14   0:00 /usr/bin/node /root/umami/node_modules/.bin/next start
root      469586  0.0  0.0   9076  2348 pts/0    R+   12:14   0:00 grep umami
briancao commented 1 year ago

What version of Umami are you running? It looks like the app is having issues with writing/reading sessions from your DB, to the extent it's timing out and crashing.

Try running your app in DEV mode, pointed to the DB and see what errors you get.

craigconstantine commented 1 year ago

1.38.0

Try running your app in DEV mode, pointed to the DB and see what errors you get.

How do I do that? (or can you point me to instrux for that . . .)

briancao commented 1 year ago

Instructions can be found here. https://umami.is/docs/install

You can follow them through "Start the application" where instead you'll be running

yarn dev

craigconstantine commented 1 year ago

I'm seeing these…

error - PrismaClientValidationError:
Invalid `prisma.event.create()` invocation:

{
  data: {
    website_id: 2,
    session_id: 1088780,
    url: '/live.phtml?show_id=640',
+   event_name: String,
?   event_id?: Int,
?   created_at?: DateTime | null,
?   event_data?: {
?     create?: event_dataCreateWithoutEventInput | event_dataUncheckedCreateWithoutEventInput,
?     connectOrCreate?: event_dataCreateOrConnectWithoutEventInput,
?     connect?: event_dataWhereUniqueInput
?   }
  }
}

Argument event_name for data.event_name is missing.

Note: Lines with + are required, lines with ? are optional.

    at Document.validate (/root/umami/node_modules/@prisma/client/runtime/index.js:28329:20)
    at serializationFn (/root/umami/node_modules/@prisma/client/runtime/index.js:30929:19)
    at runInChildSpan (/root/umami/node_modules/@prisma/client/runtime/index.js:24157:12)
    at PrismaClient._executeRequest (/root/umami/node_modules/@prisma/client/runtime/index.js:30936:31)
    at consumer (/root/umami/node_modules/@prisma/client/runtime/index.js:30862:23)
    at /root/umami/node_modules/@prisma/client/runtime/index.js:30867:51
    at AsyncResource.runInAsyncScope (async_hooks.js:197:9)
    at /root/umami/node_modules/@prisma/client/runtime/index.js:30867:29
    at runInChildSpan (/root/umami/node_modules/@prisma/client/runtime/index.js:24157:12)
    at PrismaClient._request (/root/umami/node_modules/@prisma/client/runtime/index.js:30864:22) {
  clientVersion: '4.3.1',
  page: '/api/collect'
}

Those look more like problems with how the Umami browser-side client is "hooked" into pages/application. Right? Those don't look like db errors…

craigconstantine commented 1 year ago

…but, at least with it running in dev like this, when it crashes, the last error on my terminal should be a great clue. :D

briancao commented 1 year ago

Something seems out of sync here, whether it's the umami script being loaded onto your site, or the way events are being collected, or the function isn't setup correctl --- but the current configuration doens't allow your app to talk to the database correctly.

What does your CSS look like for event-collection?

craigconstantine commented 1 year ago

CSS: I'm not sure (I'll look / ask / find out) I didn't do that part. It sure seems to be working — the "Realtime" in Umami is showing traffic/events like I'd expect…

Screen Shot 2022-10-11 at 14 17 03

Re crashes: It's not crashed yet running Dev. I've seen it run for many hours, even over a day, before crashing... so I may have to leave it in dev for a while. Before I have a crash-related error.

craigconstantine commented 1 year ago

Here's the end of the output when it crashed. It's not clear to me, but I think everything down through the page: '/api/collect' } is simply the most recent instance of that error… I was seeing that frequently for hours. So I think the <--- Last few GCs ---> begins the actual interesting output??

error - PrismaClientValidationError:
Invalid `prisma.event.create()` invocation:

{
  data: {
    website_id: 2,
    session_id: 1081146,
    url: '/live.phtml?show_id=640',
+   event_name: String,
?   event_id?: Int,
?   created_at?: DateTime | null,
?   event_data?: {
?     create?: event_dataCreateWithoutEventInput | event_dataUncheckedCreateWithoutEventInput,
?     connectOrCreate?: event_dataCreateOrConnectWithoutEventInput,
?     connect?: event_dataWhereUniqueInput
?   }
  }
}

Argument event_name for data.event_name is missing.

Note: Lines with + are required, lines with ? are optional.

    at Document.validate (/root/umami/node_modules/@prisma/client/runtime/index.js:28329:20)
    at serializationFn (/root/umami/node_modules/@prisma/client/runtime/index.js:30929:19)
    at runInChildSpan (/root/umami/node_modules/@prisma/client/runtime/index.js:24157:12)
    at PrismaClient._executeRequest (/root/umami/node_modules/@prisma/client/runtime/index.js:30936:31)
    at consumer (/root/umami/node_modules/@prisma/client/runtime/index.js:30862:23)
    at /root/umami/node_modules/@prisma/client/runtime/index.js:30867:51
    at AsyncResource.runInAsyncScope (async_hooks.js:197:9)
    at /root/umami/node_modules/@prisma/client/runtime/index.js:30867:29
    at runInChildSpan (/root/umami/node_modules/@prisma/client/runtime/index.js:24157:12)
    at PrismaClient._request (/root/umami/node_modules/@prisma/client/runtime/index.js:30864:22) {
  clientVersion: '4.3.1',
  page: '/api/collect'
}

<--- Last few GCs --->

[473857:0x49313a0] 17473295 ms: Mark-sweep 1950.5 (2083.1) -> 1936.8 (2083.3) MB, 1217.5 / 0.3 ms  (average mu = 0.577, current mu = 0.505) task scavenge might not succeed
[473857:0x49313a0] 17476426 ms: Mark-sweep 1950.8 (2083.3) -> 1936.8 (2083.3) MB, 1230.1 / 0.2 ms  (average mu = 0.592, current mu = 0.607) task scavenge might not succeed

<--- JS stacktrace --->

FATAL ERROR: MarkCompactCollector: young object promotion failed Allocation failed - JavaScript heap out of memory
 1: 0xa3ac10 node::Abort() [/usr/bin/node]
 2: 0x970199 node::FatalError(char const*, char const*) [/usr/bin/node]
 3: 0xbba58e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/bin/node]
 4: 0xbba907 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/bin/node]
 5: 0xd76b25  [/usr/bin/node]
 6: 0xda74be v8::internal::EvacuateNewSpaceVisitor::Visit(v8::internal::HeapObject, int) [/usr/bin/node]
 7: 0xdb34f6 v8::internal::FullEvacuator::RawEvacuatePage(v8::internal::MemoryChunk*, long*) [/usr/bin/node]
 8: 0xd9f68f v8::internal::Evacuator::EvacuatePage(v8::internal::MemoryChunk*) [/usr/bin/node]
 9: 0xd9f908 v8::internal::PageEvacuationTask::RunInParallel(v8::internal::ItemParallelJob::Task::Runner) [/usr/bin/node]
10: 0xd921e9 v8::internal::ItemParallelJob::Run() [/usr/bin/node]
11: 0xdb5450 void v8::internal::MarkCompactCollectorBase::CreateAndExecuteEvacuationTasks<v8::internal::FullEvacuator, v8::internal::MarkCompactCollector>(v8::internal::MarkCompactCollector*, v8::internal::ItemParallelJob*, v8::internal::MigrationObserver*, long) [/usr/bin/node]
12: 0xdb5cec v8::internal::MarkCompactCollector::EvacuatePagesInParallel() [/usr/bin/node]
13: 0xdb5eb5 v8::internal::MarkCompactCollector::Evacuate() [/usr/bin/node]
14: 0xdc7eb1 v8::internal::MarkCompactCollector::CollectGarbage() [/usr/bin/node]
15: 0xd84178 v8::internal::Heap::MarkCompact() [/usr/bin/node]
16: 0xd85c68 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/bin/node]
17: 0xd890ac v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/bin/node]
18: 0xd4e87d v8::internal::Factory::AllocateRaw(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [/usr/bin/node]
19: 0xd48704 v8::internal::FactoryBase<v8::internal::Factory>::AllocateRawWithImmortalMap(int, v8::internal::AllocationType, v8::internal::Map, v8::internal::AllocationAlignment) [/usr/bin/node]
20: 0xd491a3 v8::internal::FactoryBase<v8::internal::Factory>::NewByteArray(int, v8::internal::AllocationType) [/usr/bin/node]
21: 0xcd71ea v8::internal::TranslationBuffer::CreateByteArray(v8::internal::Factory*) [/usr/bin/node]
22: 0x1a5d067 v8::internal::compiler::CodeGenerator::GenerateDeoptimizationData() [/usr/bin/node]
23: 0x1a5d757 v8::internal::compiler::CodeGenerator::FinalizeCode() [/usr/bin/node]
24: 0x1aef0bb v8::internal::compiler::PipelineImpl::FinalizeCode(bool) [/usr/bin/node]
25: 0x1af0187 v8::internal::compiler::PipelineCompilationJob::FinalizeJobImpl(v8::internal::Isolate*) [/usr/bin/node]
26: 0xc84790 v8::internal::OptimizedCompilationJob::FinalizeJob(v8::internal::Isolate*) [/usr/bin/node]
27: 0xc8d0bb v8::internal::Compiler::FinalizeOptimizedCompilationJob(v8::internal::OptimizedCompilationJob*, v8::internal::Isolate*) [/usr/bin/node]
28: 0xcaec8d v8::internal::OptimizingCompileDispatcher::InstallOptimizedFunctions() [/usr/bin/node]
29: 0xd2dfff v8::internal::StackGuard::HandleInterrupts() [/usr/bin/node]
30: 0x109f7a3 v8::internal::Runtime_StackGuard(int, unsigned long*, v8::internal::Isolate*) [/usr/bin/node]
31: 0x1448f59  [/usr/bin/node]
Aborted
error Command failed with exit code 134.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
briancao commented 1 year ago

Very odd that garbage collection appears to be failing and causing an overflow.

I think if we solve the culprit as to why your events aren't collecting, we'll solve those side-effects. But we won't know that until we can see how you're collecting events, whether that be your CSS or custom collection through JS.

talthanas commented 1 year ago

Working with Craig on this. All sites, including highest volume one just have the basic JS include, such as: <script async defer data-website-id="9e25a911-1bd8-42cf-a455-22cfc92c0bcb" src="https://[redacted]/umami.js" data-cache="true"

On one low volume site, there the the following CSS classes on some links: class="umami--click--classical-andamento"

Example:

    <a
        class="umami--click--mao"
        href="https://mosaicartsonline.com/courses/author/126076" target="_mao"
    >Mosaic Arts Online courses</a>

On one higher volume site we were tracking continueal minutes of watched streaming video with:

setTimeout(function(){
    setInterval(function(){ umami.trackEvent("2022 NRCHA Snaffle Bit Futurity",'watching');}, 60000);
},60000);

The code that adds the "2022 NRCHA Snaffle Bit Futurity" comes from PHP code and should be enforced to be non empty, but worse case should pass "" and not null.

We have visitors who load a page and watch a streaming video while other info (stats, scores) are updated and will watch and leave the page open for many hours at a time. Thus the 1 event per minute tracking.

This code was removed while we were troubleshooting but because of the length of time users keep the pages open, it could easily have been a old, stale page that was still open.

briancao commented 1 year ago

On further inspection, it looks like you only have a 2gb machine?

[473857:0x49313a0] 17473295 ms: Mark-sweep 1950.5 (2083.1) -> 1936.8 (2083.3) MB

I suggest increasing the memory on your machine as well.

craigconstantine commented 1 year ago

By "machine" you mean the Node JS "machine" running the Umami code, right? …the system itself has plenty of memory to give it more. Any hints on how I tell "it" — would it be yarn? — how much memory it's supposed to work within…

And is there a way I can tell how much working space Node has available (how much heap is left)? That way I can keep an actual eye on the approach of the ceiling/heap-exhaustion/crash…

briancao commented 1 year ago

Correct, I meant that as in the container or process running NodeJS/Umami code.

I've been looking around and have not found any tools to help access or correct this problem. Only recommendations to set node's capacity with something like NODE_OPTIONS=--max-old-space-size=4096 .... but not a clear solution.

https://stackoverflow.com/questions/48387040/how-do-i-determine-the-correct-max-old-space-size-for-node-js/48392705#48392705

craigconstantine commented 1 year ago

I think the problem is that since we upgraded Umami, it either has a memory leak or it's behavior has changed… our use-cases/usage are consistent. Granted, we upgraded Ubuntu 20.04 to 22.04, Postgres 12 to 14, Node, …basically everything is updated. But now, Umami is running out of RAM.

So telling Node it can use more RAM, may just enable it to run longer before heap exhaustion — so that's the experiment I'm trying to run now: Double Node's heap size and see if it crashes after about twice as long.

Anyway… I'm currently trying to figure out how to tell pm2 to tell node to use that maxoldspacesize arg . . .

craigconstantine commented 1 year ago

Actually, pm2 is currently telling me that the heap size is insanely tiny; 'pm2 show umami' is reporting 24 MB...

Screen Shot 2022-10-15 at 09 59 00
craigconstantine commented 1 year ago

I'm def not understanding how this all works…

14:05UTC:root@ns:~# NODE_ENV=--max-old-space-size=2048 pm2 restart umami --update-env
[PM2] Applying action restartProcessId on app [umami](ids: [ 0 ])
[PM2] [umami](0) ✓

that arg then shows up in pm2 show umami

Screen Shot 2022-10-15 at 10 14 26

But it hasn't increased the heap size that I can see. That --max-old-space-size=2048 seems like a cmd arg to the Node (or is it Yarn?) executable; rather than something that should end up set in Node's runtime environment?

Zooming out: "here's how to set sufficient heap space to run Umami…" seems like a really basic thing for the Install instructions, no?

What am I mis-understanding here... ??

craigconstantine commented 1 year ago

When I try the same steps as above but with NODE_OPTIONS

14:13UTC:root@ns:~# NODE_OPTIONS=--max-old-space-size=2048 pm2 restart umami --update-env
[PM2] Applying action restartProcessId on app [umami](ids: [ 0 ])
[PM2] [umami](0) ✓

I don't see anything all different in the output of pm2 show umami. Oddly, the first time I pm2 show umami it reports 48Mb of heap, but then subsequent pm2 show umami reports 24Mb of heap.

craigconstantine commented 1 year ago

Just realized, after more reading, that I can pass an arg directly to Node as pm2 starts it up

pm2 start umami --node-args="--max-old-space-size=2048"

Which then shows as interpreter args

Screen Shot 2022-10-15 at 11 37 36

…but pm2 show umami continues to report a ~24MB head size. ¯\_(ツ)_/¯

briancao commented 1 year ago

Didn't mean to send you down a wild goose chase. The pm2 heap size will grow as there is more usage on the app being monitored.

I'll be looking into Umami's memory usage, as it relates to getting errant events to see if I can duplicate the issue.

craigconstantine commented 1 year ago

Crashed again yesterday evening. I think it might have run longer since my recent attempts to up the heap size... but certainly not 2G wroth of heap longer.

Now that I see where Umami is logging errors. I've cleared the error log and stop/started it up again just now. When it crashes again, I'll look in the error log before I restart it so I can see the crash.

craigconstantine commented 1 year ago

This continues crashing, but things are getting more confusing (at least to me). It's a little tricky to gather evidence as this is the production Umami... not mission-critical but we do want the stats it collects. So I need to gather details, then restart it.

Umami has stopped answering Nginx's reverse proxy requests. nginx shows 500-series errors.

Looking at pm2, however— pm2 is happy…

Screen Shot 2022-10-20 at 19 29 45

And…

Screen Shot 2022-10-20 at 19 30 17

Which suggests to me that it's at 28MB of heap space. That's insanely tiny. And you can see from the 2-up screenshot, I'm clearly trying to tell node.js to run with 2G of heap space.

Plenty of actual RAM on the system atm…

Screen Shot 2022-10-20 at 19 31 23

The error log from Umami is still open by the process—so Umami really appears to still be running…

Screen Shot 2022-10-20 at 19 34 16

I have ~2400 instances of errors like this…

PrismaClientValidationError:
Invalid `.Z.client.event.create()` invocation in
/root/umami/.next/server/chunks/769.js:996:105

  993         }
  994     };
  995 }
→ 996 return lib_prisma__WEBPACK_IMPORTED_MODULE_3__/* ["default"].client.event.create */ .Z.client.event.create({
        data: {
          website_id: 2,
          session_id: 1129956,
          url: '/live.phtml?show_id=640',
      +   event_name: String,
      ?   event_id?: Int,
      ?   created_at?: DateTime | null,
      ?   event_data?: {
      ?     create?: event_dataCreateWithoutEventInput | event_dataUncheckedCreateWithoutEventInput,
      ?     connectOrCreate?: event_dataCreateOrConnectWithoutEventInput,
      ?     connect?: event_dataWhereUniqueInput
      ?   }
        }
      })

Argument event_name for data.event_name is missing.

Note: Lines with + are required, lines with ? are optional.

    at Document.validate (/root/umami/node_modules/@prisma/client/runtime/index.js:28329:20)
    at serializationFn (/root/umami/node_modules/@prisma/client/runtime/index.js:30929:19)
    at runInChildSpan (/root/umami/node_modules/@prisma/client/runtime/index.js:24157:12)
    at PrismaClient._executeRequest (/root/umami/node_modules/@prisma/client/runtime/index.js:30936:31)
    at consumer (/root/umami/node_modules/@prisma/client/runtime/index.js:30862:23)
    at /root/umami/node_modules/@prisma/client/runtime/index.js:30867:51
    at AsyncResource.runInAsyncScope (async_hooks.js:197:9)
    at /root/umami/node_modules/@prisma/client/runtime/index.js:30867:29
    at runInChildSpan (/root/umami/node_modules/@prisma/client/runtime/index.js:24157:12)
    at PrismaClient._request (/root/umami/node_modules/@prisma/client/runtime/index.js:30864:22) {
  clientVersion: '4.3.1'
}

…frankly, I don't think those are actually causing the problem. They're not the tailing-end errors in the file.

The last erros in the Umami error log are below. I've included the final (of 2400-ish) PrismaClientValidationError and then there are two PrismaClientKnownRequestError errors. And then the heap exhaustion error.

**PrismaClientValidationError:**
Invalid `.Z.client.event.create()` invocation in
/root/umami/.next/server/chunks/769.js:996:105

  993         }
  994     };
  995 }
→ 996 return lib_prisma__WEBPACK_IMPORTED_MODULE_3__/* ["default"].client.event.create */ .Z.client.event.create({
        data: {
          website_id: 2,
          session_id: 1122459,
          url: '/live.phtml?show_id=597',
      +   event_name: String,
      ?   event_id?: Int,
      ?   created_at?: DateTime | null,
      ?   event_data?: {
      ?     create?: event_dataCreateWithoutEventInput | event_dataUncheckedCreateWithoutEventInput,
      ?     connectOrCreate?: event_dataCreateOrConnectWithoutEventInput,
      ?     connect?: event_dataWhereUniqueInput
      ?   }
        }
      })

Argument event_name for data.event_name is missing.

Note: Lines with + are required, lines with ? are optional.

    at Document.validate (/root/umami/node_modules/@prisma/client/runtime/index.js:28329:20)
    at serializationFn (/root/umami/node_modules/@prisma/client/runtime/index.js:30929:19)
    at runInChildSpan (/root/umami/node_modules/@prisma/client/runtime/index.js:24157:12)
    at PrismaClient._executeRequest (/root/umami/node_modules/@prisma/client/runtime/index.js:30936:31)
    at consumer (/root/umami/node_modules/@prisma/client/runtime/index.js:30862:23)
    at /root/umami/node_modules/@prisma/client/runtime/index.js:30867:51
    at AsyncResource.runInAsyncScope (async_hooks.js:197:9)
    at /root/umami/node_modules/@prisma/client/runtime/index.js:30867:29
    at runInChildSpan (/root/umami/node_modules/@prisma/client/runtime/index.js:24157:12)
    at PrismaClient._request (/root/umami/node_modules/@prisma/client/runtime/index.js:30864:22) {
  clientVersion: '4.3.1'
}
**PrismaClientKnownRequestError:**
Invalid `.Z.client.pageview.create()` invocation in
/root/umami/.next/server/chunks/769.js:1267:111

  1264     });
  1265 }
  1266 async function relationalQuery(website_id, { session_id , url , referrer  }) {
→ 1267     return lib_prisma__WEBPACK_IMPORTED_MODULE_3__/* ["default"].client.pageview.create */ .Z.client.pageview.create(
Failed to validate the query: `Unable to match input value to any allowed input type for the field. Parse errors: [Query parsing/validation error at `Mutation.createOnepageview.d
ata.pageviewCreateInput.session`: A value is required but not set., Query parsing/validation error at `Mutation.createOnepageview.data.pageviewUncheckedCreateInput.session_id`: A
 value is required but not set.]` at `Mutation.createOnepageview.data`
    at RequestHandler.handleRequestError (/root/umami/node_modules/@prisma/client/runtime/index.js:29909:13)
    at RequestHandler.request (/root/umami/node_modules/@prisma/client/runtime/index.js:29892:12)
    at async PrismaClient._request (/root/umami/node_modules/@prisma/client/runtime/index.js:30864:16)
    at async __WEBPACK_DEFAULT_EXPORT__ (/root/umami/.next/server/pages/api/collect.js:218:9)
    at async Object.apiResolver (/root/umami/node_modules/next/dist/server/api-utils/node.js:184:9)
    at async NextNodeServer.runApi (/root/umami/node_modules/next/dist/server/next-server.js:403:9)
    at async Object.fn (/root/umami/node_modules/next/dist/server/base-server.js:493:37)
    at async Router.execute (/root/umami/node_modules/next/dist/server/router.js:222:36)
    at async NextNodeServer.run (/root/umami/node_modules/next/dist/server/base-server.js:612:29)
    at async NextNodeServer.handleRequest (/root/umami/node_modules/next/dist/server/base-server.js:311:20) {
  code: 'P2009',
  clientVersion: '4.3.1',
  meta: {
    query_validation_error: 'Unable to match input value to any allowed input type for the field. Parse errors: [Query parsing/validation error at `Mutation.createOnepageview.dat
a.pageviewCreateInput.session`: A value is required but not set., Query parsing/validation error at `Mutation.createOnepageview.data.pageviewUncheckedCreateInput.session_id`: A v
alue is required but not set.]',
    query_position: 'Mutation.createOnepageview.data'
  }
}
**PrismaClientKnownRequestError:**
Invalid `.Z.client.pageview.create()` invocation in
/root/umami/.next/server/chunks/769.js:1267:111

  1264     });
  1265 }
  1266 async function relationalQuery(website_id, { session_id , url , referrer  }) {
→ 1267     return lib_prisma__WEBPACK_IMPORTED_MODULE_3__/* ["default"].client.pageview.create */ .Z.client.pageview.create(
Failed to validate the query: `Unable to match input value to any allowed input type for the field. Parse errors: [Query parsing/validation error at `Mutation.createOnepageview.d
ata.pageviewCreateInput.session`: A value is required but not set., Query parsing/validation error at `Mutation.createOnepageview.data.pageviewUncheckedCreateInput.session_id`: A
 value is required but not set.]` at `Mutation.createOnepageview.data`
    at RequestHandler.handleRequestError (/root/umami/node_modules/@prisma/client/runtime/index.js:29909:13)
    at RequestHandler.request (/root/umami/node_modules/@prisma/client/runtime/index.js:29892:12)
    at async PrismaClient._request (/root/umami/node_modules/@prisma/client/runtime/index.js:30864:16)
    at async __WEBPACK_DEFAULT_EXPORT__ (/root/umami/.next/server/pages/api/collect.js:218:9)
    at async Object.apiResolver (/root/umami/node_modules/next/dist/server/api-utils/node.js:184:9)
    at async NextNodeServer.runApi (/root/umami/node_modules/next/dist/server/next-server.js:403:9)
    at async Object.fn (/root/umami/node_modules/next/dist/server/base-server.js:493:37)
    at async Router.execute (/root/umami/node_modules/next/dist/server/router.js:222:36)
    at async NextNodeServer.run (/root/umami/node_modules/next/dist/server/base-server.js:612:29)
    at async NextNodeServer.handleRequest (/root/umami/node_modules/next/dist/server/base-server.js:311:20) {
  code: 'P2009',
  clientVersion: '4.3.1',
  meta: {
    query_validation_error: 'Unable to match input value to any allowed input type for the field. Parse errors: [Query parsing/validation error at `Mutation.createOnepageview.dat
a.pageviewCreateInput.session`: A value is required but not set., Query parsing/validation error at `Mutation.createOnepageview.data.pageviewUncheckedCreateInput.session_id`: A v
alue is required but not set.]',
    query_position: 'Mutation.createOnepageview.data'
  }
}

<--- Last few GCs --->

[720796:0x5cb63a0] 209476573 ms: Mark-sweep 2008.8 (2084.0) -> 1994.8 (2084.3) MB, 2181.3 / 0.2 ms  (average mu = 0.729, current mu = 0.752) task scavenge might not succeed
[720796:0x5cb63a0] 209485118 ms: Mark-sweep 2008.3 (2084.3) -> 1995.0 (2084.5) MB, 3054.7 / 0.2 ms  (average mu = 0.684, current mu = 0.643) task scavenge might not succeed

<--- JS stacktrace --->

FATAL ERROR: MarkCompactCollector: young object promotion failed Allocation failed - JavaScript heap out of memory
 1: 0xa3ac10 node::Abort() [/usr/bin/node]
 2: 0x970199 node::FatalError(char const*, char const*) [/usr/bin/node]
 3: 0xbba58e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/bin/node]
 4: 0xbba907 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/bin/node]
 5: 0xd76b25  [/usr/bin/node]
 6: 0xda74be v8::internal::EvacuateNewSpaceVisitor::Visit(v8::internal::HeapObject, int) [/usr/bin/node]
 7: 0xdb34f6 v8::internal::FullEvacuator::RawEvacuatePage(v8::internal::MemoryChunk*, long*) [/usr/bin/node]
 8: 0xd9f68f v8::internal::Evacuator::EvacuatePage(v8::internal::MemoryChunk*) [/usr/bin/node]
 9: 0xd9f908 v8::internal::PageEvacuationTask::RunInParallel(v8::internal::ItemParallelJob::Task::Runner) [/usr/bin/node]
10: 0xd921e9 v8::internal::ItemParallelJob::Run() [/usr/bin/node]
11: 0xdb5450 void v8::internal::MarkCompactCollectorBase::CreateAndExecuteEvacuationTasks<v8::internal::FullEvacuator, v8::internal::MarkCompactCollector>(v8::internal::MarkCompa
ctCollector*, v8::internal::ItemParallelJob*, v8::internal::MigrationObserver*, long) [/usr/bin/node]
12: 0xdb5cec v8::internal::MarkCompactCollector::EvacuatePagesInParallel() [/usr/bin/node]
13: 0xdb5eb5 v8::internal::MarkCompactCollector::Evacuate() [/usr/bin/node]
14: 0xdc7eb1 v8::internal::MarkCompactCollector::CollectGarbage() [/usr/bin/node]
15: 0xd84178 v8::internal::Heap::MarkCompact() [/usr/bin/node]
16: 0xd85c68 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/bin/node]
17: 0xde4d7c v8::internal::ScavengeJob::Task::RunInternal() [/usr/bin/node]
18: 0xcae4eb non-virtual thunk to v8::internal::CancelableTask::Run() [/usr/bin/node]
19: 0xaa99d4 node::PerIsolatePlatformData::RunForegroundTask(std::unique_ptr<v8::Task, std::default_delete<v8::Task> >) [/usr/bin/node]
20: 0xaab839 node::PerIsolatePlatformData::FlushForegroundTasksInternal() [/usr/bin/node]
21: 0x13c0be6  [/usr/bin/node]
22: 0x13d3154  [/usr/bin/node]
23: 0x13c1538 uv_run [/usr/bin/node]
24: 0xa7b762 node::NodeMainInstance::Run() [/usr/bin/node]
25: 0xa03925 node::Start(int, char**) [/usr/bin/node]
26: 0x7fa230a29d90  [/lib/x86_64-linux-gnu/libc.so.6]
27: 0x7fa230a29e40 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
28: 0x98c58c  [/usr/bin/node]
Aborted
error Command failed with exit code 134.

Finally, I note that the error log begins with this. There's no timing telemetry, so I'm assuming those warn messages are at start up, and then one of my 2400-ish PrismaClientValidationError errors.

warn  - You are using a non-standard "NODE_ENV" value in your environment. This creates inconsistencies in the project and is strongly advised against. Read more: https://nextjs.
org/docs/messages/non-standard-node-env
warn  - You are using a non-standard "NODE_ENV" value in your environment. This creates inconsistencies in the project and is strongly advised against. Read more: https://nextjs.
org/docs/messages/non-standard-node-env
PrismaClientValidationError:
Invalid `.Z.client.event.create()` invocation in
/root/umami/.next/server/chunks/769.js:996:105

  993         }
  994     };
  995 }
→ 996 return lib_prisma__WEBPACK_IMPORTED_MODULE_3__/* ["default"].client.event.create */ .Z.client.event.create({
        data: {
          website_id: 2,
          session_id: 1122381,
          url: '/live.phtml?show_id=571',
      +   event_name: String,
      ?   event_id?: Int,
      ?   created_at?: DateTime | null,
      ?   event_data?: {
      ?     create?: event_dataCreateWithoutEventInput | event_dataUncheckedCreateWithoutEventInput,
      ?     connectOrCreate?: event_dataCreateOrConnectWithoutEventInput,
      ?     connect?: event_dataWhereUniqueInput
      ?   }
        }
      })

Argument event_name for data.event_name is missing.

Note: Lines with + are required, lines with ? are optional.

    at Document.validate (/root/umami/node_modules/@prisma/client/runtime/index.js:28329:20)
    at serializationFn (/root/umami/node_modules/@prisma/client/runtime/index.js:30929:19)
    at runInChildSpan (/root/umami/node_modules/@prisma/client/runtime/index.js:24157:12)
    at PrismaClient._executeRequest (/root/umami/node_modules/@prisma/client/runtime/index.js:30936:31)
    at consumer (/root/umami/node_modules/@prisma/client/runtime/index.js:30862:23)
    at /root/umami/node_modules/@prisma/client/runtime/index.js:30867:51
    at AsyncResource.runInAsyncScope (async_hooks.js:197:9)
    at /root/umami/node_modules/@prisma/client/runtime/index.js:30867:29
    at runInChildSpan (/root/umami/node_modules/@prisma/client/runtime/index.js:24157:12)
    at PrismaClient._request (/root/umami/node_modules/@prisma/client/runtime/index.js:30864:22) {
  clientVersion: '4.3.1'
}

So this really still feels like I'm not correctly instructing Node to allocate a reasonable amount of heap. So that remains my current ask:

How do I tell Node how much heap to use for this Umami instance?

craigconstantine commented 1 year ago

Crashed again. Same final errors in the log.

I reset the NODE_ENV to "production" (as per https://nextjs.org/docs/messages/non-standard-node-env ) via NODE_ENV=production pm2 restart umami --update-env. My ~root/.pm2/logs/umami-error.log now does not start with those warning about a non-standard NODE_ENV setting.

…still not seeing how to increase the heap size.

Screen Shot 2022-10-21 at 20 36 36
briancao commented 1 year ago

I have not been able to reproduce the memory issue yet. But it looks like you're continuing to receive bad prisma calls, which I imagine is the root cause of the issues. I was able to reproduce the issue by passing a null into the event tracker. I imagine the PHP you're using to send a string into the event tracker is sending a null as well.

code: window.umami.trackEvent(null, '');

result:

error - PrismaClientValidationError: 
Invalid `prisma.event.create()` invocation:

{
  data: {
    websiteId: 1,
    sessionId: 313,
    url: '/console/4cd57f38-e1e7-4a1c-b077-2e38272e918b',
+   eventName: String,
    eventUuid: 'c94fa0c3-9fa4-4039-aca7-6fb76187cf2c',
?   id?: Int,
?   createdAt?: DateTime | null,
?   eventData?: {
?     create?: eventDataCreateWithoutEventInput | eventDataUncheckedCreateWithoutEventInput,
?     connectOrCreate?: eventDataCreateOrConnectWithoutEventInput,
?     connect?: eventDataWhereUniqueInput
?   }
  }
}

I have yet to reproduce your pageview errors, but I imagine it's a similar issue.

talthanas commented 1 year ago

In order to rule out that issue. We had one single place that had a line of:

umami.trackEvent(<?= json_encode($show->show_name.'') ?>,'watching');

This was removed ~10 days ago in order to simplify the debugging process. While I would say it might be possible that someone has an older, cached page, that is a long time.

Second, there is code previously on the page that should not get to this branch without a valid sting for a show name.

Even if that happened, it should never sent null. It's doing a string cast so at worst you'd get: <?= json_encode(null.'') ?> which would print the empty string: "" so the JS would look like: umami.trackEvent("",'watching');

though I'm not even sure that is possible given the other code checks and requirements.

Either way, as I said we removed any calls to umami.trackEvent.

On Oct 24, 2022, at 14:56, Brian Cao @.***> wrote:

I have not been able to reproduce the memory issue yet. But it looks like you're continuing to receive bad prisma calls, which I imagine is the root cause of the issues. I was able to reproduce the issue by passing a null into the event tracker. I imagine the PHP you're using to send a string into the event tracker is sending a null as well.

code: window.umami.trackEvent(null, '');

result:

error - PrismaClientValidationError: Invalid prisma.event.create() invocation:

{ data: { websiteId: 1, sessionId: 313, url: '/console/4cd57f38-e1e7-4a1c-b077-2e38272e918b',

  • eventName: String, eventUuid: 'c94fa0c3-9fa4-4039-aca7-6fb76187cf2c', ? id?: Int, ? createdAt?: DateTime | null, ? eventData?: { ? create?: eventDataCreateWithoutEventInput | eventDataUncheckedCreateWithoutEventInput, ? connectOrCreate?: eventDataCreateOrConnectWithoutEventInput, ? connect?: eventDataWhereUniqueInput ? } } } I have yet to reproduce your pageview errors, but I imagine it's a similar issue.

— Reply to this email directly, view it on GitHub https://github.com/umami-software/umami/issues/1561#issuecomment-1289460542, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABMGYWA73W6KPK4O5WJGRCTWE3LWXANCNFSM6AAAAAARBLEF6M. You are receiving this because you commented.

👨🏼‍💻 Eric Schmoyer 🖋 https://twitter.com/ericschmoyer 📷 https://www.instagram.com/talthanas/ ♻️ This email is printed from 100% recycled electrons.

craigconstantine commented 1 year ago

Eric beat me to replying :)

Well, those "Invalid prisma.event.create()" errors are a bug in Umami. It seems to be an effective way to DOS Umami. Nefarious actors could intentionally set up their client-side j/s to cause this. In principle, no external [aka from the web-client side via j/s] input should crash a server. That needs to be filed and fixed.

Back to my larger callout (this entire discussion thread) about my Umami is crashing. I'm not convinced these errors are leading to heap exhaustion, but I'm down to try a release-candidate of Umami as a test. Wrap on some if-Null case handling to protect that prisma.event.create() flow and I'm happy to try it out.