epicweb-dev / epic-stack

This is a Full Stack app starter with the foundational things setup and configured for you to hit the ground running on your next EPIC idea.
https://www.epicweb.dev/epic-stack
MIT License

PrismaClientRustPanicError following a POST request to our application's endpoint from different origin. #642

Closed maxjnq closed 5 months ago

maxjnq commented 6 months ago

Hello,

We're developing a straightforward waitlist tool that lets our users add a signup form directly to their own websites. They copy a simple HTML snippet from our application, located at https://wt.ls/, and embed it on their site. Here is the HTML snippet for our embed form.

<div class="container-__ID__" style="display: flex">
    <form
        autocomplete="off"
        class="form-__ID__"
        style="display: flex"
        action="__DOMAIN__/action/create-sub/__LISTID__"
        method="POST"
    >
        <input
            type="email"
            style="display: inline-flex"
            class="input-__ID__"
            name="email"
            placeholder="__PLACEHOLDER__"
            required
        />

        <button type="submit" class="button-__ID__">
            <div class="cta-__ID__" style="opacity: __OCTA__; display: block">
                __CTA__
            </div>
            <div class="error-__ID__" style="opacity: 0; display: none"></div>
            <div class="loading-__ID__" style="opacity: __OLOADING__">
                <div class="loading-bar-__ID__">
                    <div class="loading-pill-__ID__"></div>
                </div>
            </div>
        </button>
    </form>
</div>

The issue occurs when a POST request is sent from another origin (from the embed form above) to our Remix route /action/create-sub/$listId below:

import { parseWithZod } from '@conform-to/zod'
import { invariantResponse } from '@epic-web/invariant'
import {
    json,
    type ActionFunctionArgs,
    type HeadersFunction,
} from '@remix-run/node'
import { nanoid } from 'nanoid'
import { z } from 'zod'
import { EmailSchema } from '#app/components/design/public-form'
import { prisma } from '#app/utils/db.server'
import { getEnv } from '#app/utils/env.server'
import { checkHoneypot } from '#app/utils/honeypot.server'

export const headers: HeadersFunction = () => ({
    'Access-Control-Allow-Origin': '*',
    'Access-Control-Allow-Methods': 'POST',
    'Access-Control-Allow-Headers':
        'Content-Type, Authorization, X-Requested-With',
    'Access-Control-Allow-Credentials': 'true',
})

export async function action({ params, request }: ActionFunctionArgs) {
    const { listId } = params
    const formData = await request.formData()
    const domain = getEnv().DOMAIN
    const list = await prisma.list.findUnique({
        where: { id: listId },
        select: { id: true, signUpPage: { select: { redirectUrl: true } } },
    })

    invariantResponse(list, 'List not found', { status: 404 })
    checkHoneypot(formData)

    const submission = await parseWithZod(formData, {
        schema: EmailSchema.superRefine(async (data, ctx) => {
            const sub = await prisma.sub.findFirst({
                where: {
                    AND: [{ email: data.email }, { listId: list.id }],
                },
            })

            if (sub) {
                ctx.addIssue({
                    path: ['email'],
                    code: z.ZodIssueCode.custom,
                    message: 'Already subscribed',
                })
                return
            }
        }),
        async: true,
    })

    if (submission.status !== 'success') {
        console.log('error', submission.reply().error?.email)
        return json({
            res: {
                error: submission.reply().error?.email,
                redirect: undefined,
            },
        })
    }

    const { email } = submission.value

    const sub = await prisma.sub.create({
        data: {
            email,
            shortId: nanoid(8),
            list: { connect: { id: listId } },
        },
        select: {
            shortId: true,
        },
    })

    const redirect = list.signUpPage?.redirectUrl
        ? list.signUpPage?.redirectUrl
        : `${domain}/me/${sub.shortId}`

    return json({ res: { error: undefined, redirect } })
}

When a user attempts to sign up via one of our embedded forms on an external website, we encounter a CORS error, which is expected. However, during sub creation, a step that works fine throughout the rest of the application, we hit a PrismaClientRustPanicError that crashes the entire application until we manually restart it.
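A side note on the CORS part: the `headers` export above pairs `Access-Control-Allow-Origin: *` with `Access-Control-Allow-Credentials: true`, a combination browsers refuse for credentialed requests. Below is a minimal sketch of building a consistent set of headers. `buildCorsHeaders` and `CorsOptions` are hypothetical names, not part of Remix or the Epic Stack, and the sketch assumes the headers are attached to the response the action returns. It is unrelated to the Prisma panic itself.

```typescript
// Hypothetical helper for illustration; not an Epic Stack or Remix API.
type CorsOptions = {
	// Either the wildcard or an explicit allowlist of origins.
	allowedOrigins: ReadonlyArray<string> | '*'
	// Only honored together with an explicit allowlist.
	allowCredentials?: boolean
}

function buildCorsHeaders(
	requestOrigin: string | null,
	{ allowedOrigins, allowCredentials = false }: CorsOptions,
): Record<string, string> {
	const headers: Record<string, string> = {
		'Access-Control-Allow-Methods': 'POST, OPTIONS',
		'Access-Control-Allow-Headers': 'Content-Type',
	}

	if (allowedOrigins === '*') {
		// Browsers reject a wildcard origin on credentialed requests, so the
		// credentials header is deliberately never sent in this branch.
		headers['Access-Control-Allow-Origin'] = '*'
		return headers
	}

	if (requestOrigin !== null && allowedOrigins.indexOf(requestOrigin) !== -1) {
		// Echo the specific origin and tell caches the response varies by it.
		headers['Access-Control-Allow-Origin'] = requestOrigin
		headers['Vary'] = 'Origin'
		if (allowCredentials) {
			headers['Access-Control-Allow-Credentials'] = 'true'
		}
	}

	// Unlisted origins get no Access-Control-Allow-Origin header at all.
	return headers
}
```

In the action above, this could be used as `json(data, { headers: buildCorsHeaders(request.headers.get('Origin'), { allowedOrigins: '*' }) })`. For a public embed form that does not rely on cookies, the wildcard without credentials is likely enough.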

[Screenshot: 2024-03-08 at 20 18 06]

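My questions:

1. We didn't experience this problem with the Remix Blues stack, even though our codebase remained unchanged. Could there be specific security measures in the Epic Stack that trigger Prisma to shut down following a request from a different origin?
2. Considering this issue arises solely on this particular stack, should we address it with Remix or Prisma support teams?
3. Does anyone have insights into what might be causing this problem?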

As an aside, the Epic Stack is truly remarkable. Thank you for the exceptional work!

kentcdodds commented 6 months ago

Hi @maxjnq,

> We didn't experience this problem with the Remix Blues stack, even though our codebase remained unchanged. Could there be specific security measures in the Epic Stack that trigger Prisma to shut down following a request from a different origin?

I'm not sure how you could have unchanged code going from the Blues Stack to the Epic Stack. There's probably quite a bit of code that would need to change to support that move (not least of which is moving from Postgres to SQLite).

Prisma doesn't know or care about network requests. There are no security measures in the Epic Stack that should affect this (other than the CORS issue I suppose). Cookies are also set to lax, so you won't get those sent for cross-origin requests like this.

> Considering this issue arises solely on this particular stack, should we address it with Remix or Prisma support teams?

I expect this issue is with Prisma.

> Does anyone have insights into what might be causing this problem?

I don't know what could cause this, but interestingly I saw this for the first time yesterday with this run of the Playwright tests: https://github.com/epicweb-dev/epic-stack/actions/runs/8196931121/job/22418066789#step:11:74

I ran the tests again and they passed fine: https://github.com/epicweb-dev/epic-stack/actions/runs/8196931121/job/22418213776#step:11:286

Especially considering the change was unrelated to Prisma, I figured this was a weird fluke with Prisma and didn't bother reporting it.

There does appear to be some odd Prisma situation. If you can narrow down the problem further and make a reproduction, that will probably help identify the cause. I don't know what this error really means, so you may want to get some help from the Prisma team to point you in the right direction.

> As an aside, the Epic Stack is truly remarkable. Thank you for the exceptional work!

I'm glad to hear it! I hope you can work out what's going on with this error!

maxjnq commented 6 months ago

Hi @kentcdodds,

I've been encountering an inconsistent issue where sometimes everything functions as expected, but at other times, it fails. Interestingly, a similar problem emerged today from another part of our application, which leads me to believe that it might be related to SQLite.

The issue starts with encountering a "PrismaClientKnownRequestError" followed by a "PrismaClientRustPanicError" when attempting to save some settings through a specific route action.

This sequence might provide us with more insight into the root cause of the problem.

The action in question

export async function action({ params, request }: ActionFunctionArgs) {
    await requireUserId(request)
    const listId = params.listId
    const formData = await unstable_parseMultipartFormData(
        request,
        unstable_createMemoryUploadHandler({ maxPartSize: MAX_SIZE }),
    )

    const submission = await parseWithZod(formData, {
        schema: GeneralDesignSchema.superRefine(async (data, ctx) => {
            if (data.icon && data.icon.size > MAX_SIZE) {
                ctx.addIssue({
                    path: ['icon'],
                    code: z.ZodIssueCode.custom,
                    message: 'Image size must be less than 3MB',
                })
                return
            }
        }).transform(async data => {
            const { icon, intent, ...settings } = data
            if (data.intent === 'delete')
                return { intent: 'delete', icon: undefined, ...settings }
            if (data.icon && data.icon.size <= 0) return z.NEVER
            if (data.icon) {
                return {
                    intent,
                    icon: {
                        contentType: data.icon.type,
                        blob: Buffer.from(await data.icon.arrayBuffer()),
                    },
                    ...settings,
                }
            } else {
                return { intent, icon: undefined, ...settings }
            }
        }),
        async: true,
    })

    if (submission.status !== 'success') {
        return json(
            { result: submission.reply() },
            { status: submission.status === 'error' ? 400 : 200 },
        )
    }

    const { icon, intent, ...settings } = submission.value

    if (intent === 'delete-image') {
        await prisma.listIcon.deleteMany({ where: { listId } })
        return json({ result: submission.reply() }, 200)
    }

    await prisma.$transaction(async $prisma => {
        icon && (await $prisma.listIcon.deleteMany({ where: { listId } }))
        await $prisma.list.update({
            where: { id: listId },
            data: {
                icon: { create: icon },
                settings: {
                    update: {
                        ...settings,
                    },
                },
            },
        })
    })

    return json({ result: submission.reply() }, 200)
}

The PrismaClientKnownRequestError

[Screenshot: 2024-03-10 at 16 57 36]

The PrismaClientRustPanicError that comes right after.

[Screenshot: 2024-03-10 at 17 00 41]

After noticing this, I dug into our fly.io logs and researched this particular log entry:

thread 'tokio-runtime-worker' panicked at libs/user-facing-errors/src/quaint.rs:1

This led me to Prisma issue #22947, which provides substantial insight, including a repository that reproduces the issue. It pointed me to a related issue about improving error messages when the SQLite database file is locked (#10430), and eventually circled back to this pull request from 3 weeks ago.

Is all of this information useful? I'm starting to think that this might be uncovering a larger underlying issue with SQLite, or perhaps we've misconfigured something like LiteFS, causing the app to crash after errors like PrismaClientKnownRequestError and concurrent SQLite queries.

Additionally, we never encountered such issues when our project was built on the Blues Stack, before we moved to the Epic Stack, possibly because we were using PostgreSQL back then?

I'm hopeful that resolving this issue will further improve the epic stack! :)

kentcdodds commented 6 months ago

Well, because I experienced this in GitHub CI, I think we can rule out Fly and LiteFS.

I expect this is a SQLite + Prisma issue. It very well could be related to the WAL support. Maybe remove the changes in that pull request and see whether you can reproduce this?

maxjnq commented 6 months ago

@kentcdodds I've just realized that our project was started before those changes were made, as our LiteFS configuration file below shows. I've uncommented the relevant lines and am about to redeploy. I'll update you on whether the issue persists. If everything operates smoothly, what would you suggest as the next steps? Should I escalate this issue further on the Prisma side or simply close it? Is this something to be worried about for an app expected to handle potentially millions of signups from startups and product launches weekly?

Thank you so much for your proactive support on this matter.

# Documented example: https://github.com/superfly/litefs/blob/dec5a7353292068b830001bd2df4830e646f6a2f/cmd/litefs/etc/litefs.yml
fuse:
  # Required. This is the mount directory that applications will
  # use to access their SQLite databases.
  dir: '${LITEFS_DIR}'

data:
  # Path to internal data storage.
  dir: '/data/litefs'

proxy:
  # matches the internal_port in fly.toml
  addr: ':${INTERNAL_PORT}'
  target: 'localhost:${PORT}'
  db: '${DATABASE_FILENAME}'

# The lease section specifies how the cluster will be managed. We're using the
# "consul" lease type so that our application can dynamically change the primary.
#
# These environment variables will be available in your Fly.io application.
lease:
  type: 'consul'
  candidate: ${FLY_REGION == PRIMARY_REGION}
  promote: true
  advertise-url: 'http://${HOSTNAME}.vm.${FLY_APP_NAME}.internal:20202'

  consul:
    url: '${FLY_CONSUL_URL}'
    key: 'epic-stack-litefs/${FLY_APP_NAME}'

exec:
  - cmd: node ./other/setup-swap.js

  - cmd: npx prisma migrate deploy
    if-candidate: true

  # re-enable these when this is fixed: https://github.com/superfly/litefs/issues/425
  # # Set the journal mode for the database to WAL. This reduces concurrency deadlock issues
  # - cmd: sqlite3 $DATABASE_PATH "PRAGMA journal_mode = WAL;"
  #   if-candidate: true

  # # Set the journal mode for the cache to WAL. This reduces concurrency deadlock issues
  # - cmd: sqlite3 $CACHE_DATABASE_PATH "PRAGMA journal_mode = WAL;"
  #   if-candidate: true

  - cmd: npm start

maxjnq commented 6 months ago

NEW UPDATE

Currently, our app is fluctuating between completely shutting down with an unexpected server error and functioning flawlessly as I refresh the page and between db queries.

You can observe this behavior on our forms on this test site pool.day. When attempting to sign up, you'll encounter a brief "Cannot Fetch" message, followed shortly by a redirect to the success page as anticipated. If you refresh the success page, it alternates between displaying "Subscriber not found" and operating without issues.

It almost seems as if there are two separate instances of the database running simultaneously: one is entirely down, while the other operates normally.

Perhaps simply uncommenting those lines and pushing the update through GitHub actions for a redeploy wasn't entirely sufficient? However, the problem has been partially resolved, as the app no longer completely shuts down. Instead, it alternates between a functioning version and a non-functioning version.

maxjnq commented 5 months ago

Hi @kentcdodds,

Hope this is useful: I managed to reproduce the issue quite consistently, with some rare exceptions. Here is how.

With one button click, I call requestSubmit() on two forms on two different routes, and each one triggers its respective action function.

Route A is the parent, Route B a nested route.

In Route A I do:

const handleSubmit = async () => {
    signUpPageRef.current?.requestSubmit() // Form in nested Route B
    settingsRef.current?.requestSubmit() // Form in parent Route A
}

Each action opens a Prisma $transaction.

Action in Route A

export async function action({ params, request }: ActionFunctionArgs) {
    await requireUserId(request)
    const listId = params.listId
    const formData = await unstable_parseMultipartFormData(
        request,
        unstable_createMemoryUploadHandler({ maxPartSize: MAX_SIZE }),
    )

    const submission = await parseWithZod(formData, {
        schema: GeneralDesignSchema.superRefine(async (data, ctx) => {
            if (data.icon && data.icon.size > MAX_SIZE) {
                ctx.addIssue({
                    path: ['icon'],
                    code: z.ZodIssueCode.custom,
                    message: 'Image size must be less than 3MB',
                })
                return
            }
        }).transform(async data => {
            const { icon, intent, ...settings } = data
            if (data.intent === 'delete')
                return { intent: 'delete', icon: undefined, ...settings }
            if (data.icon && data.icon.size <= 0) return z.NEVER
            if (data.icon) {
                return {
                    intent,
                    icon: {
                        contentType: data.icon.type,
                        blob: Buffer.from(await data.icon.arrayBuffer()),
                    },
                    ...settings,
                }
            } else {
                return { intent, icon: undefined, ...settings }
            }
        }),
        async: true,
    })

    if (submission.status !== 'success') {
        return json(
            { result: submission.reply() },
            { status: submission.status === 'error' ? 400 : 200 },
        )
    }

    const { icon, intent, ...settings } = submission.value

    if (intent === 'delete-image') {
        await prisma.listIcon.deleteMany({ where: { listId } })
        return json({ result: submission.reply() }, 200)
    }

    await prisma.$transaction(async $prisma => {
        icon && (await $prisma.listIcon.deleteMany({ where: { listId } }))
        await $prisma.list.update({
            where: { id: listId },
            data: {
                icon: { create: icon },
                settings: {
                    update: {
                        ...settings,
                    },
                },
            },
        })
    })

    return json({ result: submission.reply() }, 200)
}

Action in Route B

export async function action({ params, request }: ActionFunctionArgs) {
    invariant(params.listId, 'No lists.')
    const { listId } = params
    const formData = await unstable_parseMultipartFormData(
        request,
        unstable_createMemoryUploadHandler({ maxPartSize: MAX_SIZE }),
    )

    const submission = await parseWithZod(formData, {
        schema: SignupPageSchema.superRefine(async (data, ctx) => {
            if (data.ogImage && data.ogImage.size > MAX_SIZE) {
                ctx.addIssue({
                    path: ['ogImage'],
                    code: z.ZodIssueCode.custom,
                    message: 'Image size must be less than 3MB',
                })
                return
            }
        }).transform(async data => {
            const { ogImage, intent, ...settings } = data
            if (data.intent === 'delete')
                return { intent: 'delete', ogImage: undefined, ...settings }
            if (data.ogImage && data.ogImage.size <= 0) return z.NEVER
            if (data.ogImage) {
                return {
                    intent,
                    ogImage: {
                        contentType: data.ogImage.type,
                        blob: Buffer.from(await data.ogImage.arrayBuffer()),
                    },
                    ...settings,
                }
            } else {
                return { intent, ogImage: undefined, ...settings }
            }
        }),
        async: true,
    })

    if (submission.status !== 'success') {
        return json(
            { result: submission.reply() },
            { status: submission.status === 'error' ? 400 : 200 },
        )
    }

    const { ogImage, intent, ...settings } = submission.value

    if (intent === 'delete-image') {
        await prisma.listGraphQLImage.deleteMany({ where: { listId } })
        return json({ result: submission.reply() }, 200)
    }

    await prisma.$transaction(async $prisma => {
        ogImage &&
            (await $prisma.listGraphQLImage.deleteMany({ where: { listId } }))
        await $prisma.list.update({
            where: { id: listId },
            data: {
                ogImage: { create: ogImage },
                signUpPage: {
                    update: {
                        ...settings,
                    },
                },
            },
        })
    })

    await prisma.list.update({
        where: { id: listId },
        data: { signUpPage: { update: { ...settings } } },
    })

    return json({ result: submission.reply() }, 200)
}

The error I get in Route B.

[Screenshot: 2024-03-18 at 20 14 46]

I forgot to delete this part of my code in Route B after adding the image upload above:

await prisma.list.update({
    where: { id: listId },
    data: { signUpPage: { update: { ...settings } } },
})

It's still concerning that Prisma handles this with a panic error, and I'm not even sure why: my second update is unnecessary, but it shouldn't break the entire app. (This happens only in production.)

Another guess is that the problem comes from having two transactions triggered from two routes. I now handle everything in one transaction and have deleted that extra update, and everything works fine.
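For anyone who cannot merge the two actions into one transaction, another option might be to serialize the writes inside the Node process so that two transactions never overlap. The sketch below is only an illustration of that idea, not the fix used in this thread and not a Prisma API. `createSerialQueue`, `fakeWrite`, and `demo` are hypothetical names, and the fake writes stand in for the real `prisma.$transaction` calls. It only serializes work within a single process, so it would not help across multiple app instances.

```typescript
// Hypothetical helper: runs async tasks one at a time, in the order enqueued.
function createSerialQueue() {
	// Tail of the chain; it never rejects because failures are swallowed below.
	let tail: Promise<unknown> = Promise.resolve()

	return function enqueue<T>(task: () => Promise<T>): Promise<T> {
		// Start this task only after the previous one has settled.
		const run = tail.then(() => task())
		// Keep the chain usable even if this task rejects.
		tail = run.catch(() => undefined)
		// The caller still sees this task's own result or error.
		return run
	}
}

const sleep = (ms: number) =>
	new Promise<void>(resolve => setTimeout(resolve, ms))

// Demo: two "actions" fired at the same time, as in the reproduction above.
async function demo(): Promise<string[]> {
	const enqueueWrite = createSerialQueue()
	const log: string[] = []

	// Stand-in for an action body; a real one would await prisma.$transaction.
	const fakeWrite = (name: string, ms: number) => async () => {
		log.push(`${name}:start`)
		await sleep(ms)
		log.push(`${name}:end`)
	}

	await Promise.all([
		enqueueWrite(fakeWrite('routeB', 20)),
		enqueueWrite(fakeWrite('routeA', 5)),
	])

	return log
}
```

In `demo`, the Route A write does not start until the Route B write has finished, even though both were requested together. In a real app, each action would wrap its transaction in the same shared `enqueueWrite` instance.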

kentcdodds commented 5 months ago

Looks like this issue is tracked on Prisma's side here: https://github.com/prisma/prisma/issues/22947

kentcdodds commented 5 months ago

As this is likely not an issue with the Epic Stack specifically, I'm going to close this issue in favor of the one tracked on Prisma's issue tracker.