tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.
https://js.tensorflow.org
Apache License 2.0

NextJS - Error: Kernel 'UnsortedSegmentSum' not registered for backend 'wasm' #8082

Open peacefulotter opened 9 months ago

peacefulotter commented 9 months ago

System information

Describe the current behavior Using the wasm backend fails and throws: Kernel 'UnsortedSegmentSum' not registered for backend 'wasm'.

Describe the expected behavior No error

Standalone code to reproduce the issue The reproduction I can provide is a NextJS app running gpt-tfjs. The code has been tested with the cpu, webgl and webgpu backends, and it works with all three. I could not reproduce the error with a smaller amount of code, and I don't know where the error comes from. The stack trace indicates that the error originates from AdamOptimizer, but my tests on a dummy model and dataset using tf.adam.AdamW work.

Create a NextJS app and install tfjs

$ npx create-next-app
$ cd my-app
$ npm install @tensorflow/tfjs @tensorflow/tfjs-backend-wasm

Since training runs in the browser, create an API route that will serve the WASM files from the @tensorflow/tfjs-backend-wasm package:

// my-app/app/api/wasm/[file]/route.ts
import path from 'path'
import { NextRequest, NextResponse } from 'next/server'
// makeReadableByteFileStream is my own helper (not shown here)

export async function GET(req: NextRequest, { params }: any) {
    const { file } = params
    // Resolve the requested file inside the package's dist folder
    const p = path.resolve(
        process.cwd(),
        'node_modules',
        '@tensorflow',
        'tfjs-backend-wasm',
        'dist',
        file
    )
    const stream = makeReadableByteFileStream(p) // convert the file to a ReadableStream
    // Return the file as a ReadableStream body with the wasm MIME type,
    // since this is what WebAssembly.instantiateStreaming expects
    const res = new NextResponse(stream)
    res.headers.set('content-type', 'application/wasm')
    return res
}

And now in the client:

// my-app/page/app.tsx
'use client'
import * as tf from '@tensorflow/tfjs'
import '@tensorflow/tfjs-backend-wasm'
import { setWasmPaths } from '@tensorflow/tfjs-backend-wasm'

import trainTest from '../train-test'

setWasmPaths(
    '/api/wasm/' // Points to the api route
)

export default function Home() {
    return (
        <button onClick={trainTest}>test</button>
    )
}

// my-app/train-test.ts
import * as tf from '@tensorflow/tfjs'
import { model } from '#/gpt-tfjs'
const { GPTLMHeadModel } = model

export default async function trainTest() {
    await tf.setBackend('wasm')
    await tf.ready()

    // Dummy dataset, always returning the same arrays
    async function* generator() {
        while (true) {
            console.log('in dataset')
            yield {
                x: [1, 2],
                y: [3, 4],
            }
        }
    }

    // Minimal config object for gpt-tfjs
    const config = {
        batchSize: 2,
        blockSize: 5,
        vocabSize: 3,
        modelType: 'gpt-nano',
        weigthDecay: false
    }

    const dataset = tf.data.generator(generator as any).map((v: any) => ({
        x: tf.tensor1d(v.x, 'int32'),
        y: tf.oneHot(v.y, config.vocabSize),
    }))

    // Instantiate the model and start training
    const gpt = GPTLMHeadModel(config)
    await gpt.train(dataset, config)
}

Other info / logs If you wish, I can include the code for makeReadableByteFileStream, which is called in the API route. Here is the entire stack trace:

Uncaught (in promise) Error: Kernel 'UnsortedSegmentSum' not registered for backend 'wasm'
    at Engine.runKernel (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:419:19)
    at unsortedSegmentSum_ (webpack-internal:///(:3000/app-pages-browser)/../../gpt-tfjs/node_modules/@tensorflow/tfjs-core/dist/ops/unsorted_segment_sum.js:55:56)
    at unsortedSegmentSum__op (webpack-internal:///(:3000/app-pages-browser)/../../gpt-tfjs/node_modules/@tensorflow/tfjs-core/dist/ops/operation.js:51:28)
    at Object.eval [as x] (webpack-internal:///(:3000/app-pages-browser)/../../gpt-tfjs/node_modules/@tensorflow/tfjs-core/dist/gradients/GatherV2_grad.js:58:111)
    at eval (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/tape.js:134:60)
    at eval (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:347:22)
    at Engine.scopedRun (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:357:25)
    at Engine.tidy (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:346:21)
    at eval (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:902:23)
    at backpropagateGradients (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/tape.js:134:24)
    at eval (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:900:74)
    at eval (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:347:22)
    at Engine.scopedRun (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:357:25)
    at Engine.tidy (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:346:21)
    at Engine.gradients (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:896:21)
    at variableGrads (webpack-internal:///(:3000/app-pages-browser)/../../gpt-tfjs/node_modules/@tensorflow/tfjs-core/dist/gradients.js:265:74)
    at AdamOptimizer.computeGradients (webpack-internal:///(:3000/app-pages-browser)/../../gpt-tfjs/node_modules/@tensorflow/tfjs-core/dist/optimizers/optimizer.js:90:73)
    at eval (webpack-internal:///(:3000/app-pages-browser)/../../gpt-tfjs/src/train.js:64:41)
    at eval (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:347:22)
    at Engine.scopedRun (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:357:25)
    at Engine.tidy (webpack-internal:///(:3000/app-pages-browser)/./node_modules/@tensorflow/tfjs-core/dist/engine.js:346:21)
    at Module.tidy (webpack-internal:///(:3000/app-pages-browser)/../../gpt-tfjs/node_modules/@tensorflow/tfjs-core/dist/globals.js:203:56)
    at train (webpack-internal:///(:3000/app-pages-browser)/../../gpt-tfjs/src/train.js:63:12)
    at async GPTLMHeadModel_.train (webpack-internal:///(:3000/app-pages-browser)/../../gpt-tfjs/src/model.js:740:9)
    at async trainTest (webpack-internal:///(:3000/app-pages-browser)/./ml/train-test.ts:123:5)

Thank you very much, any help is appreciated, even the slightest! :pray:

gaikwadrahul8 commented 9 months ago

Hi, @peacefulotter

We appreciate your effort in highlighting this issue. Upon reviewing the error message, it appears that the UnsortedSegmentSum kernel functionality in the WebAssembly (wasm) backend of TensorFlow.js is presently not supported. As you noted, the code functions as expected on alternative backends such as CPU, WebGL, or WebGPU.

As of now, we will categorize this issue as a feature request to potentially include support for UnsortedSegmentSum within the wasm backend if possible. Thank you for bringing this to our attention.
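For context, the stack trace in the report reaches unsortedSegmentSum from GatherV2_grad.js, so the kernel appears to be needed for the backward pass of a gather rather than for the optimizer itself. The sketch below is a plain-TypeScript reference for what an unsorted segment sum computes on 2-D data and why a gather gradient would use it. It is not the TensorFlow.js implementation; the function names and the restriction to 2-D arrays are assumptions made for illustration.

```typescript
// Plain-TypeScript reference for the semantics of an unsorted segment sum
// over 2-D data. Illustration only, not the TensorFlow.js kernel.
function unsortedSegmentSum(
    data: number[][],     // [n, d] rows to accumulate
    segmentIds: number[], // [n] output row that each input row is added to
    numSegments: number   // number of output rows
): number[][] {
    const width = data.length > 0 ? data[0].length : 0
    const out = Array.from({ length: numSegments }, () =>
        new Array<number>(width).fill(0)
    )
    data.forEach((row, i) => {
        // Assumes every id is in [0, numSegments)
        const target = out[segmentIds[i]]
        row.forEach((value, j) => {
            target[j] += value
        })
    })
    return out
}

// Gradient of a row lookup `gather(table, ids)` with respect to `table`:
// each upstream gradient row is added back to the table row it was read from,
// which is exactly an unsorted segment sum keyed by `ids`.
function gatherGrad(dy: number[][], ids: number[], tableRows: number): number[][] {
    return unsortedSegmentSum(dy, ids, tableRows)
}

const dy = [[1, 1], [2, 2], [3, 3]]
const grad = gatherGrad(dy, [1, 2, 1], 3)
console.log(JSON.stringify(grad)) // prints [[0,0],[4,4],[2,2]]
```

Row 1 receives two contributions because id 1 appears twice in the lookup, which is why the ids cannot be assumed sorted or unique.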

peacefulotter commented 9 months ago

Hi @gaikwadrahul8 ,

Thanks for the response, hopefully this is supported soon!

Could you pinpoint what calls this UnsortedSegmentSum kernel in my case? I would like to find a workaround.

Thanks again
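One possible direction for a workaround, offered as an untested assumption: the stack trace reaches unsortedSegmentSum through GatherV2_grad.js, so replacing the row lookup with a one-hot matrix product might avoid that gradient path, because the backward pass of a matrix product is itself a matrix product. The plain-TypeScript sketch below only demonstrates the equivalence on small arrays. All function names are hypothetical, and whether the corresponding one-hot and matMul ops train correctly on the wasm backend inside gpt-tfjs has not been verified here.

```typescript
// Row lookup, the plain-array analogue of a gather along axis 0.
function gatherRows(table: number[][], ids: number[]): number[][] {
    return ids.map((id) => [...table[id]])
}

// One-hot encode integer ids into an [n, depth] matrix of 0s and 1s.
function oneHot(ids: number[], depth: number): number[][] {
    return ids.map((id) =>
        Array.from({ length: depth }, (_, k) => (k === id ? 1 : 0))
    )
}

// Dense matrix product a[n, k] x b[k, m].
function matMul(a: number[][], b: number[][]): number[][] {
    const cols = b[0].length
    return a.map((row) =>
        Array.from({ length: cols }, (_, j) =>
            row.reduce((sum, value, k) => sum + value * b[k][j], 0)
        )
    )
}

// Transpose of a non-empty rectangular matrix.
function transpose(a: number[][]): number[][] {
    return a[0].map((_, j) => a.map((row) => row[j]))
}

const table = [[10, 11], [20, 21], [30, 31]]
const ids = [1, 2, 1]

// Forward pass: the lookup and the one-hot product select the same rows.
const lookedUp = gatherRows(table, ids)
const viaMatMul = matMul(oneHot(ids, table.length), table)
console.log(JSON.stringify(lookedUp) === JSON.stringify(viaMatMul)) // prints true

// Backward pass with respect to the table: oneHot^T x dy. It yields the same
// scatter-add result as a segment sum, using only a matrix product.
const upstream = [[1, 1], [2, 2], [3, 3]]
const tableGrad = matMul(transpose(oneHot(ids, table.length)), upstream)
console.log(JSON.stringify(tableGrad)) // prints [[0,0],[4,4],[2,2]]
```

The trade-off is cost: the one-hot product does work proportional to the number of table rows for every looked-up id, so it is only reasonable for small vocabularies such as the one in the reproduction config.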