tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.
https://js.tensorflow.org
Apache License 2.0

Error: Kernel 'RotateWithOffset' not registered for backend 'tensorflow' #8335

Closed: rodyherrera closed this issue 2 months ago

rodyherrera commented 3 months ago

System information

- Node.js v20.15.0
- @tensorflow/tfjs-node 4.20.0
- @tensorflow-models/handpose 0.1.0
- Server without a GPU

Describe the current behavior
I'm running the @tensorflow-models/handpose model on Node.js with @tensorflow/tfjs-node. My server doesn't have a GPU. With the 'cpu' backend the model works, but a single prediction takes about 6 seconds, which is far too slow. With the 'tensorflow' backend predictions are dramatically faster, but I get the following error: "Error: Kernel 'RotateWithOffset' not registered for backend 'tensorflow'".
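
This is roughly the kind of timing check I'm doing (a minimal sketch; timePrediction, model, and imageTensor are placeholder names, not part of my actual code):

import tf from '@tensorflow/tfjs-node';

// Logs the active backend and how long a single estimateHands() call takes.
const timePrediction = async (model, imageTensor) => {
    console.log('Active backend:', tf.getBackend()); // 'cpu' or 'tensorflow'
    const start = Date.now();
    await model.estimateHands(imageTensor);
    console.log(`estimateHands() took ${Date.now() - start} ms`);
};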

@tensorflow-models/handpose doesn't seem to allow running on the 'tensorflow' backend, although it does work on 'cpu' (or a GPU backend). When I start my backend server without the 'tensorflow' backend, the TensorFlow.js library itself prints the following message:

Screenshot from 2024-07-18 17-48-42

The only way to avoid that message is to switch the backend to 'tensorflow', and, as the message says and as I mentioned above, performance does improve dramatically, but then the model simply stops working.

Here is the full error: Screenshot from 2024-07-18 17-47-36

As you can see, at first the model makes predictions and detects, in this case, that there is no gesture because there is no hand in the images sent to it. However, as soon as it receives a photo that DOES contain a hand, the error occurs.

Describe the expected behavior
With the 'cpu' backend the model makes the prediction correctly; the only problem is how long it takes. Below is a screenshot of the model's prediction when, in this case, an image of an open hand is sent to it. Screenshot from 2024-07-18 17-52-51

The first detection returns an object whose 'gesture' key has the value 'Handpose::HandsOpen'; in short, with the 'cpu' backend it works as expected. But if I change the backend to 'tensorflow', it stops making predictions and throws the error described above.

Standalone code to reproduce the issue

import sharp from 'sharp';
import tf from '@tensorflow/tfjs-node';
import handpose from '@tensorflow-models/handpose';

const OPEN_HAND_THRESHOLD = 2500;
const FINGERTIPS = [4, 8, 12, 16, 20];
const BASES = [0, 5, 9, 13, 17];

let handposeModel = null;
let tensorPool = [];

export const loadModel = async () => {
    console.log('Loading TensorFlow...');
    await tf.ready();
    console.log('Setting TensorFlow backend to tensorflow...');
    // The backend is set here. With 'tensorflow' the error occurs;
    // if it is changed to 'cpu', predictions take much longer but they work.
    await tf.setBackend('tensorflow');
    console.log('Loading Handpose model...');
    handposeModel = await handpose.load({
        maxContinuousChecks: 1,
        detectionConfidence: 0.8,
        iouThreshold: 0.3,
        scoreThreshold: 0.75
    });
    console.log('Handpose model loaded successfully');
};

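// A hand is treated as "open" when every fingertip landmark is far enough from
// the base of its finger (squared 3D distance >= OPEN_HAND_THRESHOLD).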
const isOpenHand = (landmarks) => {
    for(let i = 0; i < FINGERTIPS.length; i++){
        const [dx, dy, dz] = [0, 1, 2].map(j => landmarks[FINGERTIPS[i]][j] - landmarks[BASES[i]][j]);
        if (dx * dx + dy * dy + dz * dz < OPEN_HAND_THRESHOLD) return false;
    }
    return true;
};

export const getPredictions = async (tensor) => {
    if(!handposeModel){
        console.error('@services/handposeWorker.cjs - getPredictions: model not loaded.');
        return null;
    }
    console.log('@services/handposeWorker.cjs: handposeModel.estimateHands(...)...');
    const predictions = await handposeModel.estimateHands(tensor, { flipHorizontal: true });
    console.log('@services/handposeWorker.cjs: handposeModel.estimateHands(...) ok, predictions:', predictions);
    const gesture = detectGestures(predictions);
    console.log('@services/handposeWorker.cjs: detectGestures(...):', gesture);
    tensorPool.push(tensor);
    return gesture;
};

const detectGestures = (predictions) => {
    if (predictions.length === 0) return { gesture: 'Handpose::NoDetection' };
    for (const { landmarks } of predictions) {
        console.log('@services/handposeWorker.cjs: landmarks:', landmarks);
        if (isOpenHand(landmarks)) return { gesture: 'Handpose::HandsOpen' };
    }
    return { gesture: 'Handpose::Unrecognized' };
};

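// Decodes an image blob with sharp into a raw RGB tensor of shape
// [height, width, 3], which is then passed to handposeModel.estimateHands().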
export const blobToTensor = async (blob) => {
    try{
        // Decode to raw RGB and use the dimensions of the decoded buffer,
        // dropping any alpha channel so the [height, width, 3] shape matches.
        const { data, info } = await sharp(blob).removeAlpha().raw().toBuffer({ resolveWithObject: true });
        const tensor = tf.tensor3d(new Uint8Array(data), [info.height, info.width, 3]);
        return tensor;
    }catch(error){
        console.log('@services/handposeWorker.cjs - blobToTensor:', error);
    }
};
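
For reference, this is roughly how the module above gets used (a minimal sketch; the module path, image path, and main() wrapper are placeholders):

import fs from 'fs/promises';
import { loadModel, blobToTensor, getPredictions } from './handposeWorker.js';

const main = async () => {
    // Load the model once, then convert an image and run a single prediction.
    await loadModel();
    const blob = await fs.readFile('./open-hand.jpg');
    const tensor = await blobToTensor(blob);
    const result = await getPredictions(tensor);
    console.log(result); // e.g. { gesture: 'Handpose::HandsOpen' }
};

main();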

Is there a way to fix this so that I can run the model with the 'tensorflow' backend, which is exactly what the library recommends at startup when I use 'cpu'?

gaikwadrahul8 commented 2 months ago

Hi, @rodyherrera

I apologize for the delay in my response. As far as I know, the @tensorflow-models/handpose model is not officially compatible with the 'tensorflow' backend in a Node.js environment, which may be why you're encountering the error Error: Kernel 'RotateWithOffset' not registered for backend 'tensorflow'.

If you don't want to use a GPU for some reason, I would say the safest and most reliable option for now is to use the CPU backend (await tf.setBackend('cpu');) with @tensorflow-models/handpose in Node.js. This ensures the model works as intended.
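
A minimal sketch of that setup (the exact ordering of the calls is an assumption on my part):

import tf from '@tensorflow/tfjs-node';
import handpose from '@tensorflow-models/handpose';

// Force the pure-JS 'cpu' backend before loading the model; per the report above,
// every kernel the handpose pipeline needs is registered for this backend.
await tf.setBackend('cpu');
await tf.ready();
const model = await handpose.load();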

Could you please try setting IS_NODE to false before you load the model, to avoid the message above?

tf.env().set('IS_NODE', false);
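
In context, that would look something like this (placing the flag before handpose.load() is an assumption based on the suggestion above):

import tf from '@tensorflow/tfjs-node';
import handpose from '@tensorflow-models/handpose';

// Suggested workaround: keep the fast 'tensorflow' backend, but set IS_NODE
// to false before the model is loaded.
await tf.setBackend('tensorflow');
await tf.ready();
tf.env().set('IS_NODE', false);
const model = await handpose.load();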

Thank you for your cooperation and patience.

github-actions[bot] commented 2 months ago

This issue has been marked stale because it has had no recent activity for 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 2 months ago

This issue was closed due to lack of activity after being marked stale for the past 7 days.
