tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.
https://js.tensorflow.org
Apache License 2.0

Add save() and load() methods to knn-classifier in tfjs-models #633

Closed dsmilkov closed 2 years ago

dsmilkov commented 6 years ago

We should add save() and load() methods to KnnClassifier. They can take the same url/path format as model.save().

See discussion for motivation and context.

Internally we can make an empty model with non-trainable weights and use the existing model.save() infrastructure to save it.

Long term, we should have generic tf.save() and tf.load() methods that take a dict of tensor names to tensors and save or load them.

cc @tafsiri @caisq in case I missed something from our discussion.

oreHGA commented 6 years ago

Hi, I'd like to give this a shot!

hpssjellis commented 5 years ago

@oreHGA Any update on this? I would really like to use this saving ability for saving the knn-classifier information returned by classifier.getClassifierDataset(). As far as I can tell it would just be saving an array of 2D tensors for which each tensor has a different shape. Anyone active on this?

@dsmilkov To save the knn-classifier information (array of 2D Tensors) could we store the array of tensors as a fake model? That would allow us to use model.save() and tf.loadModel() without having to make any changes.

I am just stuck on which type of Keras layer would allow different tensor shapes (dense layers don't work, as each layer gets its shape from the previous layer). Or would I have to make a custom model with multiple inputs, each with a different shape (I can fully define the shape of an input)?

See my Face Detection demo, in which every classifier has the 2D tensor shape [x, 136], where x is the number of images that have been classified for that person and 136 is the number of values generated for each face (68 data points x 2 = 136).

hpssjellis commented 5 years ago

So my idea of making a fake model with multiple inputs to save each tensor from the knn-classifier does work.

Basically it takes the output from classifier.getClassifierDataset(), which returns an array of tensors, then sets up a model with an input and a dense layer for each trained classifier. That model can be saved using model.save(). The saved model can then be loaded from a website and converted back into the classifier using classifier.setClassifierDataset() once the array of tensors has been extracted from the fake model.

Reply if you are interested in the code. I am still working on making the code a bit more readable. I first had to understand multiple inputs, and here is a demo for that.

seppestaes commented 5 years ago

@hpssjellis Thx!

hpssjellis commented 5 years ago

Took me a while to find the code. Here is the knn-classifier demo. Look for the buttons "save-classifier" and "load-classifier". You can just view-source or visit the github.

The github is here

mwkldeveloper commented 5 years ago

This is my simple implementation of save and load for the KNN classifier, and it works. I hope it helps anyone who needs it :)

save() {
  const dataset = this.classifier.getClassifierDataset();
  const datasetObj = {};
  Object.keys(dataset).forEach((key) => {
    const data = dataset[key].dataSync();
    // Use Array.from() so that JSON.stringify() emits an array string, e.g. [0.1,-0.2,...],
    // instead of an object, e.g. {"0":0.1,"1":-0.2,...}
    datasetObj[key] = Array.from(data);
  });
  const jsonStr = JSON.stringify(datasetObj);
  // Can be changed to another storage target
  localStorage.setItem("myData", jsonStr);
}

load() {
  // Can be changed to another storage source
  const dataset = localStorage.getItem("myData");
  const tensorObj = JSON.parse(dataset);
  // Convert back to tensors (assumes 1000 values per example)
  Object.keys(tensorObj).forEach((key) => {
    tensorObj[key] = tf.tensor(tensorObj[key], [tensorObj[key].length / 1000, 1000]);
  });
  this.classifier.setClassifierDataset(tensorObj);
}

oveddan commented 5 years ago

I would find this useful as well. Based on @leung85's and @hpssjellis's examples, I've created a TypeScript, async version:

import * as knnClassifier from "@tensorflow-models/knn-classifier";
import * as tf from '@tensorflow/tfjs';

type Dataset = {
  [classId: number]: tf.Tensor<tf.Rank.R2>
};

type DatasetObjectEntry = {
  classId: number,
  data: number[],
  shape: [number, number]
};

type DatasetObject = DatasetObjectEntry[];

async function toDatasetObject(dataset: Dataset): Promise<DatasetObject> {
  const result: DatasetObject = await Promise.all(
    Object.entries(dataset).map(async ([classId,value], index) => {
      const data = await value.data();

      return {
        classId: Number(classId),
        data: Array.from(data),
        shape: value.shape
      };
   })
  );

  return result;
};

function fromDatasetObject(datasetObject: DatasetObject): Dataset {
  // Key each tensor by its saved classId, not by its position in the array.
  return datasetObject.reduce((result: Dataset, {classId, data, shape}) => {
    result[classId] = tf.tensor2d(data, shape);

    return result;
  }, {});
}

const storageKey = "knnClassifier";

async function saveClassifierInLocalStorage(classifier: knnClassifier.KNNClassifier) {
  const dataset = classifier.getClassifierDataset();
  const datasetObj: DatasetObject = await toDatasetObject(dataset);
  const jsonStr = JSON.stringify(datasetObj);
  // Can be changed to another storage target
  localStorage.setItem(storageKey, jsonStr);
}

function loadClassifierFromLocalStorage(): knnClassifier.KNNClassifier {
  const classifier: knnClassifier.KNNClassifier = new knnClassifier.KNNClassifier();

  const datasetJson = localStorage.getItem(storageKey);

  if (datasetJson) {
    const datasetObj = JSON.parse(datasetJson) as DatasetObject;

    const dataset = fromDatasetObject(datasetObj);

    classifier.setClassifierDataset(dataset);
  }
  return classifier;
}

hpssjellis commented 5 years ago

That's great @oveddan, any chance of that being sent as a PR to TFJS or is it too specific and should just be loaded as needed?

jonnytest1 commented 5 years ago

@leung85 newer versions use tensor size of 1024
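
A width mismatch like this can at least be caught at load time. Below is a minimal sketch, assuming the saved data is one flat array per class as in the save() example above. `inferShape` is a hypothetical helper, not part of knn-classifier or TensorFlow.js.

```javascript
// Hypothetical helper: derive [rows, width] for a flat array of activations,
// failing loudly when the assumed feature width does not divide the length.
function inferShape(flatLength, width) {
  if (!Number.isInteger(width) || width <= 0) {
    throw new RangeError(`width must be a positive integer, got ${width}`);
  }
  if (flatLength % width !== 0) {
    throw new RangeError(
      `${flatLength} values cannot be split into rows of ${width}`);
  }
  return [flatLength / width, width];
}

// Two examples of 1024 features each reshape cleanly with width 1024 ...
console.log(inferShape(2048, 1024)); // [ 2, 1024 ]

// ... but the same data with the older hard-coded 1000 is rejected
// instead of producing a fractional row count.
try {
  inferShape(2048, 1000);
} catch (err) {
  console.log(err.message);
}
```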

josephrocca commented 4 years ago

Maybe it's obvious, but you can read the shape from the tensor object, so no need to hard-code in 1000 or 1024, etc. Based on leung85's example (sorry for long lines):

// Create your classifier:
let classifier = knnClassifier.create();
// Add some examples:
classifier.addExample(...);
// Save it to a string:
let str = JSON.stringify( Object.entries(classifier.getClassifierDataset()).map(([label, data])=>[label, Array.from(data.dataSync()), data.shape]) );
// Load it back into a fresh classifier:
classifier = knnClassifier.create();
classifier.setClassifierDataset( Object.fromEntries( JSON.parse(str).map(([label, data, shape])=>[label, tf.tensor(data, shape)]) ) );
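
The same idea spread over several lines may be easier to adapt. This is only a sketch: `serializeDataset` and `deserializeDataset` are hypothetical names, and the tensor factory is passed in as a parameter (in real code it would be `tf.tensor`) so that the snippet runs without TensorFlow.js loaded. The stand-in tensor at the bottom exists purely for demonstration.

```javascript
// Hypothetical helpers; not part of @tensorflow-models/knn-classifier.

// Turn { label: tensor } into a JSON string, keeping each tensor's shape.
function serializeDataset(dataset) {
  const entries = Object.entries(dataset).map(([label, tensor]) => ({
    label,
    shape: tensor.shape,
    values: Array.from(tensor.dataSync()), // typed array -> plain array
  }));
  return JSON.stringify(entries);
}

// Rebuild { label: tensor } from the string. `makeTensor(values, shape)`
// is assumed to behave like tf.tensor(values, shape).
function deserializeDataset(json, makeTensor) {
  const dataset = {};
  for (const { label, shape, values } of JSON.parse(json)) {
    dataset[label] = makeTensor(values, shape);
  }
  return dataset;
}

// Stand-in for a tf.Tensor, only for this demo.
const fakeTensor = (values, shape) => ({
  shape,
  dataSync: () => Float32Array.from(values),
});

const saved = serializeDataset({
  cat: fakeTensor([1, 2, 3, 4], [2, 2]),
  dog: fakeTensor([5, 6], [1, 2]),
});
const restored = deserializeDataset(saved, fakeTensor);
console.log(Object.keys(restored), restored.cat.shape);
```

With the real library, the calls would presumably be serializeDataset(classifier.getClassifierDataset()) to save and classifier.setClassifierDataset(deserializeDataset(str, tf.tensor)) to load.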

VladimirHumeniuk commented 4 years ago

@oveddan I tried to do it as in your answer but after fromDatasetObject my data-set has undefined in classIndex. Could you please provide any advice on how to fix it?

Mxlt commented 4 years ago

@VladimirHumeniuk Hi! Have you been able to fix this? I am having the same issue.

VladimirHumeniuk commented 4 years ago

@Mxlt I just used label instead of classIndex
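
One possible reason the numeric conversion fails, offered as a guess rather than a confirmed diagnosis: if the dataset is keyed by string labels, Number(label) yields NaN for anything non-numeric, and every such class collapses onto a single key. A small sketch of the effect, using made-up labels:

```javascript
// Made-up labels standing in for the keys of a classifier dataset.
const labels = ['cat', 'dog', '3'];

// Converting keys to numbers: non-numeric labels all become NaN,
// so they overwrite each other under the single property "NaN".
const byNumber = {};
for (const label of labels) {
  byNumber[Number(label)] = label;
}
console.log(Object.keys(byNumber)); // [ '3', 'NaN' ]

// Keeping the label string as the key preserves every class.
const byLabel = {};
for (const label of labels) {
  byLabel[label] = label;
}
console.log(Object.keys(byLabel)); // [ '3', 'cat', 'dog' ]
```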

swimauger commented 4 years ago

I noticed most of these answers are synchronous, which is potentially dangerous if you're expecting to unload a large dataset, so I have created a library for parsing and stringifying datasets of these types. If you are interested, take a look at tensorset. It has some documentation on how to use it, and it works similarly to JSON.stringify and JSON.parse.

Here is an example of using Tensorset with the KNN-Classifier:

const fs = require('fs').promises;
const Tensorset = require('tensorset');
const knnClassifier = require('@tensorflow-models/knn-classifier');

(async () => {
    // Create a classifier, add your examples
    const originalClassifier = knnClassifier.create();
    originalClassifier.addExample(/*Some Example*/);

    // Stringify the dataset
    let dataset = Tensorset.stringify(originalClassifier.getClassifierDataset());

    // Save the dataset
    await fs.writeFile(/*File Name*/, dataset);

    // Load the dataset
    dataset = await fs.readFile(/*File Name*/);

    // Parse the dataset
    dataset = await Tensorset.parse(dataset);

    // Add to a new classifier
    const newClassifier = knnClassifier.create();
    newClassifier.setClassifierDataset(dataset);
})();

Or, if you're looking to build an image classifier from the knnClassifier, you could use my image-classifier, which implements the save and load functionality, adding images as examples, and classifying images. It is slightly easier to implement, in my opinion.

JJwilkin commented 3 years ago

@swimauger This is really great, thank you for sharing these packages! I was wondering about the image-classifier package: does it use MobileNet under the hood for feature extraction, with the package simply wrapping MobileNet + KnnClassifier so it can take a JPG/PNG image as input?

swimauger commented 3 years ago

@JJwilkin precisely!

408881465 commented 2 years ago

Thx. It works for me.

rthadur commented 2 years ago

This feature request is stale and the workaround mentioned here works, so we will be closing the issue for now. Feel free to @mention us so that we can reopen the issue. Thank you.