ml5js / ml5-library

Friendly machine learning for the web! 🤖
https://ml5js.org
Other
6.46k stars 900 forks source link

Duplicated Code Blocks: Extracting HTMLVideoElement #1189

Open lindapaiste opened 3 years ago

lindapaiste commented 3 years ago

I downloaded the source code and started playing with it for the purpose of adding Typescript types, but I'm actually seeing a lot of things in this codebase that I can improve! My IDE is warning me about a whole bunch of duplicated code blocks and this seems like an easy place to start. The biggest source of duplicated code comes from extracting the video and callback arguments from arguments like videoOrOptionsOrCallback which can take multiple forms. There is identical code used in multiple models so this logic needs to be extracted outside of the individual model.

Example: there is a large block of duplicate code between FaceApi lines 472-499 and BodyPix lines 427-453.

bomanimc commented 3 years ago

Hi @lindapaiste! Agreed, this is definitely an area of the library that needs improvement. I think this would also be a fairly impactful change because it would reduce the length of all of our model classes (making them easier to read), cut down our bundle size a bit, and also put us in a better position to make this logic easier to unit test. I've discussed this general challenge in the codebase in https://github.com/ml5js/ml5-library/issues/946.

@lindapaiste would you like to take on generalizing this function? I'd be happy to support with PR reviews! If so, please join us on the #maintenance channel on Discord! https://discord.gg/eQvyeFYz

lindapaiste commented 3 years ago

Yes I would be interested in taking this on myself. I can create and/or improve certain utility functions for extracting arguments. In terms of the exports from each model, I am picturing a higher-order function that can take the class as an argument. Something like:

const createVideoModel = (class) => (videoOrOptionsOrCallback, optionsOrCallback, cb) => {
  /* ... logic for extracting args: video, options, callback */

  // can instantiate the class from a variable
  const instance = new class(video, options, callback);
  return callback ? instance : instance.ready;
}

Such that in the model class you can just do:

export default createVideoModel(FaceApi);

Which would export a function that takes args (videoOrOptionsOrCallback, optionsOrCallback, cb)

But there's a lot of duplication within the classes too!

lindapaiste commented 3 years ago

I'm still working on this so I wanted to give a check-in. I think there is an opportunity to refactor this whole library but for right now I am focused on the low-hanging fruit.

These functions allow arguments to be skipped over so every time it is called we have to check if the first argument is an image/video, and options object, or a callback. So I have created a utility that takes arguments and assigns them to named properties. In each model I can delete a big chunk of code and replace it with:

// in methods
const {image, options, callback} = new ImageModelArgs(imageOrOptionsOrCallback, optionsOrCallback, cb);
// in exported constructor
const {video, options, callback} = new ImageModelArgs(videoOrOptionsOrCallback, optionsOrCallback, cb);

Things that still need more work:


Implementation

The isInstanceOfSupportedElement function already existed but all the rest is new 😊 We decided not to use TypeScript but I am using a lot of JSDoc @typedef. And also I do have TypeScript in these files but we can remove it.

/**
 * A union of element types which can be used as sources
 * @typedef {(HTMLVideoElement | HTMLImageElement | HTMLCanvasElement)} MediaElement
 * // TODO: what about ImageBitmap? SVGImageElement?
 */

/**
 * A union of element types which can be used as sources by TensorFlow.
 * Includes HTML elements as well as raw data.
 * @typedef {(HTMLVideoElement | HTMLImageElement | HTMLCanvasElement | ImageData | tf.Tensor3D)} TfImageSource
 */

/**
 * Checks if a subject is an element that can be used as a data source
 *
 * @param subject
 * @return boolean - true if subject is one of HTMLVideoElement, HTMLImageElement, HTMLCanvasElement, ImageData, tf.Tensor
 */
export const isInstanceOfSupportedElement = (subject: any): subject is TfImageSource => {
  return (subject instanceof HTMLVideoElement
      || subject instanceof HTMLImageElement
      || subject instanceof HTMLCanvasElement
      || subject instanceof ImageData
      || subject instanceof tf.Tensor)
}

/**
 * Helper function to extract a data source from a variety of formats.
 *
 * @param {*} subject
 *  - a supported element (HTMLVideoElement, HTMLImageElement, HTMLCanvasElement, ImageData).
 *  - a p5.js image, which is an object with an `elt` or `canvas` property.
 *  - other types are accepted, but will return `undefined`
 *
 * @return {TfImageSource | undefined} returns a valid source, or undefined if no such source can be found
 */
export const extractImageElement = (subject: any): TfImageSource | undefined => {
  // return valid elements
  if (isInstanceOfSupportedElement(subject)) {
    return subject;
  }
  // Handle p5.js image
  else if (subject && typeof subject === 'object') {
    if (isInstanceOfSupportedElement(subject.elt)) {
      return subject.elt;
    } else if (isInstanceOfSupportedElement(subject.canvas)) {
      return subject.canvas;
    }
  }
  // will return undefined if no valid source
  return undefined;
}

// TODO: generalize to support method calls
export class InvalidVideoArgError extends TypeError {
    public readonly arg: any;

    constructor(arg: any, i?: number) {
        const message = `Invalid argument passed to model constructor${i === undefined ? '.' : `in position ${i} (zero-indexed).`}.
      Received value: ${String(arg)}.
      Argument must be one of the following types: an HTML video element, a p5 video element, an options object, a callback function.`;
        super(message);
        this.name = 'InvalidVideoArgumentError';
        this.arg = arg;
    }
}

/**
 * Helper utility to parse an array of arguments into known properties
 *
 * All properties are optional as an arguments might be missing
 * @property {TfImageSource} [image]
 * @property {HTMLVideoElement} [video]
 * @property {Object} [options]
 * @property {function} [callback]
 */
export class ImageModelArgs<Callback extends Function, Options extends object = {}> {

    image?: TfImageSource;
    video?: HTMLVideoElement;
    options?: Options;
    callback?: Callback;

    /**
     * Arguments used to CREATE an image-based model can be:
     *  - video: an HTMLVideoElement or p5 video.
     *  - options: an object of options specific to this model.
     *  - callback: a function to run once the model has been loaded. Called with arguments (error) or (error, result).
     *
     * Arguments used to CALL a method an image-based model can be:
     *  - image: an image or video element or an ImageData object.  Valid types: HTMLImageElement, HTMLCanvasElement,
     *    HTMLVideoElement, ImageData, p5 image, p5 video.
     *  - options: an object of options specific to this model.
     *  - callback: a function to run once the method has been completed.
     *
     * Expected to be provided in order video/image, options, callback with any omitted.
     * This function does not actually require any particular order.
     *
     *  @param {(ImageArg | VideoArg | Object | function)[]} args
     */
    constructor(...args: Array<Callback | Options | ImageArg | VideoArg>) {
        args.forEach(this.addArg);
    }

    /**
     * Can add arguments through the constructor or at any time after construction.
     *
     * @param {(ImageArg | VideoArg | Object | function)} arg - a video, callback, or options object
     * @param {number} [index] - optional number used in error messages
     */
    addArg(arg: Callback | Options | ImageArg | VideoArg, index?: number): void {
        // skip over falsey arguments and don't throw any error, assuming that these are omissions
        // do this check first to prevent accessing properties on null, which is an object
        if ( ! arg ) {
            return;
        }
        if (typeof arg === "function") {
            this.callback = arg;
        } else if (typeof arg === "object") {
            // Videos are also images, but images are not all videos,
            // so keep separate properties but store videos in both
            const image = extractImageElement(arg);
            if (image) {
                this.image = image;
                if ( image instanceof HTMLVideoElement ) {
                    this.video = image;
                }
            }
            // objects which are not images are assumed to be options
            else {
                this.options = arg;
            }
        } else {
            // Notify user about invalid arguments (would be ok to just skip)
                throw new InvalidVideoArgError(arg, index);
        }
    }
}