jeeliz / jeelizAR

JavaScript object detection lightweight library for augmented reality (WebXR demos included). It uses convolutional neural networks running on the GPU with WebGL.
https://jeeliz.com
Apache License 2.0

JavaScript/WebGL lightweight object detection and tracking library for WebAR

WARNING: this repository is deprecated and not maintained anymore. Please use WebAR.rocks.object instead.


Standalone AR Coffee - Enjoy a free coffee offered by Jeeliz!
The coffee cup is detected and a 3D animation is played in augmented reality.
This demo only relies on JeelizAR and THREE.JS.

Table of contents

Features

Here are the main features of the library:

Architecture

Demonstrations

These are some demonstrations of this library. Some require a specific setup.

You can subscribe to the Jeeliz YouTube channel or to the @StartupJeeliz Twitter account to be kept informed of our cutting-edge developments.

If you have made an application or a fun demonstration using this library, we would love to see it and insert a link here! Contact us on Twitter @StartupJeeliz or LinkedIn.

Standard browser demos

These demonstrations work in a standard web browser. They only require webcam access.

WebXR viewer demos

To run these demonstrations, you need a web browser implementing WebXR. We hope it will be implemented soon in all web browsers!

Then you can run these demos:

8thWall demos

These demos run in a standard web browser on mobile or tablet. They rely on the amazing 8th Wall AR engine. We use the web version of the engine and we started from the THREE.JS web sample. The web engine is not released publicly yet, so you need to:

The demo:

Specifications

Get started

The most basic integration example of this library is the first demo, the debug detection demo. In index.html, we include in the <head> section the main library script, /dist/jeelizAR.js, the MediaStream API (formerly called the getUserMedia API) helper, /helpers/JeelizMediaStreamAPIHelper.js, and the demo script, demo.js:

<script src = "../../dist/jeelizAR.js"></script>
<script src = "../../helpers/JeelizMediaStreamAPIHelper.js"></script>
<script src = "demo.js"></script>

In the <body> section of index.html, we put a <canvas> element which will be used to initialize the WebGL context used by the library for deep learning computation, and to possibly display a debug rendering:

<canvas id = 'debugJeeARCanvas'></canvas>

Then, in demo.js, we get the webcam video feed after the page has loaded, using the MediaStream API helper:

// DOMVIDEO is the <video> element used as input
JeelizMediaStreamAPIHelper.get(DOMVIDEO, init, function(){
  alert('Cannot get video bro :(');
}, {
  video: true, // mediaConstraints
  audio: false
})

You can replace this part with a static video, and you can also provide media constraints to specify the video resolution, as sketched below.
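A minimal sketch, assuming the helper forwards this dictionary to navigator.mediaDevices.getUserMedia() (the constraint values below are standard MediaStream API options, not specific to this library):

// Sketch: request a 1280x720 feed from the rear camera.
// These constraints are standard MediaStream API options.
JeelizMediaStreamAPIHelper.get(DOMVIDEO, init, function(){
  alert('Cannot get video bro :(');
}, {
  video: {
    width: {ideal: 1280},
    height: {ideal: 720},
    facingMode: 'environment' // prefer the rear camera on mobile
  },
  audio: false
})

When the video feed is captured, the callback function init is launched. It initializes the library: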

function init(){

  JEEARAPI.init({
    canvasId: 'debugJeeARCanvas',
    video: DOMVIDEO,
    callbackReady: function(errLabel){
      if (errLabel){
alert('An error happens bro: ' + errLabel);
      } else {
        load_neuralNet();
      }
    }
  });

}

The function load_neuralNet loads the neural network model:

function load_neuralNet(){
  JEEARAPI.set_NN('../../neuralNets/basic4.json', function(errLabel){
    if (errLabel){
      console.log('ERROR: cannot load the neural net', errLabel);
    } else {
      iterate();
    }
  }, options); // options: an optional settings dictionary
}

Instead of giving the URL of the neural network, you can also give the parsed JSON object, as sketched below.
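For instance, a minimal sketch that fetches and parses the model manually before handing the object to set_NN (the fetch step is our own illustration; only the fact that set_NN accepts a parsed object comes from the library):

// Sketch: load and parse the model yourself, then pass the object.
fetch('../../neuralNets/basic4.json')
  .then(function(response){ return response.json(); })
  .then(function(NNJson){
    JEEARAPI.set_NN(NNJson, function(errLabel){
      if (errLabel){
        console.log('ERROR: cannot load the neural net', errLabel);
      } else {
        iterate();
      }
    });
  });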

The function iterate starts the iteration loop:

function iterate(){
  var detectState = JEEARAPI.detect(3); // run up to 3 detections this loop
  if (detectState.label){
    console.log(detectState.label, 'IS DETECTED YEAH !!!');
  }
  window.requestAnimationFrame(iterate);
}

Initialization arguments

The JEEARAPI.init function takes a dictionary as its argument, with these properties:

The Detection function

The function which triggers the detection is JEEARAPI.detect(<int>nDetectionsPerLoop, <videoFrame>frame, <dictionary>options).

The detection function returns an object, detectState. For optimization purposes, it is returned by reference, not by value. It is a dictionary with these properties:
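Since detectState is returned by reference, copy any value you want to keep before the next call to detect overwrites it. A minimal sketch (only the label property is shown, as used in the demo above):

// Sketch: detectState is returned by reference, so copy values
// you need to keep; the next detect() call will overwrite them.
var lastLabel = null;
function update(){
  var detectState = JEEARAPI.detect(3);
  if (detectState.label && detectState.label !== lastLabel){
    lastLabel = detectState.label; // copy the value, not the reference
    console.log('Newly detected:', lastLabel);
  }
  window.requestAnimationFrame(update);
}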

Other methods

WebXR integration

The principal code of the WebXR demos is directly in the index.html files. The 3D part is handled by THREE.JS. The starting point of the demos is the examples provided with the WebXR Viewer by the Mozilla Foundation (see their GitHub repository of demos).

We use Jeeliz AR through a specific helper, helpers/JeelizWebXRHelper.js, and we strongly advise using this helper for your WebXR demos. With the iOS implementation, it handles the video stream conversion (the video stream is provided as YCbCr buffers; we take only the Y buffer and apply a median filter to it).

Error codes

Video cropping

Video crop parameters can be provided. Cropping works only if the input element is a <video> element. By default, there is no video cropping (the whole video image is taken as input).

The dictionary videoCrop is either false (no video cropping) or has the following parameters:
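A purely illustrative sketch follows; the parameter names (x, y, w, h) and the way the dictionary is handed to the API are assumptions, not confirmed by this README:

// Hypothetical sketch: crop a centered 512x512 square from the video.
// The x, y, w, h names and the options key are placeholders.
var videoCrop = {
  x: (DOMVIDEO.videoWidth - 512) / 2,  // left offset in pixels (assumed)
  y: (DOMVIDEO.videoHeight - 512) / 2, // top offset in pixels (assumed)
  w: 512, // crop width in pixels (assumed)
  h: 512  // crop height in pixels (assumed)
};
JEEARAPI.set_NN('../../neuralNets/basic4.json', callbackNNLoaded, {
  videoCrop: videoCrop // passing through the set_NN options is an assumption
});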

Scan settings

Scan settings can be provided:

The dictionary scanSettings has the following properties:
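As a purely hypothetical sketch of the shape of such a call (the property name below is a placeholder for illustration; check the library source for the real keys):

// Hypothetical sketch: 'nScaleLevels' is a placeholder property name.
var scanSettings = {
  nScaleLevels: 3 // e.g. number of scales swept by the detection window (placeholder)
};
JEEARAPI.set_NN('../../neuralNets/basic4.json', callbackNNLoaded, {
  scanSettings: scanSettings // passing through the set_NN options is an assumption
});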

Hosting

The demonstrations should be hosted on a static HTTPS server with a valid certificate. Otherwise WebXR or the MediaStream API may not be available.

Be careful to enable gzip compression at least for JSON files. The neural network models can be quite heavy, but fortunately they compress well with gzip.
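As one possible setup among many, here is a sketch of a small Node.js static server with gzip and HTTPS enabled (the express and compression npm packages are choices of this example, not requirements of the library):

// Sketch: HTTPS static server with gzip compression.
// Requires: npm install express compression
const express = require('express');
const compression = require('compression');
const https = require('https');
const fs = require('fs');

const app = express();
app.use(compression());       // gzip responses, including the heavy .json models
app.use(express.static('.')); // serve the repository root

https.createServer({
  key: fs.readFileSync('server.key'),  // your TLS private key
  cert: fs.readFileSync('server.crt')  // your valid certificate
}, app).listen(443);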

Using the ES6 module

/dist/jeelizAR.module.js is exactly the same as /dist/jeelizAR.js, except that it works as an ES6 module, so you can import it directly using:

import 'dist/jeelizAR.module.js'

Neural network models

We provide several neural network models in the /neuralNets/ path, and we will regularly add new ones to this Git repository. We can also provide specific neural network training services. Please contact us here for pricing and details. The available models are:

model file                | detected labels             | input size | detection cost | reliability | remarks
--------------------------|-----------------------------|------------|----------------|-------------|--------
basic4.json               | CUP, CHAIR, BICYCLE, LAPTOP | 128×128px  | **             | **          |
basic4Light.json          | CUP, CHAIR, BICYCLE, LAPTOP | 64×64px    | *              | *           |
cat.json                  | CAT                         | 64×64px    | ***            | ***         | detects cat faces
sprite0.json              | SPRITECAN                   | 128×128px  | ***            | ***         | standalone network (6D detection)
ARCoffeeStandalone01.json | CUP                         | 64×64px    | **             | ***         | standalone network (6D detection)

The input size is the resolution of the network's input image. The detection window is not static: it slides across the video both in position and in scale. If you use this library with WebXR on iOS, the video resolution will be 480×270 pixels, so a 64×64 pixel input is enough. With a 128×128 pixel input model, for example, the input image would often need to be enlarged before being fed to the network.

About the tech

Under the hood

This library uses Jeeliz WebGL Deep Learning technology to detect objects. The neural network is trained using a 3D engine and a dataset of 3D models. Everything is processed client-side.

Compatibility

If a compatibility error is triggered, please post an issue on this repository. If the problem concerns webcam access, please first retry after closing all applications that could be using your device (Skype, Messenger, other browser tabs and windows, ...). Please include:

License

Apache 2.0. This application is free for both commercial and non-commercial use.

We appreciate attribution: include the Jeeliz logo and a link to the Jeeliz website in your application or website. Of course we do not expect a large link to Jeeliz over your AR application, but if you can put the link in the credits/about/help/footer section, that would be great.

See also

Jeeliz's main face detection and tracking library is the Jeeliz FaceFilter API. It handles multi-face tracking, and for each tracked face it provides the rotation angles and the mouth opening factor. It is perfect for building your own Snapchat/MSQRD-like face filters running in the browser. It comes with dozens of integration demos, including a face swap.

Our deep learning based library Weboji detects 11 facial expressions in real time from the webcam video feed. They are then reproduced on an avatar, either in 3D with a THREE.JS renderer or in 2D with an SVG renderer (so you can use it even if you are not a 3D developer). You can access the GitHub repository here.

If you just want to detect whether the user is looking at the screen, Jeeliz Glance Tracker is what you are looking for. It can be useful for playing or pausing a video depending on whether the user is watching. This library needs fewer resources, and its neural network file is much lighter.

If you want to use this library for glasses virtual try-on (sunglasses, spectacles, ski masks), you can take a look at the Jeeliz VTO widget. It includes a high quality and lightweight 3D engine which implements deferred shading, PBR, raytraced shadows, normal mapping, and more. It also reconstructs the lighting environment around the user (ambient and directional lighting). But the glasses come from a database hosted on our servers; if you want to add some models, please contact us.
