reconstrue / cuboids

Cloud-native neuroscientific volumetric cuboids processed over HTTP
http://reconstrue.com
Apache License 2.0

Reconstrue Cuboids

Table of Contents

  1. Introduction
  2. Data Structure
  3. Scope
  4. Rationale

Introduction

Reconstrue Cuboids integrates various open-source codebases that deal with volumetric imaging data generated by neuroscience experiments. All code used in Reconstrue Cuboids projects is licensed in a commercially friendly manner, including those parts written by Reconstrue. This means that any GPL-licensed code is right out.

Although the main technical focus is on APIs and interfaces, the core distinguishing implementation detail of Reconstrue Cuboids is its reliance on AWS serverless technologies, which enabled bossDB's success in the MICrONS project. Cuboids continues what bossDB started and doubles down on the AWS lock-in, for now.

Data Structure

As the name implies, the core data abstraction is cuboids. A cuboid, as defined by Wolfram, is a "closed box composed of three pairs of rectangular faces placed opposite each other and joined at right angles to each other."

In this project, identically shaped cuboids are used to tessellate a 3D volume. A voxel is an experiment's indivisible unit cuboid; larger cuboids are assemblies of voxels. Cuboid data can be anisotropic (see https://www.dentalcare.com/en-us/professional-education/ce-courses/ce531/voxel), as required by various experimental modalities. The tiling unit is an anisotropic voxel (ergo, "Cuboids" rather than "Cubes").
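
As a rough illustration of the tessellation described above, the following Python sketch tiles a volume with identically shaped cuboids and maps a voxel address to the cuboid that contains it. The class name `CuboidGrid`, its methods, and all shapes and voxel sizes are hypothetical, chosen only for illustration; none of them come from bossDB or any other codebase listed here.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class CuboidGrid:
    """Hypothetical helper: tiles a volume with identically shaped cuboids."""
    volume_shape: tuple   # volume extent in voxels, (x, y, z)
    cuboid_shape: tuple   # cuboid extent in voxels, (x, y, z)
    voxel_size_nm: tuple  # physical voxel size; anisotropic when axes differ

    def grid_shape(self):
        # Cuboids needed along each axis; the last one may be only partly filled.
        return tuple(ceil(v / c) for v, c in zip(self.volume_shape, self.cuboid_shape))

    def cuboid_index(self, voxel):
        # Index of the cuboid that contains the given voxel address.
        return tuple(p // c for p, c in zip(voxel, self.cuboid_shape))

    def cuboid_size_nm(self):
        # Physical extent of one cuboid along each axis.
        return tuple(c * s for c, s in zip(self.cuboid_shape, self.voxel_size_nm))


# Illustrative numbers only: z resolution is 10x coarser than x and y.
grid = CuboidGrid(volume_shape=(2048, 2048, 100),
                  cuboid_shape=(512, 512, 16),
                  voxel_size_nm=(4, 4, 40))
print(grid.grid_shape())                  # (4, 4, 7)
print(grid.cuboid_index((1000, 30, 47)))  # (1, 0, 2)
print(grid.cuboid_size_nm())              # (2048, 2048, 640)
```

With these made-up numbers, a cuboid with fewer voxels along z (16 versus 512) still spans a comparable physical depth, which is one reason a tiling unit might be a cuboid rather than a cube.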

Cuboids as defined by bossDB are "multi-channel 3-dimensional image volumes", which means the core data structure has a 4-dimensional address space: three of the dimensions are a voxel's address in space (X, Y, and Z), and the final dimension is a 1-D array of channel values, i.e. various attributes of a given voxel.

A cuboid as defined by Neuroglancer's precomputed format is similarly shaped: "Each subvolume is conceptually a 4-dimensional [x, y, z, channel] array."

Clearly, a 4-dimensional array is a data structure that can represent neuroimaging datasets in implementations both dynamic (bossDB's RESTful service) and static (Neuroglancer's precomputed cuboids format). Therefore, Reconstrue Cuboids is software that deals with 4-dimensional cuboids.

Note: it is not necessary to think of the data structure as a tesseract, but rather simply as a tiling of a 3D cuboid representing a volume of physical 3D space. Each voxel can have multiple attributes, a.k.a. channels.
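
To make the [x, y, z, channel] address space concrete, here is a minimal sketch of a cuboid held in a flat buffer using only the Python standard library. The `Cuboid` class, the 16-bit value type, and the memory order (channel varying fastest) are assumptions made for this example; real formats such as bossDB's or Neuroglancer's precomputed format define their own layouts and encodings.

```python
from array import array


class Cuboid:
    """Hypothetical in-memory cuboid: a 4D [x, y, z, channel] array in a flat buffer."""

    def __init__(self, shape, channels):
        self.shape = shape        # (x, y, z) extent in voxels
        self.channels = channels  # number of attributes stored per voxel
        nx, ny, nz = shape
        # Unsigned 16-bit values, zero-initialised (an assumed value type).
        self.data = array("H", [0]) * (nx * ny * nz * channels)

    def _offset(self, x, y, z):
        nx, ny, nz = self.shape
        if not (0 <= x < nx and 0 <= y < ny and 0 <= z < nz):
            raise IndexError("voxel address out of range")
        # Channel varies fastest, so one voxel's channel values are contiguous.
        return ((x * ny + y) * nz + z) * self.channels

    def get(self, x, y, z):
        # Return the 1-D array of channel values at a spatial address.
        start = self._offset(x, y, z)
        return list(self.data[start:start + self.channels])

    def set(self, x, y, z, values):
        if len(values) != self.channels:
            raise ValueError("expected one value per channel")
        start = self._offset(x, y, z)
        self.data[start:start + self.channels] = array("H", values)


cuboid = Cuboid(shape=(4, 4, 2), channels=3)
cuboid.set(1, 2, 1, [10, 20, 30])
print(cuboid.get(1, 2, 1))  # [10, 20, 30]
print(cuboid.get(0, 0, 0))  # [0, 0, 0]
print(len(cuboid.data))     # 96
```

The spatial address selects a voxel and the fourth index selects one of its attributes, so no 4D geometry is involved: the structure remains a 3D tiling with a small vector at each position.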

Scope

Reconstrue Cuboids is not limited to any specific implementation. The subject is software that can handle the massive scale and velocity of data required by trends in neuroscience. For example, the MICrONS project involved a single experimental dataset on one cubic millimeter of brain, which resulted in ~2.5 petabytes of data.

The focus is on the data structure, which is 4D cuboids. Implementations range from static file representations (e.g. cloud-volume's precomputed fixed cuboids stored in static files) to RESTful services (e.g. bossDB, which can serve up cuboids of arbitrary coordinates and scale).
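
One way to relate the two styles is to ask which fixed cuboids an arbitrary request touches. The sketch below is not bossDB's or cloud-volume's actual implementation; the function `cuboids_for_cutout` and its arguments are hypothetical, and non-negative voxel coordinates are assumed. It only shows the index arithmetic that serving an arbitrary cutout from fixed-size cuboids would require in some form.

```python
from itertools import product


def cuboids_for_cutout(start, stop, cuboid_shape):
    """Yield (cuboid_index, local_start, local_stop) for each fixed cuboid
    that overlaps the half-open voxel box [start, stop)."""
    per_axis = []
    for lo, hi, size in zip(start, stop, cuboid_shape):
        if lo >= hi:
            return  # empty request: nothing to yield
        pieces = []
        for idx in range(lo // size, (hi - 1) // size + 1):
            origin = idx * size
            # Clip the request to this cuboid, in cuboid-local coordinates.
            pieces.append((idx,
                           max(lo, origin) - origin,
                           min(hi, origin + size) - origin))
        per_axis.append(pieces)
    # Combine the per-axis pieces into 3D blocks.
    for combo in product(*per_axis):
        yield (tuple(p[0] for p in combo),
               tuple(p[1] for p in combo),
               tuple(p[2] for p in combo))


# A 30 x 10 x 10 voxel request that straddles cuboid boundaries in x and z.
parts = list(cuboids_for_cutout(start=(500, 0, 10),
                                stop=(530, 10, 20),
                                cuboid_shape=(512, 512, 16)))
print(len(parts))  # 4
print(parts[0])    # ((0, 0, 0), (500, 0, 10), (512, 10, 16))
print(parts[-1])   # ((1, 0, 1), (0, 0, 0), (18, 10, 4))
```

A static-file client would fetch each listed cuboid and slice it locally, whereas a RESTful service could perform the same assembly on the server side and return only the requested box.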

Some of the software can handle petabytes of data on commercial cloud services (e.g. bossDB on AWS), while some of it runs only on a single machine (e.g. bossphorus).

By focusing on data structures, wire protocols, and APIs, various solutions can be developed that suit different hardware deployments and scale when necessary.

The following are codebases that have been adopted into Reconstrue Cuboids:

Codebase        License
bossDB          Apache 2.0
chunkflow       Apache 2.0
cloud-volume    BSD 3-Clause
ingest-client   Apache 2.0
intern          Apache 2.0
SABER           Apache 2.0
JupyterHub      BSD 3-Clause
bossphorus      Apache 2.0

Rationale

Most of the software labeled "Reconstrue Cuboids" is actually third-party, i.e. written by people who are not part of the organization, Reconstrue. Initially, "Reconstrue Cuboids" is simply a label on various source code denoting that it has been exercised by Reconstrue to see whether it is suitable for the task. These are tools that have been used together in projects. As such, Reconstrue's early contribution is primarily in working out component integration, in aspects both technical and legal. For example, one project involves a system that uses bossDB with chunkflow to reconstruct neurons from brightfield microscope image stacks.

Beyond tactical progress on any given project, the strategic community value of this Reconstrue Cuboids exercise is that corporate entities (Reconstrue included) can build upon the code without having to consult lawyers up front (of course, talk to your lawyers; no warranty implied herein). The main goal is to increase the rate of open-source innovation in this field. For more on the rationale behind this, see Reconstrue Stack.