nqminds / nqm-iot-database-py

https://nqminds.github.io/nqm-iot-database-py/

Python type to Javascript type conversion #12

Open mereacre opened 5 years ago

mereacre commented 5 years ago

We need to convert from Javascript types:

Data type          String
Int8Array          "int8"
Int16Array         "int16"
Int32Array         "int32"
Uint8Array         "uint8"
Uint16Array        "uint16"
Uint32Array        "uint32"
Float32Array       "float32"
Float64Array       "float64"
Array              "array"
Uint8ClampedArray  "uint8_clamped"
Buffer             "buffer"
Other              "generic"

to Python types.
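
One possible way to express the table above on the Python side is a lookup from the type string to a NumPy dtype. This is only a sketch: the names JS_TO_NUMPY_DTYPE and numpy_dtype_for are hypothetical and not part of nqm-iot-database-py, and the choices for "array", "uint8_clamped", "buffer" and "generic" are assumptions, since NumPy has no direct equivalent for them.

```python
import numpy as np

# Hypothetical lookup table; not part of nqm-iot-database-py.
JS_TO_NUMPY_DTYPE = {
    "int8": np.int8,
    "int16": np.int16,
    "int32": np.int32,
    "uint8": np.uint8,
    "uint16": np.uint16,
    "uint32": np.uint32,
    "float32": np.float32,
    "float64": np.float64,
    # Assumption: a plain JS Array holds JS numbers, which are doubles.
    "array": np.float64,
    # Assumption: NumPy has no clamped integer type, so use plain uint8.
    "uint8_clamped": np.uint8,
    # Assumption: a Buffer is treated as raw bytes.
    "buffer": np.uint8,
}


def numpy_dtype_for(js_type):
    """Return a NumPy dtype for a type string, falling back to object."""
    # Assumption: "generic" and any unknown string map to the object dtype.
    return np.dtype(JS_TO_NUMPY_DTYPE.get(js_type, object))
```

For example, numpy_dtype_for("int16") gives a 2-byte signed integer dtype, while numpy_dtype_for("generic") falls back to the object dtype.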

mereacre commented 5 years ago

Adds the matrix datatype.

To use it, define your field with __tdxType: ["ndarray"] and pass a numpy.array to addData, i.e.:

from nqm.iotdatabase.database import Database
import numpy as np
db = Database("", "memory", "w+")
db.createDatabase(schema={
  "dataSchema":{
    "k": {"__tdxType": ["number"]},
    "a": {"__tdxType": ["ndarray"]}, #new ndarray type
  },
  "uniqueIndex": [{"asc":"k"}]
})
db.addData([
  {"k": 0, "a": np.array([1, 2])},
  {"k": 1, "a": np.array([3, 4])},
  {"k": 2, "a": np.array([[5, 6], [7, 9]])}, # 2d array
])

Each ndarray is stored as a JSON string in the dataset, ie:

example = {
  "t": "=h", // numpy typestring, h means signed 16-bit int, = means native byte order
  "s": [766, 480], // shape of array, means 766 x 480 (2d)
  "v": "f", // metadata version, f means p is a pointer to an uncompressed binary file
  "c": arr.strides, // the value of the strides for the numpy array arr
  // filename can be anything, but currently it is being generated by
  //   vvvvvvvvvvvvv - base64 unix timestamp in ms, so files sort alphabetically in chronological order
  //                vvvvvvvv - pseudorandom prefix to avoid clashes if there is the same timestamp
  //                        vvvv - static suffix
  "p": "AAABaBQNuQI=s8ffou_6.dat" // binary file location
  // when loading the filename, the path /path/to/database.suffix + .d/ are prepended to it
  // ie, if p is example.dat, and db is test.sqlite, we will load test.sqlite.d/example.dat
}
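
To make the metadata layout above concrete, here is a rough round-trip sketch. It is not the library's actual implementation: save_ndarray and load_ndarray are hypothetical helpers, "t" is written using NumPy's own dtype string (e.g. "<i2") rather than "=h", and "c" is assumed to hold strides counted in elements, as proposed further down in this thread.

```python
import json
import os
import tempfile

import numpy as np


def save_ndarray(arr, db_path, filename):
    """Write arr to <db_path>.d/<filename>; return its JSON metadata string."""
    arr = np.ascontiguousarray(arr)  # simplification: always store C-order
    data_dir = db_path + ".d"
    os.makedirs(data_dir, exist_ok=True)
    arr.tofile(os.path.join(data_dir, filename))  # raw, uncompressed bytes
    return json.dumps({
        "t": arr.dtype.str,          # NumPy typestring, e.g. "<i2"
        "s": list(arr.shape),        # shape
        "v": "f",                    # p points to an uncompressed binary file
        # Assumption: strides stored in elements, not bytes.
        "c": [s // arr.itemsize for s in arr.strides],
        "p": filename,               # path relative to <db_path>.d/
    })


def load_ndarray(meta_json, db_path):
    """Rebuild the array described by meta_json from <db_path>.d/."""
    meta = json.loads(meta_json)
    dtype = np.dtype(meta["t"])
    flat = np.fromfile(os.path.join(db_path + ".d", meta["p"]), dtype=dtype)
    byte_strides = tuple(s * dtype.itemsize for s in meta["c"])
    return np.ndarray(shape=tuple(meta["s"]), dtype=dtype,
                      buffer=flat, strides=byte_strides)


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "test.sqlite")
    original = np.array([[5, 6], [7, 9]], dtype="int16")
    meta_json = save_ndarray(original, db_path, "example.dat")
    restored = load_ndarray(meta_json, db_path)
    print(np.array_equal(original, restored))
```

With db_path set to test.sqlite and p set to example.dat, the data ends up in test.sqlite.d/example.dat, matching the path rule described in the comments above.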

@aloisklink Instead of setting ["c"] to true or false it is better to set it to the actual stride value using arr.strides. It will be easier to translate it to javascript.

For example:

import numpy as np
X = np.array([[0,1,2],[3,4,5]], dtype='int16')
print(X.strides) # result: (6, 2), strides in bytes
c = tuple(s // X.itemsize for s in X.strides) # strides in elements, X.itemsize is 2 for int16
print(c) # result: (3, 1)

aloisklink commented 5 years ago

It's pretty easy to convert it (so we keep the metadata as compressed as possible)

Where shape = [s(0), s(1),...,s(x-1), s(x)]

Note for C-order you ignore the first shape dimension, and for F-order, you ignore the last shape dimension.

If C-order, strides = [s(1)*s(2)*...*s(x), ..., s(x-1)*s(x), s(x), 1]

If F-order, strides = [1, s(0), s(0)*s(1), ..., s(0)*s(1)*...*s(x-1)]

Sorry for this mess, I'm writing this from my phone. I can add the JS code for this on Wednesday if you want.
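
The shape-to-strides rule described in this comment can be sketched in Python as follows. The function name strides_from_shape is hypothetical, and the strides are counted in elements (not bytes), as in the X.strides / sizeof(int16) example earlier in the thread.

```python
import numpy as np


def strides_from_shape(shape, order="C"):
    """Return element strides for a contiguous array of the given shape."""
    # C-order: the last axis varies fastest; F-order: the first axis does.
    dims = reversed(shape) if order == "C" else shape
    strides = []
    step = 1
    for dim in dims:
        strides.append(step)  # stride of this axis = product of faster axes
        step *= dim
    if order == "C":
        strides.reverse()
    return tuple(strides)


# Cross-check against NumPy, which reports strides in bytes.
arr = np.zeros((4, 5, 6), dtype="int16")
numpy_strides = tuple(s // arr.itemsize for s in arr.strides)
print(strides_from_shape(arr.shape, "C") == numpy_strides)
```

Because the strides follow from the shape and the order, the metadata would only need a C-order/F-order flag rather than the full stride list, which is the compression argument made above.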
