kebasaa / SCIO-read

Read data from the SCIO spectrometer
GNU General Public License v3.0

Read the SCiO spectrometer built by Consumer Physics

In this small project, I'm trying to create a Python library to scan and interpret data from the SCiO spectrometer. As I'm not very experienced, this will first be a documentation effort for the device, and hopefully the code will work in the future. Any input and help is appreciated.

IMPORTANT, I NEED YOUR HELP: The SCiO sends raw measurements to a server online as bytes coded in Base64. The server then returns the data as JSON. However, it is unclear how the 1800 bytes of the sample reading, 1800 bytes of sampleDark and 1656 bytes of sampleGradient are turned into 331 float values representing a normalised reflectance spectrum from 0-1. If you have any insights, please let me know.

Further: It appears that the raw bytes reported by the SCiO don't correspond to the Base64 string sent to the Consumer Physics server. There might be some encryption happening in between these stages, can someone help decrypt it based on this information? See 02_extract_log_scan.ipynb to obtain the data from log files or check the folder 01_rawdata/log_extracted/

DISCLAIMER: All this code is experimental. I am trying to reverse-engineer the device in order to read the reflectance spectrum, and any help is appreciated! Scan data can't currently be fully decoded; this is an area where help is particularly appreciated.

Changelog

Usage of the code in this repository

  1. Connect the SCiO to your computer (currently supports Windows) through USB
  2. Run the code in 01_scio_usb.ipynb to calibrate and scan. This can save files with raw data
  3. You can extract scan data from logs using 02_extract_log_scan.ipynb. Interestingly, the raw bytes reported by the SCiO don't correspond to the Base64 string sent to the server. Is some encryption happening between these stages?
  4. Run 03_scio_analyse_devel.ipynb to attempt to decode the data

NOTE: Base64 data generated from parsed log files is different from that generated by scans. The Base64 type generated by scans is urlsafe, while the one in the log files is not.
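The two Base64 variants differ only in two characters of the alphabet (the URL-safe variant uses `-` and `_` where the standard one uses `+` and `/`), so both can be decoded with Python's standard `base64` module. The following is a minimal sketch of a decoder that accepts either variant. The helper name `decode_any_base64` is hypothetical and not part of this repository, and the example payload is made up rather than real SCiO data.

```python
import base64

def decode_any_base64(text: str) -> bytes:
    """Decode a Base64 string in either the standard or the URL-safe alphabet."""
    # Map the URL-safe characters onto the standard alphabet
    normalised = text.strip().replace("-", "+").replace("_", "/")
    # Restore padding in case it was stripped
    normalised += "=" * (-len(normalised) % 4)
    return base64.b64decode(normalised)

payload = bytes([0xFB, 0xFF, 0xBE, 0x01])  # made-up bytes, not real SCiO data
standard = base64.b64encode(payload).decode()          # as found in the log files
urlsafe = base64.urlsafe_b64encode(payload).decode()   # as generated by scans

# Both encodings decode to the same bytes
assert decode_any_base64(standard) == decode_any_base64(urlsafe) == payload
```

Normalising to one alphabet before decoding makes it possible to compare log data and scan data byte by byte.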

Documentation of the SCiO device

The following is an attempt to document as much as possible of the SCiO's functioning for the reverse-engineering effort. Any additional information is appreciated.

Hardware & device specifications

The specs are rather badly documented. The following information is known so far:

Measurement principle

The SCiO illuminates the sample with a light and measures the reflected light at a number of wavelengths. This measured spectrum is then used with large online databases to identify the content of the sample. The code and documentation in this repository aim to gain access to raw scan data for research purposes; access to the online tools is not an aim.

Based on US patent US9377396B2, each resulting scan is likely to be an image, not a spectrum directly. Another US patent, US10330531B2, confirms that the raw data is both compressed and encrypted: "… the compressed encrypted raw data signal can be transmitted via Bluetooth to the handheld device. Compression of raw data may be necessary since raw intensity data will generally be too large to transmit via Bluetooth in real time. … The data generated by the optical system described herein typically contains symmetries that allow significant compression of the raw data into much more compact data structures". This data is analysed by the online server in order to provide a spectrum.

According to Consumer Physics (from their forums), the SCiO app with a developer license (which I don't have) can output raw data as CSV divided into three parts: the spectrum, wr_raw and sample_raw. The spectrum is the reflectance (R), i.e. how much of the light is reflected back by the sample. sample_raw is the raw signal from the sample (S), and wr_raw is the raw signal from the calibration (C). The reflectance is calculated as R = S/C.
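As an illustration of the R = S/C relation, here is a minimal sketch that divides a sample signal by a calibration signal element by element. It assumes both signals are already decoded into equally long lists of numbers, which is not yet possible with the raw data from this repository. The function name `reflectance` and the input values are made up for the example.

```python
def reflectance(sample_raw, wr_raw):
    """Return R = S / C element-wise for two equally long sequences."""
    if len(sample_raw) != len(wr_raw):
        raise ValueError("sample and calibration must have the same length")
    # Return NaN where the calibration signal is zero instead of dividing by zero
    return [s / c if c != 0 else float("nan") for s, c in zip(sample_raw, wr_raw)]

sample = [120.0, 300.0, 450.0]  # made-up sample signal (S)
white = [400.0, 600.0, 500.0]   # made-up calibration signal (C)
print(reflectance(sample, white))  # [0.3, 0.5, 0.9]
```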

It appears that for every scan, the SCiO measures twice and probably then takes the mean of the two scans. Every SCiO Bluetooth LE message contains 3 parts: sample, sampleDark and sampleGradient (no clue so far what those mean or how to convert them). Calibration is done by scanning the calibration box and comparing a scan with that calibration scan.

Sample identification

Consumer Physics described the process as follows in their forum: The spectrometer breaks down the light to its spectrum (the spectra), which includes all the information required to detect the result of this interaction between the illuminated light and the molecules in the sample. This means that SCiO analyses the overall spectra that is received and, comparing it to different algorithms and information provided, identifies or evaluates it.

For example, if you know the basic spectra of a watermelon, and then see that as the watermelon gets sweeter, meaning it has more sugar content, the spectrum gradually changes in a specific manner, you will be able to build an algorithm in accordance. In recognizing the existence of a specific material, such as ginger, in a sample, you will need to see if the reflectance of the material changes in a specific manner when the ginger is present. Thus, you will need two samples of the material – with and without ginger.

In order to achieve good results, large databases of materials and their properties are necessary. Usually, machine learning assists the identification. For tomatoes, for example, a collection of 40 samples is recommended as a rule of thumb for a feasibility test. However, a comprehensive application should be based on hundreds of samples and thousands of scans.

SCiO communication protocol and BLE (Bluetooth LE) handles

BLE handles

The following BLE commands have been identified so far:

    01ba050000 // inquire battery status
    01ba0e0000 // Ready for WR
    01ba0b0900000000000000000000  // set LED (is this a colour?)
    01ba040000 // inquire device temperature before
    01ba020000 // This is the actual scanning command
    01ba040000 // inquire device temperature after
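The command strings above can be turned into bytes with `bytes.fromhex`. The sketch below also encodes one observation about their layout: every listed command starts with `01 ba`, followed by a command byte and two bytes that look like a little-endian payload length (`00 00` for no payload, `09 00` for the LED command, which is followed by exactly 9 bytes). This layout is an assumption derived only from the commands listed here and is not confirmed. The names `COMMANDS`, `command_bytes` and `build_command` are hypothetical.

```python
import struct

# Command strings copied from the list above
COMMANDS = {
    "battery": "01ba050000",
    "ready_for_wr": "01ba0e0000",
    "set_led": "01ba0b0900000000000000000000",
    "temperature": "01ba040000",
    "scan": "01ba020000",
}

def command_bytes(name: str) -> bytes:
    """Return the raw bytes of a known command."""
    return bytes.fromhex(COMMANDS[name])

def build_command(command_id: int, payload: bytes = b"") -> bytes:
    """Build a command from its id and payload.

    ASSUMPTION (unconfirmed): the layout is 0x01, 0xba, command id,
    16-bit little-endian payload length, then the payload itself.
    """
    return struct.pack("<BBBH", 0x01, 0xBA, command_id, len(payload)) + payload

# The assumed layout reproduces the commands listed above
assert build_command(0x04) == command_bytes("temperature")
assert build_command(0x0B, bytes(9)) == command_bytes("set_led")
```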

USB control

Commands can be sent over USB by writing the same bytes as the BLE commands listed above to the USB serial port. The protocol is the same.

Data protocol

Raw response messages from the SCiO are structured as follows:

| Command (int) | hex | Meaning | handle & message format |
|---|---|---|---|
| -70 | ba | Incoming data protocol | see above |
| 2 | 02 | Data type: spectrum | part of protocol above |
| 4 | 04 | Temperature | contains temperature of chip, CMOS sensor (by Aptina) & object (always 0) |
| 5 | 05 | Battery state | contains charge %, battery health, voltage, etc. |
| 11 | 0b | Set LED status | ? |
| 14 | 0e | Ready for WR | ? |
| -108 | 94 | File list (likely firmware) | file identifiers as integers |
| -111 | 91 | Set device name | |
| -121 | 87 | File header | file headers as integers |
| -123 | 85 | BLE status | |
| -124 | 84 | BLE ID | |
| -125 | 83 | Reset device | |
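The negative integers in the table are the same values as the hex bytes, read as signed 8-bit integers (for example, 0xba is 186 unsigned and -70 signed). A small sketch of the conversion in both directions follows; the function names are hypothetical.

```python
def hex_to_signed(hex_byte: str) -> int:
    """Interpret a two-character hex string as a signed 8-bit integer."""
    return int.from_bytes(bytes.fromhex(hex_byte), "big", signed=True)

def signed_to_hex(value: int) -> str:
    """Convert a signed 8-bit integer to its two-character hex string."""
    return (value & 0xFF).to_bytes(1, "big").hex()

print(hex_to_signed("ba"))   # -70
print(signed_to_hex(-108))   # 94
```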

How the raw scan data can be decoded is currently still unknown.

To help with the reverse-engineering effort, the following data is available:

Instructions for reading raw data (without the script in this repository)

Through USB on the console

  1. Connect the SCIO to your computer with a USB cable, and turn it on

  2. On Linux, open a console and run the following to read data into "file.txt":

    cat /dev/ttyACM0 | hexdump -C > file.txt
  3. In a second console window, type your command with a \x before each byte. For example, for the temperature reading, type:
    echo -n -e "\x01\xba\x04\x00\x00" > /dev/ttyACM0
  4. Wait a moment until the SCiO stops blinking. Then go to the first console window and hit Ctrl+C to stop reading from the serial port. Your readings will now be in "file.txt".
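The write step of the console procedure above can also be sketched in Python. The helper `send_command` below is hypothetical and not part of this repository. It writes a hex-encoded command to any open binary stream, and is demonstrated here with an in-memory buffer rather than a real device. Opening `/dev/ttyACM0` directly from Python, as shown in the trailing comment, is an untested assumption.

```python
import io

def send_command(stream, command_hex: str) -> int:
    """Write a hex-encoded command to an open binary stream.

    Returns the number of bytes written.
    """
    data = bytes.fromhex(command_hex)
    written = stream.write(data)
    stream.flush()
    return written

# Demonstration with an in-memory buffer instead of the real device
fake_port = io.BytesIO()
send_command(fake_port, "01ba040000")  # temperature request
print(fake_port.getvalue().hex())      # 01ba040000

# With a real device on Linux this might look like (untested assumption):
# with open("/dev/ttyACM0", "wb", buffering=0) as port:
#     send_command(port, "01ba040000")
```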

Through bluetooth with gatttool

  1. On Linux, install gatttool and hcitool. On Ubuntu, install them with:
    sudo apt-get install bluez
  2. Turn on your SCiO with a long press on the button

  3. Run hcitool to find out your SCiO's MAC address. It will have a name like SCiOmyScio or whatever you named it:

    sudo hcitool lescan
  4. Run gatttool with your SCiO's MAC address to collect your own data. This will store it in "file1.txt". Replace xx:xx:xx:xx:xx:xx with the MAC address you found in step 3. During the scan, the SCiO indicator light will be yellow.
    sudo gatttool -i hci0 -b xx:xx:xx:xx:xx:xx --char-write-req -a 0x0029 -n 01ba020000 --listen > file1.txt
  5. Stop saving data to your file with Ctrl+C after the indicator light of the SCiO goes back to blue.

  6. In a text editor, edit your file1.txt: remove the first line saying "Characteristic value was written successfully" and remove "Notification handle = 0x0025 value: " from the beginning of each line. Then save the file.
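The manual clean-up of file1.txt could also be done with a short script. The sketch below assumes that gatttool prints each notification value as space-separated hex bytes after the prefix. The function name `parse_gatttool_log` is hypothetical, and the example bytes are made up rather than taken from a real scan.

```python
PREFIX = "Notification handle = 0x0025 value: "

def parse_gatttool_log(text: str) -> list:
    """Return the payload of every notification line as bytes."""
    packets = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith(PREFIX):
            # Skips "Characteristic value was written successfully" and blank lines
            continue
        # bytes.fromhex ignores the spaces between the hex bytes
        packets.append(bytes.fromhex(line[len(PREFIX):]))
    return packets

example = (
    "Characteristic value was written successfully\n"
    "Notification handle = 0x0025 value: ba 04 0c 00 \n"  # made-up bytes
)
print(parse_gatttool_log(example))  # [b'\xba\x04\x0c\x00']
```

To parse a real capture, read the file first, for example with `open("file1.txt").read()`, and pass the text to the function.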

License

Software

This software is distributed under the GPL version 3.

Logos and icons

All logos and icons are trademarks of Consumer Physics.

Credits

My thanks go out to the following people: