In this small project, I'm trying to create a Python library to scan and interpret data from the SCiO spectrometer. As I'm not very experienced, this will start as a documentation effort for the device; hopefully the code will work in the future. Any input and help is appreciated.
IMPORTANT, I NEED YOUR HELP: The SCiO sends raw measurements to a server online as bytes coded in Base64. The server then returns the data as JSON. However, it is unclear how the 1800 bytes of the sample reading, 1800 bytes of sampleDark and 1656 bytes of sampleGradient are turned into 331 float values representing a normalised reflectance spectrum from 0-1. If you have any insights, please let me know.
Further: It appears that the raw bytes reported by the SCiO don't correspond to the Base64 string sent to the Consumer Physics server. There might be some encryption happening between these stages. Can someone help decrypt it based on this information? See 02_extract_log_scan.ipynb to obtain the data from log files or check the folder 01_rawdata/log_extracted/
DISCLAIMER: All this code is experimental. I am trying to reverse-engineer the device in order to read the reflectance spectrum, and any help is appreciated! Scan data can't currently be fully decoded; this is an area where help is particularly appreciated.
NOTE: Base64 data generated from parsed log files is different from that generated by scans. The Base64 type generated by scans is urlsafe, while the one in the log files is not.
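The difference between the two Base64 flavours noted above only affects two characters of the alphabet (`-`/`_` versus `+`/`/`), so both can be decoded by the same routine. The following is a minimal sketch; `decode_scio_base64` is a hypothetical helper name, and tolerating missing `=` padding is an assumption rather than something observed in the data.

```python
import base64


def decode_scio_base64(text: str) -> bytes:
    """Decode Base64 text written in either the URL-safe or the standard alphabet."""
    # Map the two URL-safe characters onto the standard alphabet,
    # so a single decoder handles both scan data and log data.
    normalised = text.strip().replace("-", "+").replace("_", "/")
    # Assumption: padding may have been dropped, so restore it if needed.
    normalised += "=" * (-len(normalised) % 4)
    return base64.b64decode(normalised)


if __name__ == "__main__":
    # The same two bytes, encoded once per alphabet.
    print(decode_scio_base64("-_8=").hex())
    print(decode_scio_base64("+/8=").hex())
```

Decoding only yields the raw bytes; it does not address the possible encryption or compression described elsewhere in this document.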
The following is an attempt to document as much as possible of the SCiO's functioning for the reverse-engineering effort. Any additional information is appreciated.
The specs are rather badly documented. The following information is known so far:
- `AAAAA` (1800 bytes long): Raw spectral data representing light reflected from the sample (or calibration target)
- `AAAAA` (1800 bytes long): Raw spectral data from the SCiO's internal dark current reference, i.e. the background signal when there is no light
- `bgAAA` (1656 bytes long): Raw spectral data from the SCiO's internal white reference when measuring a known white reference
- `"device_id":"8032AB45611198F1"`
- `"sampled_at":"2021-10-20T10:58:58.729+03:00"` (Timestamp of current scan)
- `"sampled_white_at":"2021-10-20T10:53:18.334+03:00"` (Timestamp of calibration scan)
- `"scio_edition":"scio_edition"`
- `"mobile_GPS":{"longitude":-----,"latitude":----,"locality":"-----","country":"-----","admin_area":"-----","address_line":"-----"}` (This information should be private and should not matter for scan analysis, but it is transferred)
- `"mobile_mac_address":"------"` (Phone MAC address. Again, this information should be private and doesn't matter for scan analysis)
- `"i2s_tag_config":"20150812-e:PRODUCTION"` (Seems to be a hardware version)

The SCiO illuminates the sample with a light and measures the reflected light at a number of wavelengths. This measured spectrum is then compared against large online databases to identify the content of the sample. The code and documentation in this repository aim to gain access to raw scan data for research purposes; access to the online tools is not an aim.
Based on US patent US9377396B2, each resulting scan is likely to be an image, not a spectrum directly. Another US patent, US10330531B2, confirms that the raw data is both compressed and encrypted: "… the compressed encrypted raw data signal can be transmitted via Bluetooth to the handheld device. Compression of raw data may be necessary since raw intensity data will generally be too large to transmit via Bluetooth in real time. … The data generated by the optical system described herein typically contains symmetries that allow significant compression of the raw data into much more compact data structures". This data is analysed by the online server in order to provide a spectrum.
According to Consumer Physics (from their forums), the SCiO app with a developer license (which I don't have) can output raw data as CSV divided into three parts: the spectrum, wr_raw and sample_raw. The first part is the reflectance spectrum (R), i.e. how much of the light is reflected back by the sample. The second part is the raw signal from the sample (S), and the third is the raw signal from the calibration (C). Reflectance is calculated as R = S/C.
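The R = S/C relation can be sketched as an element-wise division. `reflectance` is a hypothetical helper, and the example values are made up: they stand in for decoded raw signals, which this project cannot yet produce from the device's bytes. Returning NaN for a zero calibration value is a design choice of this sketch, not documented behaviour.

```python
def reflectance(sample, calibration):
    """Element-wise R = S / C for two equally long sequences of raw signals."""
    if len(sample) != len(calibration):
        raise ValueError("sample and calibration must have the same length")
    # A zero calibration value makes the ratio undefined, so report NaN there.
    return [s / c if c != 0 else float("nan") for s, c in zip(sample, calibration)]


if __name__ == "__main__":
    # Made-up numbers, purely to show the call.
    print(reflectance([1.0, 2.0, 3.0], [2.0, 4.0, 4.0]))
```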
It appears that for every scan, the SCiO measures twice. It probably then takes the mean of the 2 scans. Every SCiO Bluetooth LE message contains 3 parts: sample, sampleDark and sampleGradient (no clue so far what those mean or how to convert them). Calibration is done by scanning the calibration box; each scan is then compared with that calibration scan.
Consumer Physics described the process as follows in their forum: The spectrometer breaks down the light to its spectrum (the spectra), which includes all the information required to detect the result of this interaction between the illuminated light and the molecules in the sample. This means that SCiO analyses the overall spectra that is received and, comparing it to different algorithms and information provided, identifies or evaluates it.
For example, if you know the basic spectra of a watermelon, and then see that as the watermelon gets sweeter, meaning it has more sugar content, the spectrum gradually changes in a specific manner, you will be able to build an algorithm in accordance. In recognizing the existence of a specific material, such as ginger, in a sample, you will need to see if the reflectance of the material changes in a specific manner when the ginger is present. Thus, you will need two samples of the material – with and without ginger.
In order to achieve good results, large databases of materials and their properties are necessary. Usually, machine learning assists the identification. For tomatoes, for example, 40 samples are recommended as a rule of thumb for a properly sized collection for a feasibility test. However, a comprehensive application should be based on hundreds of samples and thousands of scans.
The following BLE UUIDs/handles have been identified so far:
- `0x002c`: reads a hex value `01` upon button press
- `0x2a00` (equivalent to UUID 00002a00-0000-1000-8000-00805f9b34fb)
- `0x0012` (UUID 00002a23-0000-1000-8000-00805f9b34fb)
- Send `01ba020000` to handle `0x0029` (UUID 00003492-0000-1000-8000-00805f9b34fb). The answer comes in on notification handle `0x0025`.
- The handle (`0x0029`, see above) accepts a number of messages (for the protocol, see below). The app sends the following before and after scanning:

01ba050000 // inquire battery status
01ba0e0000 // Ready for WR
01ba0b0900000000000000000000 // set LED (is this a colour?)
01ba040000 // inquire device temperature before
01ba020000 // This is the actual scanning command
01ba040000 // inquire device temperature after
Commands can also be sent over USB: write the same bytes as above to the USB serial port, using the same protocol as for the BLE scanning handle described above.
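The payload-free commands listed above all share the same five-byte layout: `01`, `ba`, a command ID, then `00 00`. A minimal sketch of building them follows; the names in `COMMANDS` and the helper `build_command` are made up for illustration, and the meaning of the trailing two bytes is not known (the sketch simply reproduces them as observed). The longer LED command is deliberately left out because its payload is not understood.

```python
# Command IDs taken from the list above; the dictionary keys are hypothetical names.
COMMANDS = {
    "scan": 0x02,
    "temperature": 0x04,
    "battery": 0x05,
    "ready_for_wr": 0x0E,
}


def build_command(name: str) -> bytes:
    """Return the five bytes of a payload-free command."""
    # 0x01 and 0xBA open every command listed above; the two trailing
    # zero bytes are copied from the observed commands, meaning unknown.
    return bytes([0x01, 0xBA, COMMANDS[name], 0x00, 0x00])


if __name__ == "__main__":
    for name in COMMANDS:
        print(name, build_command(name).hex())
```

The `.hex()` form is what a tool such as gatttool expects on the command line, while the raw bytes are what would be written to the USB serial port.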
Raw response messages from the SCiO are structured as follows:
- First byte (`01`): message number, coming in 3 batches, from 01-5f, 01-5f and 01-58
- Second byte (`ba` or integer -70) is a protocol identifier, to inform the app what protocol the following data is
- Third byte (`02`) defines that the incoming data is a spectral measurement. For more commands, see the table below

Command (int) | hex | Meaning | handle & message format |
---|---|---|---|
-70 | ba | Incoming data protocol | see above |
2 | 02 | Data type: spectrum | part of protocol above |
4 | 04 | Temperature | contains temperature of chip, CMOS sensor (by Aptina) & object (always 0) |
5 | 05 | Battery state | contains charge %, battery health, voltage, etc. |
11 | 0b | Set LED status | ? |
14 | 0e | Ready for WR | ? |
-108 | 94 | File list (likely firmware) | file identifiers as integers |
-111 | 91 | Set device name | |
-121 | 87 | File header | file headers as integers |
-123 | 85 | BLE status | |
-124 | 84 | BLE ID | |
-125 | 83 | Reset device | |
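The integer column of the table is the signed 8-bit reading of the hex byte (for example `ba` is 186 unsigned and -70 signed). The sketch below converts between the two and labels the first three bytes of a message using the table. `to_signed`, `parse_header` and `COMMAND_NAMES` are hypothetical names, and treating the first three bytes of a message as number, protocol and command is an assumption based on the structure described above.

```python
# Command bytes from the table above, keyed by their unsigned value.
COMMAND_NAMES = {
    0x02: "spectrum",
    0x04: "temperature",
    0x05: "battery state",
    0x0B: "set LED status",
    0x0E: "ready for WR",
    0x94: "file list",
    0x91: "set device name",
    0x87: "file header",
    0x85: "BLE status",
    0x84: "BLE ID",
    0x83: "reset device",
}


def to_signed(byte: int) -> int:
    """Interpret an unsigned byte (0-255) as a signed 8-bit integer."""
    return byte - 256 if byte > 127 else byte


def parse_header(message: bytes):
    """Split the first three bytes into (message number, protocol, command name)."""
    if len(message) < 3:
        raise ValueError("need at least three bytes")
    number, protocol, command = message[0], message[1], message[2]
    # Assumption: 0xBA is the only protocol identifier, as in the table.
    if protocol != 0xBA:
        raise ValueError(f"unexpected protocol byte {protocol:#04x}")
    return number, to_signed(protocol), COMMAND_NAMES.get(command, "unknown")


if __name__ == "__main__":
    print(parse_header(bytes.fromhex("01ba020000")))
```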
How the raw scan data can be decoded is currently still unknown.
To help with the reverse-engineering effort, the following data is available:
Connect the SCiO to your computer with a USB cable, and turn it on
On Linux, open a console and run the following command to save the incoming data to "file.txt":
cat /dev/ttyACM0 | hexdump -C > file.txt
In a second console, send a command to the device, for example the temperature inquiry:
echo -n -e "\x01\xba\x04\x00\x00" > /dev/ttyACM0
To read data via Bluetooth on Linux, first install bluez:
sudo apt-get install bluez
Turn on your SCiO with a long press on the button
Run hcitool to find your SCiO's MAC address. The device will show up with a name like SCiOmyScio, or whatever you named it:
sudo hcitool lescan
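Start a scan and save the notifications to file1.txt, replacing xx:xx:xx:xx:xx:xx with your SCiO's MAC address:
sudo gatttool -i hci0 -b xx:xx:xx:xx:xx:xx --char-write-req -a 0x0029 -n 01ba020000 --listen > file1.txt
Stop saving data to your file with Ctrl+C after the indicator light of the SCiO goes back to blue.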
In a text editor, edit your file1.txt: remove the first line saying "Characteristic value was written successfully", and remove "Notification handle = 0x0025 value: " from the beginning of each line. Then save the file.
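The manual clean-up of file1.txt can also be done in Python. This is a minimal sketch: `parse_gatttool_log` is a hypothetical helper, and it assumes that gatttool prints each value as space-separated hex pairs after the prefix quoted above.

```python
PREFIX = "Notification handle = 0x0025 value: "


def parse_gatttool_log(text: str) -> list:
    """Return one bytes object per notification line in a gatttool log."""
    messages = []
    for line in text.splitlines():
        line = line.strip()
        # Skips "Characteristic value was written successfully" and blank lines.
        if not line.startswith(PREFIX):
            continue
        # Assumption: the value is hex pairs separated by spaces, e.g. "01 ba 02".
        messages.append(bytes.fromhex(line[len(PREFIX):]))
    return messages


if __name__ == "__main__":
    example = (
        "Characteristic value was written successfully\n"
        "Notification handle = 0x0025 value: 01 ba 02 \n"
    )
    for message in parse_gatttool_log(example):
        print(message.hex())
```

To process a real capture, read the file first, e.g. `parse_gatttool_log(open("file1.txt").read())`.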
This software is distributed under the GPL version 3.
All logos and icons are trademarks of Consumer Physics.
My thanks go out to the following people: