harp-tech / protocol

Description of the Harp protocol.
https://harp-tech.org/protocol/BinaryProtocol-8bit.html
MIT License

Define a data logging/ingestion format and spec #41

Open bruno-f-cruz opened 3 months ago

bruno-f-cruz commented 3 months ago

Summary

One of the goals of the Harp ecosystem is to define a data format and specification that lets users log their data in a stable and shareable format.

Current Implementations

At the Allen

The current implementation at the Allen follows this pattern: https://allenneuraldynamics.github.io/Bonsai.AllenNeuralDynamics/articles/core-logging.html#harp-data

Essentially, all messages from a single device are grouped by register, and each group is saved in its own binary file. The binary file names currently follow the convention illustrated here, e.g.:

├───Behavior.harp
│       Register__AnalogData.bin
│       Register__AssemblyVersion.bin
│       Register__Camera0Frame.bin
│       Register__Camera0Frequency.bin
│       Register__Camera1Frame.bin
│       Register__Camera1Frequency.bin
│       Register__ClockConfiguration.bin
│       Register__CoreVersionHigh.bin
│       Register__CoreVersionLow.bin
│       Register__DeviceName.bin
.....
├───ClockGenerator.harp
│       Register__AssemblyVersion.bin
│       Register__Battery.bin
│       Register__BatteryCalibration0.bin
│       Register__BatteryCalibration1.bin
│       Register__BatteryRate.bin
│       Register__BatteryThresholdHigh.bin
│       Register__BatteryThresholdLow.bin
│       Register__ClockConfiguration.bin
│       Register__Config.bin
│       Register__CoreVersionHigh.bin
│       Register__CoreVersionLow.bin
....

This has a few problems:

  1. It does not split messages by type (event/read/write), which might be a problem given the latest discussions in #37.
  2. It does not work with the current spec of the harp-python package.
  3. It does not include the yml metadata file, making it difficult to recover the metadata associated with the device offline.
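
To make the first point concrete, here is a minimal sketch of how a raw byte stream could be split by message type before logging. It assumes the framing described in the Harp binary protocol (MessageType, Length, Address, Port, PayloadType, Payload, Checksum, with Length counting the bytes that follow it), the type codes 1 = Read, 2 = Write, 3 = Event, and 0x08 as the error flag. `split_by_message_type` is a hypothetical helper, not part of harp-python or the Allen logging code.

```python
from collections import defaultdict

# Assumed message type codes from the Harp binary protocol.
MESSAGE_TYPES = {1: "Read", 2: "Write", 3: "Event"}
ERROR_FLAG = 0x08  # assumed: set on error replies (e.g. 9 = Read error)


def split_by_message_type(raw: bytes) -> dict[str, list[bytes]]:
    """Group the Harp messages found in `raw` by their message type."""
    groups = defaultdict(list)
    offset = 0
    while offset + 2 <= len(raw):
        length = raw[offset + 1]   # bytes following the Length field
        end = offset + 2 + length
        if end > len(raw):
            break                  # truncated trailing message: stop here
        message = raw[offset:end]
        base_type = message[0] & ~ERROR_FLAG & 0xFF
        groups[MESSAGE_TYPES.get(base_type, "Unknown")].append(message)
        offset = end
    return dict(groups)
```

Error replies are grouped with their base type in this sketch; a real format would need to decide whether they deserve a separate file. Checksums are not validated here.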

Possible solutions

bruno-f-cruz commented 3 months ago

One thing that came to mind is why use the <DeviceName> in <UserGivenName>.harp / <DeviceName>_<RegisterNumber>.bin at all. It seems to introduce an extra dependency that is not necessary. Maybe a more general name, like Register, would be better? @glopesdev

glopesdev commented 3 months ago

@bruno-f-cruz This makes it easier to search for chunks of the same device across epoch folders, as happens in the Aeon data formats. I want to keep pushing for this, as I think it is an important use case to keep compatibility for, even though it may not be used in 90% of cases.
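
As a rough illustration of that use case, the sketch below collects every chunk of one register of one device across epoch folders. The layout `<root>/<epoch>/<UserGivenName>.harp/<DeviceName>_<RegisterNumber>.bin` and the helper name `find_device_chunks` are assumptions made for this example; the actual Aeon folder structure may differ.

```python
from pathlib import Path


def find_device_chunks(root, device_name, register_number):
    """Return all chunks of one register of one device under `root`, sorted by path."""
    # Hypothetical naming convention: <DeviceName>_<RegisterNumber>.bin
    filename = f"{device_name}_{register_number}.bin"
    # rglob searches every epoch folder below `root` recursively.
    return sorted(Path(root).rglob(filename))
```

With a generic prefix such as `Register_44.bin`, the same search would also return register 44 of every other device, so the caller would have to filter on the parent folder name instead.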

bruno-f-cruz commented 3 months ago

I guess my question is whether it should be part of the spec or not. From the Python interface point of view it doesn't appear to add much. I wonder whether we can make the interface work as long as the pattern is '*_', or whether there is an advantage to introducing this dependency and locking the spec to it. To be clear: I am not against folding it in, I just wonder if we really need to add it!
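
One way such a prefix-agnostic interface could look is sketched below: the reader only relies on the trailing `_<RegisterNumber>.bin` part of the file name and treats whatever comes before it as an opaque prefix. The full pattern `*_<RegisterNumber>.bin` is an assumption based on the naming discussed above, and `parse_register_file` is a hypothetical helper, not an existing harp-python function.

```python
import re
from pathlib import Path

# Any prefix, then an underscore, then the register number (assumed convention).
REGISTER_FILE = re.compile(r"^(?P<prefix>.*)_(?P<register>\d+)\.bin$")


def parse_register_file(path):
    """Return (prefix, register_number), or None if the name does not match."""
    match = REGISTER_FILE.match(Path(path).name)
    if match is None:
        return None
    return match["prefix"], int(match["register"])
```

Under this scheme both `Behavior_44.bin` and `Register_44.bin` resolve to register 44, so the device name in the prefix becomes optional for the reader while remaining available for cross-folder searches.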