kissake / unattended_data_collection

This is a project intended to support unattended data collection in a way that protects the data being collected. The initial project is aimed at audio recording using a Raspberry Pi Zero (simple, low hanging fruit), with the intention of being extensible for other uses.
GNU General Public License v3.0

It is difficult to find interesting data in a large corpus. #5

Open kissake opened 2 years ago

kissake commented 2 years ago

Someone is likely to sleep for 8 hours at a time, and reviewing 8 hours of audio recordings in 10-minute increments is a lot of work (it could take more than 8 hours), so it is important to make review easier.

kissake commented 2 years ago

Possible approaches to mitigate this issue include:

kissake commented 2 years ago

Another tool to simplify locating useful information is to combine sequential segments of data into larger batches so that they can be processed at once. If other indicators can be retained (e.g. relative timestamps?), that would be ideal. Note that at this time the audio recording is explicitly cut early, with ~5 seconds of "grace", to keep the next cron-started recording from failing because the previous recording still holds the lock on the audio devices. This causes some skew: if the recordings are simply concatenated, the gaps accumulate, and for a large number of sequential recordings, positions in the combined file will no longer line up with the original timestamps.
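One way to keep timestamps through concatenation might be a small index that records where each segment starts in the combined stream. The sketch below is only an illustration: `Segment`, `build_offsets`, and `wall_clock_at` are hypothetical names, and the 10-minute interval with 595-second recordings is an assumption based on the "~5 seconds of grace" mentioned above, not something taken from the project code.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Segment:
    start: datetime   # wall-clock time the recording began
    duration: float   # actual recorded length, in seconds


def build_offsets(segments):
    """Return the position (in seconds) at which each segment begins
    once all segments are concatenated back to back."""
    offsets = []
    position = 0.0
    for seg in segments:
        offsets.append(position)
        position += seg.duration
    return offsets


def wall_clock_at(segments, offsets, position):
    """Map a position in the concatenated stream back to wall-clock time."""
    total = offsets[-1] + segments[-1].duration
    if position < 0 or position >= total:
        raise ValueError("position is outside the concatenated stream")
    # Find the last segment whose starting offset is <= position.
    i = bisect_right(offsets, position) - 1
    return segments[i].start + timedelta(seconds=position - offsets[i])


# Assumed schedule: cron starts a recording every 10 minutes,
# and each recording is cut ~5 seconds early (595 s instead of 600 s).
first = datetime(2024, 1, 1, 22, 0, 0)
segments = [Segment(first + timedelta(minutes=10 * n), 595.0) for n in range(3)]
offsets = build_offsets(segments)

corrected = wall_clock_at(segments, offsets, 1200.0)
naive = first + timedelta(seconds=1200.0)
print(corrected, naive, (corrected - naive).total_seconds())
```

With these assumed numbers, the skew grows by about 5 seconds per segment, so over a full night of 48 segments a naive timestamp would be off by roughly 4 minutes. Storing the offsets alongside the combined file (for example as a small sidecar file) would let a reviewer jump from a position in the batch back to the real time.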

kissake commented 2 years ago

A strategy for mitigation:

The ideal would be to extract this data along with timestamp information into individual files, I think?