matiasandina / FEDWatcher

Software and Hardware to connect FED3 devices over serial on Raspberry Pi 4
MIT License

Data corruption #29

Open matiasandina opened 2 years ago

matiasandina commented 2 years ago

This issue collects examples of data corruption that affect how FEDWatcher works.

1. Data corruption in FED number

FED 17 sent this string in place of Session_type:

Pav\x83SPlusMinusl17

Problem

The more common corruption replaces a single character with an escaped byte (\x[0-9a-f]{2}). The specific problem here is that the program title and the 17 (Device_Number) were not split into separate fields. As a result, every later field shifted one column to the left and Device_Number was populated with the value of Battery_Voltage.

This happened in the middle of the session, which means the row had to be inserted manually in the proper place in the proper table.

This was not a one-time event; several FEDWatcher setups had it. I'm not sure whether it is related, but the RPis were being pushed to 100% CPU usage. I would think the corruption happens on the emitting side, not on the receiving side.

matiasandina commented 2 years ago

We could potentially implement a data cleanup routine: send each message three times and take the "average" of each character, for example by keeping the character that most copies agree on at each position. This would likely reduce gibberish events, at the cost of higher latency.
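A minimal sketch of that idea on the receiving side, assuming the FED sends each line three times and that "average" means a per-character majority vote. The function name `majority_vote` and the sample strings are hypothetical and not part of FEDWatcher.

```python
from collections import Counter


def majority_vote(copies):
    """Rebuild one message from several received copies, character by character."""
    # Compare only copies of the most common length, so a copy with a dropped
    # or inserted byte cannot shift every later position out of alignment.
    common_len = Counter(len(c) for c in copies).most_common(1)[0][0]
    aligned = [c for c in copies if len(c) == common_len]
    # At each position keep the character that most copies agree on.
    return "".join(Counter(chars).most_common(1)[0][0] for chars in zip(*aligned))


copies = [
    "FreeFeed,34,3.97",   # intact copy
    "FreeFeedl34,3.97",   # separator replaced by another character
    "FreeFeed,3t,3.97",   # one digit replaced
]
print(majority_vote(copies))
```

With three copies, a position is recovered only if at least two copies agree on it; two different errors at the same position cannot be resolved this way.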

matiasandina commented 1 year ago

Here's an example. Instead of sending FreeFeed, 34, 3.97, ... it's sending FreeFeed\x06l34. This shifts the remaining fields, breaks the int() call elsewhere, and stops the data from being saved properly.

05/18/2023 06:02:32,1.9.2,FreeFeed\x06l34,3.97,1,1,Pellet,Left,31,76,59,0,20.49,31,nan
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/pi/FEDWatcher/fedwatcher/src/fedwatcher.py", line 234, in runHelper
    self._save_all_df()
  File "/home/pi/FEDWatcher/fedwatcher/src/fedwatcher.py", line 452, in _save_all_df
    self._save_to_csv(df_data)
  File "/home/pi/FEDWatcher/fedwatcher/src/fedwatcher.py", line 417, in _save_to_csv
    filename = f"FED{int(df_data[0]['Device_Number']):03d}_{timestr}_{self.session_num:02d}.csv"
ValueError: invalid literal for int() with base 10: '3.97'
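One way to keep a row like this from ever reaching the int() call is to check its shape before it is stored. The sketch below is only an illustration: `split_row` is a hypothetical helper, and the expected count of 16 fields is an assumption derived from the row above (15 fields after one separator was lost).

```python
def split_row(line, expected_fields):
    """Return the list of fields, or None if the row does not look intact."""
    fields = line.strip().split(",")
    # A lost or extra separator changes the number of fields.
    if len(fields) != expected_fields:
        return None
    # Control bytes such as \x06 are not printable and signal corruption.
    if any(not field.isprintable() for field in fields):
        return None
    return fields


EXPECTED_FIELDS = 16  # assumption: one more than the corrupted row above

bad = "05/18/2023 06:02:32,1.9.2,FreeFeed\x06l34,3.97,1,1,Pellet,Left,31,76,59,0,20.49,31,nan"
good = "05/18/2023 06:02:32,1.9.2,FreeFeed,34,3.97,1,1,Pellet,Left,31,76,59,0,20.49,31,nan"

for row in (bad, good):
    fields = split_row(row, EXPECTED_FIELDS)
    print("rejected" if fields is None else f"ok, Device_Number={fields[3]}")
```

A rejected row could be written to a separate log with its Pi timestamp so it can be repaired later instead of stopping the save.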
matiasandina commented 1 year ago

The issue may have been that there wasn't enough space for the termination character. Maybe a classic off-by-one error? Will test before closing.

Update

The errors continue, but only when the Event is LeftWithPellet or RightWithPellet. This is probably due to the length of the character array assigned to hold it. So far I haven't had any more breaking lines, but this issue is still not fully fixed.

matiasandina commented 9 months ago

This is not fully fixed. For example, here 12 arrived as q2. The error handling caught it as intended, but FEDWatcher still stopped.

Error: Unable to convert 'Device_Number' to an integer.
Process Process-2:
Traceback (most recent call last):
  File "/home/pi/FEDWatcher/fedwatcher/src/fedwatcher.py", line 431, in _save_to_csv
    device_number = int(float(df_data[0]['Device_Number']))
ValueError: could not convert string to float: 'q2'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/pi/FEDWatcher/fedwatcher/src/fedwatcher.py", line 249, in runHelper
    self._save_all_df()
  File "/home/pi/FEDWatcher/fedwatcher/src/fedwatcher.py", line 479, in _save_all_df
    self._save_to_csv(df_data)
  File "/home/pi/FEDWatcher/fedwatcher/src/fedwatcher.py", line 435, in _save_to_csv
    raise ValueError(error_msg)
ValueError: Unable to convert 'Device_Number' to an integer.
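A possible alternative to raising is to fall back to a placeholder file name, so the save process keeps running and the row can be reassigned later. This is a sketch under assumptions: `parse_device_number`, `csv_filename`, and the `FEDunknown` prefix are hypothetical, and only the file name pattern is taken from the traceback earlier in this thread.

```python
def parse_device_number(raw):
    """Return the device number as an int, or None if the value is not usable."""
    try:
        number = float(raw)
    except (TypeError, ValueError):
        return None
    # Reject values such as 3.97 (a shifted Battery_Voltage), nan, and negatives.
    # Note that int(float("3.97")) would silently evaluate to 3.
    if not number.is_integer() or number < 0:
        return None
    return int(number)


def csv_filename(raw_device_number, timestr, session_num):
    """Build the CSV name, using a placeholder prefix for unreadable devices."""
    device = parse_device_number(raw_device_number)
    prefix = f"FED{device:03d}" if device is not None else "FEDunknown"
    return f"{prefix}_{timestr}_{session_num:02d}.csv"


print(csv_filename("12", "20231207", 1))
print(csv_filename("q2", "20231207", 1))
```

Whether a placeholder file is preferable to stopping depends on how the downstream pipeline is expected to treat unassigned rows.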
matiasandina commented 9 months ago

Another example: here 17:44:54 arrived as 1w:44:54 and was parsed as 2023-12-07 01:44:54. This might affect the pipeline, because other functions rely on datetime and may sort by it (e.g., read_fed, recalculate_pellets).

# A tibble: 6 × 4
  `MM:DD:YYYY hh:mm:ss` Pi_Time             datetime            Pellet_Count
  <chr>                 <dttm>              <dttm>                     <dbl>
1 12/07/2023 17:43:43   2023-12-07 17:40:02 2023-12-07 17:43:43         1070
2 12/07/2023 17:44:14   2023-12-07 17:40:33 2023-12-07 17:44:14         1071
3 12/07/2023 1w:44:54   2023-12-07 17:41:14 2023-12-07 01:44:54         1072
4 12/07/2023 18:44:21   2023-12-07 18:40:40 2023-12-07 18:44:21         1073
5 12/07/2023 18:44:46   2023-12-07 18:41:06 2023-12-07 18:44:46         1074
6 12/07/2023 18:45:03   2023-12-07 18:41:23 2023-12-07 18:45:03         1075
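A strict parse would reject the corrupted timestamp instead of guessing at it. The sketch below shows the idea in Python, assuming the column format is month/day/year followed by a 24-hour time and that Pi_Time is an acceptable stand-in for ordering. The helper `parse_fed_time` is hypothetical, and the analysis functions named above are not modified here.

```python
from datetime import datetime

FED_FORMAT = "%m/%d/%Y %H:%M:%S"  # assumption based on the rows above


def parse_fed_time(raw, pi_time):
    """Return (timestamp, is_device_time); fall back to Pi_Time on a bad string."""
    try:
        return datetime.strptime(raw, FED_FORMAT), True
    except ValueError:
        # The device timestamp is unreadable, so keep the Pi clock and flag it.
        return pi_time, False


pi_time = datetime(2023, 12, 7, 17, 41, 14)
print(parse_fed_time("12/07/2023 17:44:14", pi_time))
print(parse_fed_time("12/07/2023 1w:44:54", pi_time))
```

The returned flag lets later steps decide whether to sort by the device clock or by Pi_Time for that row.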
matiasandina commented 4 months ago

There are predictable sources of "corruption" and somewhat unpredictable sources of corruption.
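As a quick check on the less predictable kind, the substitutions reported earlier in this thread can be compared byte by byte. The pairs below assume the intended characters were a comma (received as l), 1 (received as q), and 7 (received as w); the comma case in particular is an inference from the fused Session_type and Device_Number fields, not something confirmed.

```python
def bit_difference(sent, received):
    """Return the XOR of the two character codes."""
    return ord(sent) ^ ord(received)


# (intended, received) pairs taken from the examples in this thread
observed = [(",", "l"), ("1", "q"), ("7", "w")]

for sent, received in observed:
    diff = bit_difference(sent, received)
    flipped = bin(diff).count("1")
    print(f"{sent!r} -> {received!r}: xor={diff:#04x}, bits flipped={flipped}")
```

Under those assumptions, each pair differs in the same single bit (0x40). That may be more consistent with bit errors in transmission than with a formatting bug, although three examples are not enough to conclude it.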

Predictable

durationStr does not seem to handle RightWithPellet and LeftWithPellet:

https://github.com/matiasandina/FEDWatcher/blob/9981d69dba178324212a1d3d678c5a07ff2bf3e8/sampleSketch/ITI_03/FED3.cpp#L918-L928

This could be changed to

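char durationStr[20];
// Event is assumed to be an Arduino String, so == compares contents.
// snprintf bounds every write to the buffer size, terminator included.
if (Event == "LeftWithPellet" || Event == "Left") {
    snprintf(durationStr, sizeof(durationStr), "%.2f", leftInterval / 1000.0);
} else if (Event == "RightWithPellet" || Event == "Right") {
    snprintf(durationStr, sizeof(durationStr), "%.2f", rightInterval / 1000.0);
} else {
    // "Pellet" and any other event have no poke duration.
    snprintf(durationStr, sizeof(durationStr), "nan");
}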

Unpredictable