jmicrobe / odin

Repo for documentation related to the Optical Density Instrument at the Stahl lab
MIT License

ODIn froze at "Done." #33

Closed by jmicrobe 5 years ago

jmicrobe commented 6 years ago

I've been running the ODIn since yesterday in my office (so without the shaker plugged in), with blank tubes in all of the channels, to debug after replacing some racks. It looked fine when I left (the server was up, FileMaker was getting the data, etc.), but this morning it's displaying "Done." on a purple screen and hasn't changed for the last 15 minutes.

Looking at the server, the last readings were at 10/23/18 01:07:51 and the readings don't show any unusual values. The last readings were consistent as well (10/23/18 00:57:51, 10/23/18 00:47:51, 10/23/18 00:37:51).

@jsebrof any insight into what caused this? Since I'm not running an experiment I can leave it as-is. As I mentioned previously, the only line in the code that matches "Done." (capital D) is line 589, which I think relates to writing to the SD card.

dacb commented 6 years ago

It would be worth checking whether that last sample was in fact written to the SD card. Each call to write_data_to_SD_card is followed by a call to send the data over the network interface to the server: https://github.com/jmicrobe/odin/blob/master/System_2.ino#L456 https://github.com/jmicrobe/odin/blob/master/System_2.ino#L357

Interestingly, the delay call, presumably there to allow the SD card to sync, is not consistently before or after the network packet send.

jmicrobe commented 6 years ago

I plugged into the Arduino serial monitor and it's now showing "setup: configuring ethernet via DHCP". https://github.com/jmicrobe/odin/blob/d2790485fde064697ce6b4cba3fdd405781600d7/System_2.ino#L156

jmicrobe commented 6 years ago

@dacb I just checked the SD card and the last reading matches the data for the last reading sent to the server (though the timestamp on the server is exactly 10 minutes later).

dacb commented 6 years ago

This looks like joining the network is failing. It would be nice if that message appeared on the unit's display without the serial monitor.
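A rough sketch of what that could look like, written as plain C++ so the logic can be checked off the device. The `NetworkHooks` struct, `joinNetwork`, the status strings, and the retry count are all hypothetical and not taken from System_2.ino. The only outside fact relied on is that the Arduino Ethernet library's `Ethernet.begin(mac)` returns 1 on a successful DHCP lease and 0 on failure, which is what `beginDhcp` is meant to wrap.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical hooks: on the device these would wrap the Ethernet and LCD calls.
struct NetworkHooks {
    std::function<int()> beginDhcp;                     // 1 on success, 0 on failure
    std::function<void(const std::string&)> showOnLcd;  // prints one status line
};

// Try DHCP up to maxAttempts times, showing every step on the display so a
// failure to join the network is visible without the serial monitor.
// Returns true once an attempt succeeds, false if every attempt fails.
bool joinNetwork(const NetworkHooks& hooks, int maxAttempts) {
    assert(hooks.beginDhcp && hooks.showOnLcd);
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        hooks.showOnLcd("DHCP attempt " + std::to_string(attempt));
        if (hooks.beginDhcp() != 0) {
            hooks.showOnLcd("Network OK");
            return true;
        }
    }
    hooks.showOnLcd("DHCP failed");
    return false;
}
```

Because the last message stays on the display, a unit that stops here would show "DHCP failed" (or the attempt it is stuck on) instead of an unrelated earlier message.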

jsebrof commented 6 years ago

@jmicrobe Both the LCD output and Serial output you mention point to there being a network issue. Interestingly, if the system is freezing at "setup: configuring ethernet via DHCP", then I would question whether the Arduino Ethernet Shield has broken, since the next instruction only sets the MAC Address.

Regarding your most recent comment about the differences between the Server and SD Card data: bear in mind that while the system writes its Real-Time-Clock (RTC) datetime value into the SD Card record, it does not do so when sending the Ethernet data to the Server. Instead, the Server applies its own datetime value when writing the received data into the database. So it could very well be a coincidence that there is a 10-minute difference between the system RTC and the Server, and that the experiment was set up for 10-minute intervals.

The data being the same in the two entries gives, I think, the right indication, assuming enough uniqueness in the measured values. Otherwise I am forced to think that the most recent data sample made it to the Server but didn't get written onto the SD Card, which doesn't make logical sense given the location in the program where the System froze.
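To make the timestamp point concrete, here is a small host-side C++ sketch that computes the Server-minus-RTC offset from two stamps and flags when that offset is a whole number of sampling intervals, in which case the timestamps alone cannot say whether two rows are the same sample. The function names are hypothetical, and the `MM/DD/YY HH:MM:SS` format with a 20YY year is an assumption based on the stamps quoted earlier in this thread.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Days since 1970-01-01 for a Gregorian date (standard civil-date arithmetic).
long long daysFromCivil(int year, int month, int day) {
    assert(month >= 1 && month <= 12);
    year -= month <= 2;
    const long long era = (year >= 0 ? year : year - 399) / 400;
    const long long yoe = year - era * 400;  // year of era, 0..399
    const long long doy = (153 * (month + (month > 2 ? -3 : 9)) + 2) / 5 + day - 1;
    const long long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

// Parse "MM/DD/YY HH:MM:SS" (assumed format) into seconds since 1970-01-01.
bool parseStamp(const std::string& text, long long& seconds) {
    int mo, d, y, h, mi, s;
    if (std::sscanf(text.c_str(), "%d/%d/%d %d:%d:%d", &mo, &d, &y, &h, &mi, &s) != 6)
        return false;
    if (mo < 1 || mo > 12 || d < 1 || d > 31 || y < 0 || y > 99) return false;
    if (h < 0 || h > 23 || mi < 0 || mi > 59 || s < 0 || s > 59) return false;
    seconds = daysFromCivil(2000 + y, mo, d) * 86400LL + h * 3600 + mi * 60 + s;
    return true;
}

// Offset = Server stamp minus SD Card (RTC) stamp, in seconds.
bool clockOffsetSeconds(const std::string& sdStamp, const std::string& serverStamp,
                        long long& offset) {
    long long sd = 0, server = 0;
    if (!parseStamp(sdStamp, sd) || !parseStamp(serverStamp, server)) return false;
    offset = server - sd;
    return true;
}

// True when the offset is a nonzero multiple of the sampling interval, so the
// records have to be matched on their measured values, not on their timestamps.
bool offsetIsAmbiguous(long long offset, long long intervalSeconds) {
    return intervalSeconds > 0 && offset != 0 && offset % intervalSeconds == 0;
}
```

For illustration, an SD Card stamp of `10/23/18 00:57:51` against a Server stamp of `10/23/18 01:07:51` gives an offset of 600 seconds, which `offsetIsAmbiguous(600, 600)` reports as ambiguous for a 10-minute interval.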

@dacb Far too much time has passed for me to be able to remember if there was any reason for the SD Card writing and Ethernet packet sending in the calibration stage to be handled any differently from the normal loop stage. However, I cannot at present think of any reason not to make them consistent.

Pull request #35 contains changes that will add LCD output for Ethernet packet sending. It also makes the calibration stage and normal loop stage consistent with regard to SD Card writing and Ethernet packet sending, giving both operations a full second to complete following their function calls.
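The pattern described above can be sketched roughly as follows, in plain C++ with injected callables so it can be exercised off the device. This is not the code from pull request #35: `SampleHooks`, `recordSample`, the status strings, and the write-then-send ordering are assumptions made for illustration, with only the one-second settle time taken from the description.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical hooks standing in for the sketch's SD, Ethernet, LCD and delay calls.
struct SampleHooks {
    std::function<void(const std::string&)> writeToSd;
    std::function<void(const std::string&)> sendPacket;
    std::function<void(const std::string&)> showOnLcd;
    std::function<void(unsigned long)> waitMs;
};

// Settle time given to each operation after its call returns.
constexpr unsigned long kSettleMs = 1000;

// One shared routine for both the calibration stage and the normal loop stage,
// so both handle a sample in the same order with the same delays.
void recordSample(const SampleHooks& hooks, const std::string& sample) {
    assert(hooks.writeToSd && hooks.sendPacket && hooks.showOnLcd && hooks.waitMs);
    hooks.showOnLcd("Writing SD card");
    hooks.writeToSd(sample);
    hooks.waitMs(kSettleMs);
    hooks.showOnLcd("Sending packet");
    hooks.sendPacket(sample);
    hooks.waitMs(kSettleMs);
}
```

Routing both stages through one function is what keeps them from drifting apart again, and the LCD line set before each step shows which operation was in progress if the unit stops.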

I am not particularly confident that these changes will solve this issue, though it would certainly be convenient if they did. If there is a spare Ethernet Shield hanging around, try swapping that one in and see if that helps. Unfortunately, due to the lack of electrical isolation in the System, any of the many electrical incidents this particular ODIn setup has experienced in the past could have damaged equipment and caused intermittent issues.

jmicrobe commented 6 years ago

I loaded the new script on the ODIn and it worked - I saw the new LCD message but it flashed pretty quick. If there's another freeze maybe it'll be helpful to see where it gets stuck. I'll merge pull request #35 after I run the ODIn for a while.

We had two Ethernet shields in the lab, but they've been opened and it's not clear what state they're in. I'm actually about to be out of the office until November 5th, so instead of testing them I left the current one installed. I did put in an order request for a replacement, this one on SparkFun. When I get back I'll see about swapping it out.

I'm leaving the ODIn running while I'm away in order to monitor cultures (but not for experimental data collection). @hunt0362 will be here to check in and, in case of another freeze, reset the machine if necessary. With the last run I found that as long as you swap out growing culture tubes for water blank tubes before you reset the ODIn, you can still monitor the values accurately through the raw data file.