onnela-lab / beiwe-android

Beiwe is a smartphone-based digital phenotyping research platform. This is the Beiwe Android app code. The Beiwe2 app is also available on the Google Play store to use with open source builds of the Beiwe backend.
https://www.beiwe.org/
BSD 3-Clause "New" or "Revised" License

write failed: ENOSPC (No space left on device) #32

Closed by zagorsky 6 years ago

zagorsky commented 6 years ago

I don't think there's any easy fix for this. Sentry reports:

https://sentry.io/onnela-lab/production/issues/461716189/
https://sentry.io/onnela-lab/production/issues/461936990/
https://sentry.io/onnela-lab/production/issues/456211258/
https://sentry.io/onnela-lab/production/issues/483309548/
https://sentry.io/onnela-lab/production/issues/498048356/

Example stack trace:

Android Error: write failed: ENOSPC (No space left on device)
Error message: write failed: ENOSPC (No space left on device)
Error type: class java.io.IOException
Error-fill:
    org.beiwe.app.CrashHandler.writeCrashlog(CrashHandler.java:77)
    org.beiwe.app.storage.TextFileManager.writePlaintext(TextFileManager.java:242)
    org.beiwe.app.storage.TextFileManager.writeEncrypted(TextFileManager.java:254)
    org.beiwe.app.listeners.AccelerometerListener.onSensorChanged(AccelerometerListener.java:79)
    android.hardware.SystemSensorManager$SensorEventQueue.dispatchSensorEvent(SystemSensorManager.java:851)
    android.os.MessageQueue.nativePollOnce(Native Method)
    android.os.MessageQueue.next(MessageQueue.java:323)
    android.os.Looper.loop(Looper.java:136)
    android.app.ActivityThread.main(ActivityThread.java:6816)
    java.lang.reflect.Method.invoke(Native Method)
    com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1563)
    com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1451)
Actual Error:
    libcore.io.Posix.writeBytes(Native Method)
    libcore.io.Posix.write(Posix.java:273)
    libcore.io.BlockGuardOs.write(BlockGuardOs.java:319)
    libcore.io.IoBridge.write(IoBridge.java:496)
    java.io.FileOutputStream.write(FileOutputStream.java:316)
    java.io.FileOutputStream.write(FileOutputStream.java:296)
    org.beiwe.app.storage.TextFileManager.writePlaintext(TextFileManager.java:231)
    org.beiwe.app.storage.TextFileManager.writeEncrypted(TextFileManager.java:254)
    org.beiwe.app.listeners.AccelerometerListener.onSensorChanged(AccelerometerListener.java:79)
    android.hardware.SystemSensorManager$SensorEventQueue.dispatchSensorEvent(SystemSensorManager.java:851)
    android.os.MessageQueue.nativePollOnce(Native Method)
    android.os.MessageQueue.next(MessageQueue.java:323)
    android.os.Looper.loop(Looper.java:136)
    android.app.ActivityThread.main(ActivityThread.java:6816)
    java.lang.reflect.Method.invoke(Native Method)
    com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1563)
    com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1451)

biblicabeebli commented 6 years ago

So we CAN catch this error, but the whole file-writing stack guarantees that the encryption key is provided in the file, so "handling" this failure mode is hugely problematic.

However, that guarantee does give us an avenue of investigation worth pursuing, because we might be silently failing to provide the encryption key in some very rare scenarios.

zagorsky commented 6 years ago

@sesterki can you please try to figure out what the failure mode is here?

We've seen at least one instance where we got an ENOSPC from a specific patient ID, and then we got OurBase64Error: Incorrect padding on beiwe-backend with files uploaded from that same patient ID.
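
One way the two errors could be connected, offered as a hypothesis rather than a confirmed diagnosis: padded Base64 text always has a length that is a multiple of 4, so a line cut short by a failed write will usually break that rule, and a strict decoder would reject it. The sketch below illustrates the length rule in Java. The class name, helper name, and sample payload are made up for illustration and are not taken from the Beiwe code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; not part of the Beiwe app or backend.
final class Base64TruncationDemo {

    /** True when the text has the length of padded Base64 (non-empty, multiple of 4). */
    static boolean hasValidPaddedLength(String encoded) {
        return !encoded.isEmpty() && encoded.length() % 4 == 0;
    }

    public static void main(String[] args) {
        // Made-up payload standing in for one line of sensor data.
        byte[] payload = "x,y,z reading".getBytes(StandardCharsets.UTF_8);
        String full = Base64.getEncoder().encodeToString(payload);

        // Simulate a write that stopped three characters early.
        String cut = full.substring(0, full.length() - 3);

        System.out.println("full line ok: " + hasValidPaddedLength(full));
        System.out.println("cut line ok:  " + hasValidPaddedLength(cut));
    }
}
```

A cut that happens to land on a 4-character boundary would pass this length check, so it can flag many truncated lines but not all of them.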

So when writes fail, does it write a partial line, instead of a full line? And is there a way we can prevent that, i.e., make it either write the full line or write nothing at all?
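
A minimal sketch of the "full line or nothing" idea, assuming the data file is a plain append-only file: record the file length before the write and, if the write throws, truncate back to that length. AllOrNothingAppender and its Sink interface are hypothetical names introduced here for illustration; this is not the code in TextFileManager.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; a sketch of rollback-on-failure, not Beiwe's implementation.
final class AllOrNothingAppender {

    /** Hypothetical seam so a test can simulate a write that fails partway through. */
    interface Sink {
        void write(RandomAccessFile out, byte[] data) throws IOException;
    }

    /**
     * Appends line plus a newline. Returns false if the write fails, after
     * attempting to restore the file to the length it had before the call.
     */
    static boolean appendLine(File file, String line, Sink sink) {
        byte[] data = (line + "\n").getBytes(StandardCharsets.UTF_8);
        try (RandomAccessFile out = new RandomAccessFile(file, "rw")) {
            long lengthBefore = out.length();
            out.seek(lengthBefore);
            try {
                sink.write(out, data);
                return true;
            } catch (IOException writeFailure) {
                // Drop any bytes of the partial line. Shrinking a file
                // should not require free space on the device.
                out.setLength(lengthBefore);
                return false;
            }
        } catch (IOException e) {
            // Opening the file or rolling back failed, so the caller
            // should treat this file as suspect.
            return false;
        }
    }

    /** Convenience overload that performs a normal write. */
    static boolean appendLine(File file, String line) {
        return appendLine(file, line, (out, data) -> out.write(data));
    }
}
```

If the rollback itself fails, a partial line can still remain, so a caller may want to abandon the file and start a new one in that case.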

sesterki commented 6 years ago

Error handling added in commit e092597b78042d0d54885c9cb778410daeb3c01e

zagorsky commented 6 years ago

This error should behave differently on app versions greater than 2.3.3. It should still report errors to Sentry, but it shouldn't create corrupted files. Instead, it should keep trying to create a new file until it finally succeeds.
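
The retry behavior described above might look roughly like the following. This is a sketch under assumptions, not the code from the commit: NewFileRetry and FileFactory are hypothetical names, the factory stands in for whatever creates a data file and writes its header, and the loop is bounded with a pause between attempts, whereas the comment above describes retrying until success.

```java
import java.io.IOException;

// Hypothetical helper illustrating "retry new-file creation after a failed write".
final class NewFileRetry {

    /** Hypothetical stand-in for the code that creates a data file and writes its header. */
    interface FileFactory {
        String create() throws IOException;
    }

    /**
     * Calls the factory until it succeeds or maxAttempts is reached.
     * Returns the new file's name, or null if every attempt failed.
     */
    static String createWithRetry(FileFactory factory, int maxAttempts, long pauseMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return factory.create();
            } catch (IOException e) {
                // A real app would forward this to its crash reporter instead.
                System.err.println("attempt " + attempt + " failed: " + e.getMessage());
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis); // give the device a chance to free space
                }
            }
        }
        return null;
    }
}
```

Bounding the attempts is a design choice made for this sketch so that a permanently full device cannot spin forever; the caller then decides whether to drop the data point or try again later.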