SMI / dicompixelanon

DICOM Pixel Anonymisation

dicomrectdb - sqlite crash with EIO during fsync() #17

Closed howff closed 1 year ago

howff commented 1 year ago

If you get the error `sqlite3.OperationalError: disk I/O error`, it could be caused by

fsync(7) = -1 EIO (Input/Output error)

where file descriptor 7 was opened as

open("/nfs/smi/home/username/dbdir/dcmaudit.sqlite.db", O_RDWR|O_CREAT|O_CLOEXEC, 0644) = 7

Note that this is on an NFS filesystem. Is the fault caused by the NFS client, the NFS server, the old version of sqlite used on CentOS-7.6.1810, or a bug in our code? It seems to work ok when using a local /tmp directory.
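To compare directories quickly (for example the NFS home directory against /tmp), something like the following probe might help. It is only a minimal sketch, not code from dicomrectdb: the function name `probe_sqlite_dir` and the probe file name are made up for illustration. It creates a throwaway database, forces a synchronous commit so that SQLite has to sync to disk, and returns the error text if that fails.

```python
import os
import sqlite3


def probe_sqlite_dir(directory):
    """Try a small SQLite write + commit in `directory`.

    Returns None on success, or the OperationalError message on failure.
    (Hypothetical helper, not part of dicomrectdb.)
    """
    path = os.path.join(directory, "sqlite_probe_%d.db" % os.getpid())
    try:
        conn = sqlite3.connect(path)
        try:
            # FULL makes SQLite sync to disk at commit time,
            # which is the step that failed with EIO in the trace above.
            conn.execute("PRAGMA synchronous=FULL")
            conn.execute(
                "CREATE TABLE IF NOT EXISTS probe (id INTEGER PRIMARY KEY, note TEXT)"
            )
            conn.execute("INSERT INTO probe (note) VALUES (?)", ("hello",))
            conn.commit()
        finally:
            conn.close()
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        # Remove the probe database and any journal files left behind.
        for suffix in ("", "-journal", "-wal", "-shm"):
            try:
                os.remove(path + suffix)
            except OSError:
                pass
```

Calling `probe_sqlite_dir("/tmp")` and then the same function on the directory under /nfs should show whether the failure follows the filesystem rather than the application code.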

howff commented 1 year ago

Upgrading sqlite did not appear to make any difference.

Using beegfs or a local filesystem was successful (didn't try nimble).

Conclusion: for best results don't use the `/nfs` mount. (I've used it successfully in the past so I don't know why it's started misbehaving now, but that's the only conclusion we can draw right now).
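If it is useful to warn before putting the database on an NFS mount, a check along these lines could be used. This is a rough sketch under assumptions: it is Linux-only, it reads /proc/mounts, it does not resolve symlinks or handle escaped spaces in mount points, and the helper names `fs_type_for` and `is_on_nfs` are hypothetical rather than anything in dicompixelanon.

```python
import os


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing `path`.

    `mounts_text` is expected to be in /proc/mounts format:
    device, mount point, fs type, options, ...
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # Compare with trailing slashes so /nfs does not match /nfsother.
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_nfs(path, mounts_text=None):
    """True if `path` appears to live on an nfs/nfs4 mount (Linux only)."""
    if mounts_text is None:
        with open("/proc/mounts") as f:
            mounts_text = f.read()
    fs_type = fs_type_for(path, mounts_text)
    return fs_type is not None and fs_type.startswith("nfs")
```

A caller could then print a warning, or fall back to a local directory, when `is_on_nfs(dbdir)` is true.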

howff commented 1 year ago

Actually this seems to be a problem only on nsh-smi04. It's a general problem on that host, not just with this function, and there are lots of nfs4 lock error messages in syslog.