database corrupt issue with file system watcher

rspython commented 1 month ago

I'm using the file system watch feature (great feature, thank you). The tile server is hosted on one Linux machine, the tiles are generated on a different Linux machine, and the disk is an Azure shared storage disk between them, mounted using CIFS. When an mbtiles file has been overwritten on the generating server and requests are coming in from the client, the watcher gives errors such as the following. We get 500 errors to the client for a period of time, with this in the logs, then it rights itself. I would have expected it to temporarily block the requests, but maybe it's struggling because it's network-mounted storage?

(Note: the server generating the tiles does create the files in a staging folder on local storage first, then moves them into the shared folder, so I'm following what the documentation says about not creating the files in the watched folder.)

Jul 09 04:25:02 testserver sh[2300230]: time="2024-07-09T04:25:02Z" level=error msg="cannot fetch tile from DB for z=7, x=62, y=86 at path /services/speed-heatmap/tiles/7/62/41.pbf: sqlite.Stmt.Step: SQLITE_CORRUPT: database disk image is malformed (select tile_data from tiles where zoom_level = $z and tile_column = $x and tile_row = $y)"

brendan-ward commented 1 month ago

mbtileserver expects the entire mbtiles file (sqlite database) to be valid and intact at all times. I'm wondering if there is some latency while you are moving them to a shared folder that is causing the sqlite database to be only partially present while incoming requests are trying to fetch tiles. Depending on the OS / filesystem, I recall that the notifications about adding / updating files on the file system may not fire until after the file is completely updated, so it may not be firing those in time - and I don't know that it matters in this case anyway.
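
One way to test the "only partially present" theory is to compare the file's size on disk with the size recorded in the SQLite file header. The sketch below is not part of mbtileserver; the function names `expectedSQLiteSize` and `looksComplete` are made up for illustration. It relies on the documented header layout: page size at byte offset 16, database size in pages at offset 28, and the rule that the in-header size is only trustworthy when it is nonzero and the change counter (offset 24) matches the version-valid-for number (offset 92).

```go
package main

import (
	"encoding/binary"
	"errors"
	"io"
	"os"
)

// sqliteMagic is the 16-byte string at the start of every SQLite 3 file.
const sqliteMagic = "SQLite format 3\x00"

// expectedSQLiteSize returns the file size in bytes implied by a SQLite
// header (the first 100 bytes of the file). Hypothetical helper.
func expectedSQLiteSize(header []byte) (int64, error) {
	if len(header) < 100 || string(header[:16]) != sqliteMagic {
		return 0, errors.New("not a SQLite 3 header")
	}
	pageSize := int64(binary.BigEndian.Uint16(header[16:18]))
	if pageSize == 1 {
		pageSize = 65536 // the value 1 encodes a 64 KiB page
	}
	pageCount := int64(binary.BigEndian.Uint32(header[28:32]))
	changeCounter := binary.BigEndian.Uint32(header[24:28])
	validFor := binary.BigEndian.Uint32(header[92:96])
	if pageCount == 0 || changeCounter != validFor {
		return 0, errors.New("in-header database size is not valid")
	}
	return pageSize * pageCount, nil
}

// looksComplete reports whether the file at path is at least as large as
// its own header claims. A false result suggests a copy still in progress.
func looksComplete(path string) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	header := make([]byte, 100)
	if _, err := io.ReadFull(f, header); err != nil {
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return false, nil // shorter than a header: clearly incomplete
		}
		return false, err
	}
	want, err := expectedSQLiteSize(header)
	if err != nil {
		return false, err
	}
	info, err := f.Stat()
	if err != nil {
		return false, err
	}
	return info.Size() >= want, nil
}
```

Calling `looksComplete` on the mbtiles path from a small diagnostic tool while a new file is being moved in would show whether the file is ever visible in a truncated state. It only checks length, so it cannot detect pages that are present but stale.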

For this issue, it sounds like you are updating existing tilesets and not necessarily leveraging the reload part itself, so the filesystem reload feature may be less useful: that is more about restarting the connection to the tileset and updating metadata in this case.

Since you are on network-attached storage, there probably isn't much you can do to get the entire file present at a path sooner: e.g., moving it to a staging folder on the network storage and then moving it from there to the final folder likely has a ton more latency than moving it on your local filesystem. Can you instead store them (in their final location) in a volume directly on the server, and then do something like: generate tiles on one server, copy to shared network file storage, copy from network file storage to a volume attached to the mbtileserver server?

rspython commented 1 month ago

It's only 'technically' a shared folder. As far as each Linux machine is concerned, it's just a mounted filesystem/folder, the same as any other folder on the server, and we're just overwriting existing mbtiles files with new ones when we have generated a new set of data. It's the 'underlying' CIFS mechanism that makes it a 'shared folder'. However, I do take your point about file system notifications etc., as they may behave differently under a CIFS mount.

How about when the mbtiles library is called to read the tiles, if it returns a sql error (and if you can catch that as a corrupt db type error) you put in a little pause/retry logic (if file system watch is enabled maybe?) to say okay, maybe the file is being updated, give it a couple of seconds, block the call, and try again. If it's okay, carry on; only if it still fails, give the 500 error? That might give it some built-in resilience for that type of problem.

I'm a 20+ year Java expert, so I can 'read' the Go and understand the principles and logic, but I've never really done Go, so I wouldn't want to try and write a pull request for what I've suggested :)

brendan-ward commented 1 month ago

> mbtiles library is called to read the tiles, if it returns a sql error (and if you can catch that as a corrupt db type error) you put in a little pause/retry logic (if file system watch is enabled maybe?) to say okay, maybe the file is being updated, give it a couple of seconds, block the call, and try again

Good suggestion and I think it might be possible. I don't have time to get to this in the short term but I'll log it as an issue in the mbtiles library.
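
For reference, the pause/retry idea could look roughly like the sketch below. It is not the mbtiles library's code: `readTileWithRetry`, `isCorrupt`, and `errCorruptSample` are hypothetical names, and detecting corruption by matching the `SQLITE_CORRUPT` text from the log line above is an assumption (the underlying sqlite package may expose a typed error code that would be a better check).

```go
package main

import (
	"errors"
	"strings"
	"time"
)

// errCorruptSample mirrors the message seen in the log above. It exists
// only so the sketch can be exercised without a real database.
var errCorruptSample = errors.New("sqlite.Stmt.Step: SQLITE_CORRUPT: database disk image is malformed")

// isCorrupt reports whether err looks like a SQLite corruption error.
// Assumption: string matching stands in for a proper error-code check.
func isCorrupt(err error) bool {
	return err != nil && strings.Contains(err.Error(), "SQLITE_CORRUPT")
}

// readTileWithRetry calls fetch once and, while the result is a corruption
// error, waits for delay and tries again, up to retries extra attempts.
// Any other error, or success, is returned immediately.
func readTileWithRetry(fetch func() ([]byte, error), retries int, delay time.Duration) ([]byte, error) {
	data, err := fetch()
	for attempt := 0; attempt < retries && isCorrupt(err); attempt++ {
		time.Sleep(delay) // block the request while the file may still be updating
		data, err = fetch()
	}
	return data, err
}
```

The HTTP handler would return the 500 only when `readTileWithRetry` still reports an error after the last attempt. A real implementation might also need to reopen the database connection between attempts if the file was replaced underneath it; that part is not shown here.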

brendan-ward commented 4 weeks ago

Is it possible to do a lightweight file move on your network file system? That is, the transfer to the filesystem may be the slow part, while moves within it are fast. That would enable you to transfer the file to a staging folder on the filesystem that isn't being watched by mbtileserver and then, once it is fully there, move it to the location that is watched by mbtileserver.

consbio / mbtileserver

database corrupt issue with file system watcher #180