datalogistics / ibp_server

Update init script so the RIDs unmount cleanly. #20

Closed disprosium8 closed 8 years ago

disprosium8 commented 9 years ago

This fix should prevent the DBs from being rebuilt on start/stop and restart.

PerilousApricot commented 9 years ago

Hold off on this a sec, I think rebuilding the DB is The Right Thing To Do.

disprosium8 commented 9 years ago

That would be odd, because what's the point of this then?

https://github.com/datalogistics/ibp_server/blob/master/ibp_server.c#L486

There are no handlers for SIGTERM or SIGKILL, and I think that has the potential to leave the DB in a bad state.
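To make the handler point concrete, here is a minimal sketch of catching SIGTERM and SIGQUIT so the main loop gets a chance to shut down cleanly. It is not the code in ibp_server.c: the names shutdown_requested, on_term, install_shutdown_handlers and shutdown_was_requested are hypothetical. Note that SIGKILL cannot be caught at all, so only the catchable signals are covered.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag polled by the main loop; the real depot's
 * shutdown path may look quite different. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Handler does only async-signal-safe work: it sets a flag. */
static void on_term(int signo)
{
    (void)signo;
    shutdown_requested = 1;
}

/* Install the handler for SIGTERM and SIGQUIT.
 * Returns 0 on success, -1 if sigaction() fails. */
int install_shutdown_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_term;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGQUIT, &sa, NULL) != 0)
        return -1;
    return 0;
}

/* Non-zero once a termination signal has been received. */
int shutdown_was_requested(void)
{
    return shutdown_requested != 0;
}
```

The main loop would check shutdown_was_requested() between requests and then flush caches, close connections, and unmount the RIDs from normal (non-signal) context.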

PerilousApricot commented 9 years ago

Unfortunately, past experience has shown the DB to be in a bad state even under the best of circumstances. The depot itself now stores per-allocation metadata in the first handful of KB of each allocation, rendering the external DB vestigial. Since the canonical information about the allocations is within the allocation itself, it's safer to index from there instead of using the external DB. Once I get some cycles, it'll get culled out at the same time I make another pass at the LevelDB implementation. There is interest in using LStore to store many many many small (<4kB) files, so those two things are what will get us there.

RE the shutdown code: The database and cleanly shutting down are orthogonal to each other. You still want to give the depot a chance to flush caches, close client connections, etc., regardless of whether you keep the DB or not. (Though I guess you could have some philosophical argument about how much effort the IBP best-effort model entails.)

@tacketar - Am I roughly correct?

PerilousApricot commented 9 years ago

Sorry, I completely whiffed on this -- you're enabling it to cleanly shutdown, which is good.

If you do this, I suggest removing the database on start. That way the depot won't re-use it, and will walk the allocations. I'm 95% sure that's what we do @ Vandy.

disprosium8 commented 9 years ago

Ok - that makes sense, and it sounds like we do want to use the shutdown handler, but we still don't have guarantees that the DB will be consistent. So the solution for now is to force the rebuild.

This came up because both PR and our group noticed it could take on the order of an hour to rebuild on restart if the depot was already loaded with 100s of GBs of data. So not the most admin-friendly thing, and difficult to explain away in a set of install instructions.

I'd like to have the DLT install eventually use leveldb exclusively and avoid the BDB nonsense but I haven't had the time to integrate and test that config. Are there some HOWTO docs on getting leveldb up and running with ibp_server?

PerilousApricot commented 9 years ago

I'll have to defer to the others on how long it should take to come back up. If it takes you O(1 hour) to reload O(100GB), and our depots store O(10TB), that would imply our depots take O(100 hours) to come up, which seems preposterous to me.

RE LevelDB: It "works" in that the code compiles and some real basic tests using a LevelDB RID with LStore could bring data in and out, but I wouldn't consider it for production without some more extensive testing. For testing purposes, you can set the "type" for mkfs_resource to "leveldb" and that RID will use the leveldb backend.

Unfortunately that doesn't quite dodge the BDB stuff. The BDB stuff is on top of the osd layer, so regardless of the backend, it's still going to try and keep it around. Which is obviously silly because the osd layer itself is a database. The leveldb and BDB stuff are coupled in my mind because fixing one means deleting the other, if that makes sense.

PerilousApricot commented 9 years ago

Hunted it down. Our depots with 36x4TB drives come back up in ~30 minutes or so (the rebuilding runs in parallel per-RID, btw). If your depots with (presumably) smaller drives/less data are taking longer than ours, something is strange. How many allocations do you have per-RID? We've got ~15k per drive or so.

PerilousApricot commented 8 years ago

@disprosium8 Any word on this?

disprosium8 commented 8 years ago

I haven't had a chance to look into this further, I'm on travel this week. We can start/stop a depot on one of the IU servers soon to see how long it's actually taking for us. I just remember it seemed like too long to wait...

PerilousApricot commented 8 years ago

Cool. Let me know when you get a chance to look at it. I'm curious to see if we can narrow down the difference between the two.

At the very least, if the restart takes too long and there's not an easy fix, maybe adding a progress meter if there's a TTY attached will at least give admins a clue of how long they can expect to wait...
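A rough sketch of what such a progress meter could look like, printing only when the output stream is a terminal. Everything here is an assumption for illustration: format_progress and report_progress are hypothetical names, and the rebuild code would have to supply the per-RID allocation counts.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Format "RID <rid>: <done>/<total> allocations (<pct>%)" into buf.
 * Returns whatever snprintf() returns. An empty RID counts as 100%. */
int format_progress(char *buf, size_t len, int rid,
                    unsigned long done, unsigned long total)
{
    unsigned pct = total ? (unsigned)((done * 100UL) / total) : 100u;

    return snprintf(buf, len, "RID %d: %lu/%lu allocations (%u%%)",
                    rid, done, total, pct);
}

/* Redraw a one-line progress meter, but only if `out` is a TTY,
 * so log files and init scripts are not filled with noise. */
void report_progress(FILE *out, int rid,
                     unsigned long done, unsigned long total)
{
    char line[96];

    if (!isatty(fileno(out)))
        return;

    format_progress(line, sizeof(line), rid, done, total);
    fprintf(out, "\r%s", line);      /* overwrite the previous update */
    if (done >= total)
        fputc('\n', out);            /* finish the line when complete */
    fflush(out);
}
```

Since the rebuild runs in parallel per RID, a real version would need one line per RID or some locking around the output; this sketch ignores that.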

disprosium8 commented 8 years ago

Getting back to this, I just restarted one of our depots with ~300G of allocations. It took about 25 minutes to become reachable again. A kill -SIGQUIT caused another error at shutdown...

# bin/ibp_server etc/ibp.cfg
Config file: etc/ibp.cfg

unable to join the environment
RID 0 not cleanly unmounted!  Forcing a rebuild!
*** Error in `bin/ibp_server': munmap_chunk(): invalid pointer: 0x000000000198a930 ***
Aborted

At some point this needs to be audited and a better solution found, but I suppose for now sending SIGKILL is fine as it seems rebuilding is the "safe" way to go.

PerilousApricot commented 8 years ago

I suspect that might be a permissions problem with trying to load up BDB: "unable to join the environment". It's on the list of "things to strip out".

tacketar commented 8 years ago

Yes. BDB is crap and is on the list to cull. I used to constantly upgrade BDB to the latest hoping it would fix things. It never did.

Alan

disprosium8 commented 8 years ago

Leaving proper shutdown of DB as-is until BDB can be culled.