Closed patrickfreed closed 4 years ago
Are you sure that's the case? The log and db files seem to stick around for me after I shut the cluster down. When I start up a new cluster the old ones are deleted, however.
Yes the order is:
When I run mongo-orchestration stop
, the log files are gone after the daemon exits.
Hm, something must be messed up on my end. Well, given that, I don't think we can merge this patch, since it would seemingly break everyone's log storage. I think as you suggest we'll have to take an approach outside of mongo-orchestration.
I originally planned on doing this by adding a separate script to drivers-evergreen-tools that issues a `curl DELETE; doing that wouldn't cause mo to delete the log/data files too, right?
Also, what do you think about like a --preserveMongoData
option that would toggle that cleanup behavior?
DELETE would also remove the log files. MO deletes the files anytime it shuts down a cluster:
head /var/folders/lm/b1r2f8p503xg40r6x2rqv7fr0000gp/T/mongo-n3d4Jr/mongod.log
2020-01-21T14:36:25.702-0800 I CONTROL [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2020-01-21T14:36:25.703-0800 W ASIO [main] No TransportLayer configured during NetworkInterface startup
2020-01-21T14:36:25.708-0800 W ASIO [main] No TransportLayer configured during NetworkInterface startup
2020-01-21T14:36:25.709-0800 D1 NETWORK [main] fd limit hard:524288 soft:262144 max conn: 209715
2020-01-21T14:36:25.709-0800 I CONTROL [initandlisten] MongoDB starting : pid=35243 port=27017 dbpath=/var/folders/lm/b1r2f8p503xg40r6x2rqv7fr0000gp/T/mongo-n3d4Jr 64-bit host=Shanes-MacBook-Pro-2.local
2020-01-21T14:36:25.709-0800 I CONTROL [initandlisten] db version v4.3.2-594-g24a8f59
2020-01-21T14:36:25.709-0800 I CONTROL [initandlisten] git version: 24a8f59e21490f4dd101224ba01872d773c7f25e
2020-01-21T14:36:25.709-0800 I CONTROL [initandlisten] allocator: system
2020-01-21T14:36:25.709-0800 I CONTROL [initandlisten] modules: enterprise
2020-01-21T14:36:25.709-0800 I CONTROL [initandlisten] build environment:
$ curl -XDELETE http://localhost:8889/servers/e82cbabf-e6e8-406a-be80-2e8ee9c0fcac
$ head /var/folders/lm/b1r2f8p503xg40r6x2rqv7fr0000gp/T/mongo-n3d4Jr/mongod.log
head: /var/folders/lm/b1r2f8p503xg40r6x2rqv7fr0000gp/T/mongo-n3d4Jr/mongod.log: No such file or directory
I think a simpler solution might be to copy the files you're interested in before uploading them. Would that avoid the tar failures you're seeing?
Well, given that, I don't think we can merge this patch, since it would seemingly break everyone's log storage
It wouldn't break Python's log upload task because we upload the logs before shutting down mongo-orchestration:
- func: "upload mo artifacts"
- func: "upload test results"
- func: "stop mongo-orchestration"
@ShaneHarvey I think the whole problem is that those files might still be written to, and tar
would error out because bytes were changing while it was trying to do its job. I think its reasonable that we try to find a more comprehensive solution to this problem.
Here's a patch confirming what @mbroadst said. I think most drivers run "upload-mo-artifacts" in post
(which suppresses failures), so they don't see these errors. I was hoping that the logs being append-only while upload-mo-artifacts was being run would prevent the issue, but that appears to not be the case.
Yeah I think it's fine to not cleanup the files after shutting down the clusters in the cleanup_storage signal handler.
So I discovered why the logs were apparently not being deleted on my machine:
In the code snippet you pasted above (cleanup_mprocess
), it attempts to delete cfg["logPath"]
, but the log path is actually stored at cfg["logpath"]
, not in camel case (see here). By default, the log files are written to dbpath/mongod.log
, so if you don't specify a log path it does usually get deleted. The configurations I use specify log paths, so the logs don't get deleted.
How about we make this behavior official and remove the attempted log path deletion from cleanup? The nice thing is since this won't have any behavioral changes (this bug has been in there for 7+ years), we know we won't break any evergreen configs. Then I won't need to modify anything else and can just use the blocking stop changes already made thus far in combination with a custom log path.
SGTM. Can you open a follow-up issue to not delete the default logpath (dbpath/mongod.log
) on shutdown too?
One note. I've noticed that shutting down clusters can take a long time, especially with newer server versions. If that proves to be the case in Evergreen, we may need to optimize stop
to shutdown the clusters forcefully with SIGTERM instead of using the shutdown
command.
I wasn't able to run the entire test suite locally due to similar issues found in #273, but I did run drivers-evergreen-tools and the swift driver's evergreen's with this patch and they passed fine. As you pointed out, the tasks that use latest sharded clusters can take up to 30s longer now that it waits for shutdown, but in my experience that is usually only a small percentage of the total time a task takes to run. Do you think that's sufficient testing or should I try some more drivers out?
Yes I think that's fine. I opened https://github.com/10gen/mongo-orchestration/issues/278 to make mongo-orchestration stop
faster.
It looks like I don't have write access to this repository, so could you squash/merge it for me?
Right now,
mongo-orchestration stop
on non-Windows platforms simply sends a signal to the mongo-orchestration daemon and exits immediately. In the background, the mongo child processes and the daemon itself will shut down after a brief period.This PR updates the "stop" command to wait until the daemon has fully shut down before returning, effectively causing
mongo-orchestration stop
to block until both the daemon and the cluster are no longer running.As part of this change, I added a dependency on
psutil
to facilitate polling whether the process is still alive or not. If adding a dependency is considered costly, that functionality could be reimplemented in mongo-orchestration. The library does seem like it could be useful in other areas of the codebase, however.I was having trouble running the full test suite (with or without my changes) due to
test_replica_sets.ReplicaSetAuthTestCase
hanging, so unfortunately I do not know if this patch passes completely. I did test it manually and it seems to work, though.