Open daltonbohning opened 4 years ago
@adammoody @dsikich
This really isn't extremely important, but rather a sort of "wish list" item to make dcp
even more robust.
I would appreciate any and all feedback on this.
Thanks, @daltonbohning . This all sounds good to me.
I generally like to declare variables close to where they are used, since it often makes the code easier to read. However, I'm fine with declaring variables at the top if there is a good reason to, and this would be a good reason.
I'm fine with either approach on the DAOS cleanup, too, so which ever way you think is best is good.
At this point, the DAOS code has been abstracted enough that we can probably put most, if not all, of the cleanup in daos_cleanup()
Probably a good idea to do this along with #391, since the code paths will likely need to be retested again anyway.
dcp has some early exits for various errors, invalid arguments, etc. Some of these are DAOS-related, and some are DAOS-agnostic. In either case, it would be good practice to always free all memory and resource that have been allocated. Essentially, anywhere
dcp.main()
has areturn
, it would be a good idea to try to cleanup any resources that were in use at the time. Some of these may be harmless, such as strings not being deallocated properly, which the OS might cleanup automatically upon exitingmain()
, but for the DAOSdfs
mounts in particular, there could be other resources in use at the time.For DAOS-agnostic code This means paths, file list, copy options, etc. The tricky part is to avoid necessarily "copy-pasting" all of the cleanup, since there are currently ~8
mfu_free
and*_delete
lines at the end ofdcp.c
, plus themfu_finalize
andMPI_Finalize
, and not all of the variables are in scope at every point of return. A possible solution could be to declare all of these variables at the top of main so they are always in scope. Then, we could either have a function that deletes all of these, or use agoto
.For DAOS-specific code This means anything in
daos_cleanup
. The tricky part here is that we might not know which parts need to be cleaned up at any given time. For example, if we fail to connect to the source pool, we will print an error and exit. But if we simply calldaos_cleanup
here, it might also try to unmount the destinationdfs
, disconnect from the destination pool, and close the destination container. But if we couldn't connect to the source pool, then all three of these things might fail, which would result in (1) extra time spent trying to free something that was never allocated (network calls) and (2) extraneous error messages for each failure, which might clutter the error output. Unfortunately, not all of these things can simply use aNULL
check. For example,daos_handle_t dst_coh
is never NULL, as the correspondingdaos_*
function expects it to not beNULL
. Possible solutions for this sort of thing could be: (1) If, for example, we fail to connect to the source pool, then set the source pool handle toNULL
. Thendaos_cleanup
can perform aNULL
check. The drawback here is that we would have to be careful with when/how we set these things toNULL
, and it should be done in a clean way. (2) We could store abool
for each variable/structure that has been successfully allocated, assuming a faileddaos_*
call means it is NOT allocated, and a successfuldaos_*
call means it WAS allocated. Then, we could check thesebools
indaos_cleanup
.I am more partial to (1), since it would potentially be a cleaner, more standard way to keep track of what is and is not allocated.