Open mkavulich opened 3 months ago
Just pinging this high-priority issue after more than two months, can we get someone to work on this? I can not emphasize enough that this script should not be used by anyone in its current state, nevermind recommended in the users guide as a way to solve unexpected build problems!
@MichaelLueken May I open a PR to remove the script and the related documentation? It's easy enough to remove the documentation, but I figured a developer would make a decision to remove or revise the script. However, it seems like if we remove it, anyone who has an issue with that could fairly easily restore and update it. In the meantime, it seems prudent to just remove the problem until someone is interested in doing that.
@gspetro-NOAA My concern with removing the script is that @natalie-perlin has noted that this feature was a stakeholder request. While it is most prudent to directly address this issue (opening a PR to remove the script and associated documentation), removing it to add it back in is a waste of resources.
@natalie-perlin - would you be able to clean up the devclean.sh
script, or should we go ahead and remove it for the time being?
If it's actively dangerous, I'd assume that is decidedly NOT what a stakeholder would want.
Removing it to protect user data seems like a higher priority than being able to resurrect a safer option that likely doesn't look like this script at all.
The script is only dangerous if the user explicitly points to directories outside of their local ufs-srweather-app
copy.
I can certainly agree with @mkavulich that removal of user input for BIN_DIR
, BUILD_DIR
, CONDA_DIR
, and INSTALL_DIR
and only clear content within the user's local copy will reduce the likelihood of developers accidentally deleting things. I can remove BIN_DIR
, BUILD_DIR
, CONDA_DIR
, and INSTALL_DIR
and only remove build
, conda
, exec
, include
, and share
directories through in the ufs-srweather-app
local copy. I can also check for the existence of the directories before removing them, to stay on the safe side. If this will suffice, I can include this fix in my PR #1096.
If this isn't enough, then @gspetro-NOAA, please feel free to remove the devclean.sh
script and the documentation related to it.
Let me try to address the concerns of the script, and add any checks that may mitigate the risks.
This is script is especially needed for new users who start exploring the code, and usually go through several build issues. Including more documentation on how to do it by hand is always an option, and it could be done in addition to the script, too.
@mkavulich @MichaelLueken
I'm implementing a following function remove_directory
as a part of the devclean.sh script, as shown below. It is called when a directory is expected to be removed. The function takes two arguments, $1 is a directory to be removed, and $2 is the main SRW directory. The function checks for a directory existence, whether is contains any wildcards (not removed, suggested user to use a command line for this task), and whether it is under SRW directory tree. If it is not immediately under the SRW tree or outside the tree, a user is asked for a confirmation.
It could be included in my PR-1089, or made as a separate PR.
Please feel free to comment!
# remove_directory function
remove_directory () {
local dir1=$1
local srw1=$2
# check if a directory exists
if [ -d "${dir1}" ] ; then
# check if a directory name contain wildcards
if [ "${dir1}" = "${dir1//[\[\]|.? +*]/}" ] ; then
# check + remove only if a directory is immediately under the SRW directory tree
if [ "${srw1}" = "$(dirname -- "${dir1}")" ] ; then
rm -rf "${dir1}" && printf '%s\n' "...removed ${dir1}"
else
# a directory is outside the SRW tree, request to confirm its removal
local basename1=$(basename -- ${dir1})
local dirname1=$(dirname -- ${dir1})
printf '%s\n' "You requested to remove ${dir1} that is outside SRW directory tree, please confirm below."
read -p "Remove a directory ${dir1} (Y/N)? " confirm
echo "confirm = $confirm"
case "$confirm" in
[yY][eE][sS]|[yY])
rm -rf "${dir1}" && printf '%s\n' "...removed ${dir1}"
;;
*)
printf '%s\n' "Directory ${dir1} is not removed"
;;
esac
fi
else
# a directory name contains wildcards or special characters, do not remove
printf '%s\n' "A directory name ${dir1} contains wildcards [ ] \| . ? + *, or a blank character"
printf '%s\n' "It cannot be removed in this script as a precaution. Please remove it from a command line if needed"
fi
fi
exit 0
}
Updated a #1089, added a modified devbuild.sh with safety checks https://github.com/ufs-community/ufs-srweather-app/blob/1824cec2286c028108dddee916683b99df65302a/devclean.sh
@mkavulich
Sorry for being difficult, but I don't like this solution either. Why do we need to give the users a choice for what directories to delete? If users want to specify some custom directory to delete, they can just delete it! Get rid of these options entirely, simply hard-code the directories and files that need to be deleted. No need to bother with input arguments and checking for wildcards: all of this complexity just leads to more potential failure modes.
Adding a link to comments on the issue: https://github.com/ufs-community/ufs-srweather-app/pull/1089#discussion_r1658838137
Expected behavior
Forced, recursive removal of directories via bash scripts (
rm -rf
) should be handled very carefully. Any potential mishandling of input, whether it be from unexpected behavior of wildcards, command-line typos, undefined variables, or even intentional malfeasance, can result in very destructive results when therm -rf
command is used.Current behavior
The devclean.sh has several instances of
rm -rf
:[[ -d ${BUILD_DIR} ]] && rm -rf ${BUILD_DIR} && printf '%s\n' "rm -rf ${BUILD_DIR}"
[[ -d "${dir}" ]] && rm -rfv ${dir}
for sub in ${submodules[@]}; do [[ -d "${sub}" ]] && ( rm -rf ${sub} && printf '%s\n' "rm -rf ${sub}" ); done
rm -rf ${SRW_DIR}/conda_loc
rm -rf ${CONDA_DIR}
The first and last of these include the final boss of bad UNIX practices: they run
rm -rf
on a variable from a command-line argument that is not checked for existence prior to being used. It is also not invoked in quotes, so any wildcards will pass straight through.This is extremely unsafe behavior and should be eliminated ASAP
Machines affected
All
Steps To Reproduce
DO NOT TRY TO REPRODUCE THIS. Unless you like loss of data.
Detailed Description of Fix
rm -rf
commands should be handled with the caution of a lethal weapon. When they are necessary, they must be crafted very carefully, with explicit checks on the arguments to ensure that they 1. exist, and 2. can only correspond to expected results.My preferred solution? Delete this script forever and replace it with proper documentation on starting a clean build. All the script does is remove directories, and this is a responsibility that should be handed to users, not handled opaquely by bash scripts.
At the very minimum, if we are to keep this script, it should be restricted to only removing directories within the current tree. The ability to specify custom directory locations (specifically, the
--install-dir
,--build-dir
,--bin-dir
, and--conda-dir
arguments) is unsafe and unnecessary.Additional Information
Lest you think my description of this problem is over-dramatic, here is a real-world example from a much more widely used software package: https://github.com/ValveSoftware/steam-for-linux/issues/3671