Before starting the preprocessing step, raise an error if the dataset_name already exists. This prevents accidental overwrites and conflicts.
Ensure that the dataset_name is saved only if the preprocessing step completes successfully. This will prevent partial or failed datasets from being saved.
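The two dataset_name requirements above can be sketched together as follows. This is only an illustration under stated assumptions: the in-memory `registry` set stands in for whatever store the project actually uses, and `DatasetExistsError`, `preprocess_dataset`, and `run_preprocessing` are hypothetical names, not taken from the existing code.

```python
from typing import Callable


class DatasetExistsError(ValueError):
    """Raised when a dataset name is already registered (hypothetical name)."""


def preprocess_dataset(
    dataset_name: str,
    registry: set[str],  # assumption: stands in for the real dataset store
    run_preprocessing: Callable[[], None],  # assumption: the real pipeline step
) -> None:
    # Fail fast, before any work is done, if the name is already taken.
    if dataset_name in registry:
        raise DatasetExistsError(f"Dataset {dataset_name!r} already exists.")

    # If this call raises, the line below never runs, so nothing is saved.
    run_preprocessing()

    # Record the name only after preprocessing finished without an error.
    registry.add(dataset_name)
```

With a real database, the same ordering applies: check first, run the step, and write the name last (or inside a transaction that is rolled back on failure).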
Fix the code to process images located within subdirectories of the dataset directory. Currently, only files in the root directory are processed.
Validate that each input file is a valid image before processing it. This ensures that only image files are processed and avoids errors caused by invalid file types.
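One possible shape for the subdirectory traversal and the image check described above is sketched here. The function names are hypothetical, and the check compares the file's leading bytes against a short, non-exhaustive list of format signatures. That is a cheap first filter using only the standard library, not proof that the file decodes; a project that already depends on an image library may prefer to validate by opening the file with it.

```python
from pathlib import Path

# Leading bytes of a few common image formats (assumption: not exhaustive).
IMAGE_SIGNATURES = (
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"\xff\xd8\xff",       # JPEG
    b"GIF87a",             # GIF
    b"GIF89a",             # GIF
    b"BM",                 # BMP
)


def looks_like_image(path: Path) -> bool:
    """Return True if the file starts with one of the known signatures."""
    try:
        with path.open("rb") as f:
            header = f.read(8)
    except OSError:
        # Unreadable files are treated as invalid rather than crashing the run.
        return False
    return header.startswith(IMAGE_SIGNATURES)


def find_images(dataset_dir: Path) -> list[Path]:
    """Collect image files from dataset_dir and all of its subdirectories."""
    return sorted(
        p for p in dataset_dir.rglob("*") if p.is_file() and looks_like_image(p)
    )
```

`rglob("*")` walks the whole tree, so files in nested folders are included alongside those in the root directory.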
Check whether the input directory is empty before proceeding with preprocessing. If the directory is empty, return an appropriate error message.
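A minimal sketch of the empty-directory check follows. The name `ensure_not_empty` is hypothetical, and two choices here are assumptions rather than requirements from the text above: the function raises a `ValueError` carrying the message (the caller can turn it into a returned message), and a directory that contains only empty subdirectories is treated as empty.

```python
from pathlib import Path


def ensure_not_empty(input_dir: Path) -> None:
    """Raise ValueError if input_dir is missing or has no files at any depth."""
    if not input_dir.is_dir():
        raise ValueError(f"Input directory not found: {input_dir}")
    # any() stops at the first file found, so large directories stay cheap.
    if not any(p.is_file() for p in input_dir.rglob("*")):
        raise ValueError(f"Input directory is empty: {input_dir}")
```

Calling this before any preprocessing work means an empty input is reported immediately instead of producing an empty dataset.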
Ensure that the text query input is neither empty nor None. If it is either, raise an error or prompt the user to provide valid input.
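The query check above might look like the sketch below. `validate_query` is a hypothetical name, and treating a whitespace-only string as empty (and returning the stripped query) is an assumption that goes slightly beyond the stated requirement.

```python
from typing import Optional


def validate_query(query: Optional[str]) -> str:
    """Return the stripped query, or raise ValueError if it is None or blank."""
    # Assumption: a whitespace-only query is treated the same as an empty one.
    if query is None or not query.strip():
        raise ValueError("Text query must be a non-empty string.")
    return query.strip()
```

An interactive front end could catch the `ValueError` and prompt the user again instead of aborting.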
Update the code to automatically trim leading and trailing spaces from the dataset_name input provided by the user. This ensures consistency and prevents errors caused by unintended spaces in dataset names during processing or database operations.
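Trimming the name could be done in one small helper applied before any lookup or save, as sketched below. `normalize_dataset_name` is a hypothetical name. Note that `str.strip()` removes all leading and trailing whitespace (tabs and newlines as well as spaces), and the rejection of a name that is empty after trimming is an added assumption.

```python
def normalize_dataset_name(raw_name: str) -> str:
    """Strip surrounding whitespace from a user-supplied dataset name."""
    name = raw_name.strip()
    # Assumption: a name that is empty after trimming is rejected.
    if not name:
        raise ValueError("dataset_name must not be empty.")
    return name
```

Normalizing once at the input boundary keeps " cats" and "cats" from being stored as two different datasets.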