EOGrady21 / vprr

Video Plankton Recorder Data Processing
https://eogrady21.github.io/vprr/

file check - deleting empty aid and aidmea files with vpr_autoid_check() #21

Closed kevinsorochan closed 2 years ago

kevinsorochan commented 3 years ago

The function vpr_autoid_check() deletes files that it identifies as empty while it processes the folders file by file. This is problematic: if the checking is halted by an error, the aid and aidmea folders have already been permanently modified (the files identified as empty are gone), which makes debugging harder. Could the empty files be deleted after all of the checks have been made, rather than in the process of checking each file sequentially?

A huge amount of effort goes into the generation of the aid and aidmea files. I think that users should be warned to back up these files in a separate directory prior to running the check so that the original version of these files is preserved.
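For illustration, the backup step could be as small as the sketch below. It uses only base R; `backup_dir()` and the folder names in the usage comments are hypothetical and not part of vprr.

```r
# Hypothetical helper (not part of vprr): copy a whole folder into a backup
# location before running any check that may delete files.
backup_dir <- function(src, backup_root) {
  stopifnot(dir.exists(src))
  dir.create(backup_root, recursive = TRUE, showWarnings = FALSE)
  # With recursive = TRUE and an existing directory as `to`, file.copy()
  # copies the folder `src` (and its contents) into `backup_root`.
  ok <- file.copy(from = src, to = backup_root, recursive = TRUE)
  if (!all(ok)) stop("Backup of ", src, " failed")
  # Path of the copied folder
  file.path(backup_root, basename(src))
}

# Usage (assumed folder names):
# backup_dir("autoid/aid", "autoid_backup")
# backup_dir("autoid/aidmea", "autoid_backup")
```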

EOGrady21 commented 2 years ago

Okay a couple points here....

  1. Was there a case where non-empty files got deleted? If so could you attempt to describe the file? (or even send an example?) So I can modify the function so it does not delete any non-empty files.
  2. The current function has a WARNING in the help doc ("WARNING: This function will delete empty aid and aidmeas files, permanently changing your directory. Consider making a back up copy before running this function."). Do you think it would be helpful to include a warning and confirmation step within the function? So when you run the function you would have to manually input 'Y' and confirm the deletion of files? Or would this be too much of a hassle?
  3. The files are deleted as a chunk from each folder so in theory there should be minimal errors 'mid-deletion'. Maybe we need a way to bypass the deletion check? The reason it is run first on each folder is that the other checks do not waste time/ throw errors on empty files. We could put in an argument that allows a user to not delete any files and instead just outputs warnings about empty files?
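As a rough sketch of the last idea in point 3, an argument could switch between deleting empty files and only warning about them. This is not the current vpr_autoid_check() code: `handle_empty_files()` is a hypothetical name, and "empty" is assumed here to mean a zero-byte file.

```r
# Hypothetical sketch, not the vprr implementation.
# Assumption: an "empty" file is one with a size of zero bytes.
handle_empty_files <- function(dir, del = FALSE) {
  files <- list.files(dir, full.names = TRUE)
  files <- files[!dir.exists(files)]  # ignore sub-folders
  sizes <- file.size(files)
  empty <- files[!is.na(sizes) & sizes == 0]

  if (length(empty) > 0) {
    if (del) {
      # Deletion happens only after the whole folder has been scanned
      file.remove(empty)
    } else {
      warning(length(empty), " empty file(s) found in ", dir,
              "; none were deleted", call. = FALSE)
    }
  }
  # Return the names either way so they can be inspected afterwards
  invisible(empty)
}
```

With `del = FALSE` the directory is left untouched, so a later error in the other checks cannot leave the folders half-modified.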

Let me know your thoughts @kevinsorochan - if this is easier to talk out verbally we can schedule a call and talk about this and other issues :)

kevinsorochan commented 2 years ago

  1. Was there a case where non-empty files got deleted? If so could you attempt to describe the file? (or even send an example?) So I can modify the function so it does not delete any non-empty files.

No, only empty files were deleted.

  2. The current function has a WARNING in the help doc ("WARNING: This function will delete empty aid and aidmeas files, permanently changing your directory. Consider making a back up copy before running this function."). Do you think it would be helpful to include a warning and confirmation step within the function? So when you run the function you would have to manually input 'Y' and confirm the deletion of files? Or would this be too much of a hassle?

Probably a hassle, but see my response to point 3.

  3. We could put in an argument that allows a user to not delete any files and instead just outputs warnings about empty files?

It might be preferable to store the empty file names, then have an option to delete them after all checks are done.

EOGrady21 commented 2 years ago

Ok so we could provide an option to store empty file names rather than have the function automatically delete them. I would rather avoid another function which requires an input during processing, CRAN really doesn't like that type of function. I'll work up the code. I think the best option would be to output an object with a list of empty file names, then the user can run their own checks/ delete the files through R or manually.
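Something along these lines is what I have in mind, as a minimal sketch only. `find_empty_files()` is a hypothetical name rather than an existing vprr function, the paths in the usage comments are made up, and "empty" is assumed to mean zero bytes.

```r
# Hypothetical sketch: collect the names of empty files without deleting them.
# Assumption: an "empty" file is one with a size of zero bytes.
find_empty_files <- function(dirs) {
  # recursive = TRUE returns files only, so sub-folders are not listed
  files <- list.files(dirs, full.names = TRUE, recursive = TRUE)
  sizes <- file.size(files)
  files[!is.na(sizes) & sizes == 0]
}

# Usage (assumed folder names): inspect first, delete later if desired.
# empty_files <- find_empty_files(c("autoid/aid", "autoid/aidmea"))
# print(empty_files)
# file.remove(empty_files)
```

Because nothing is removed inside the function, the deletion becomes an explicit step the user takes once all the other checks have finished.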

kevinsorochan commented 2 years ago

OK, that seems like a good solution.

-- Kevin Sorochan PhD Aquatic Biologist Bedford Institute of Oceanography Fisheries and Oceans Canada

EOGrady21 commented 2 years ago

I had forgotten that this function already outputs a data log, so the empty files are listed in the log output if the del argument is set to FALSE. It is a little annoying to then delete the files manually: you won't have an object with the names listed, so you would have to either find them in the file explorer one by one or write a grep-style function to pull the names from the log and delete them through R. It does allow for a more thorough check if the user requires one.