tjohansen14 / diskscrub

Automatically exported from code.google.com/p/diskscrub
GNU General Public License v2.0
0 stars 0 forks source link

scrub: MultiFile processing and enhancements: not a bug #30

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Proposed version 2.6.0

See NEWS file.
* Handle Multiple Devices/Files from Command Line including wildcards.
  [Peter Hyman]

* Code Cleanup in scrub.c. [Peter Hyman]

* Update -D (wipe directory entry and rename) processing to avoid name
  collisions when scrubbing multiple files. [Peter Hyman]

* Fix pat2str and strtomem functions in pattern.c.
  pat2str would automatically truncate pattern to 4 bytes maximum. This
  would impact display of custom string > 4 bytes.
  strtomem incorrectly handles Octal numbers when Octal value > \377
  255 decimal. These would require 2 bytes and an additional conversion
  step is required. [Peter Hyman]

* Improve error handling to avoid most hard exits with multiple files
  when parsing command line. [Peter Hyman]

* Add DEBUG hooks. Compile with CFLAGS="-DDEBUG ..." and all scrub code
  will be skipped. Useful for logic checking. DEBUG output will be the
  command and variables passed to scrub function. [Peter Hyman]

* Manpage update [Peter Hyman]

Unit Test suite also included (does not test devices. those must be done 
manually) Log and Error log output included.

Original issue reported on code.google.com by pete4...@gmail.com on 12 Aug 2014 at 2:35

Attachments:

GoogleCodeExporter commented 8 years ago
Peter,

Thanks for this!

It would be good if you could rework these patches with the following changes

- integrate unit tests into the test/ subdir (using the same style)
- eliminate #ifdef DEBUG code
- avoid changing anything unrelated to the specific functionality you are 
adding; in other words, minimize the size of the patch to the bare minimum.  
Reorganizing, or re-spacing code makes it hard to see the substantive changes 
vs the no-ops.
- make sure you are using the same style as the code around you (I think you 
mostly are)
- is it possible to move the inner part of your loop to another function? main 
is getting long.
- minimize man page changes for this - I would just add a [file ...] to the 
usage at the top (I guess for special files too?) and that's it.  btw, you 
don't have to call out shell expansion or wildcards.

Jim

Original comment by garlick....@gmail.com on 16 Aug 2014 at 9:54

GoogleCodeExporter commented 8 years ago
scrub.c re-edited. I undid all the cosmetic and readability improvements (I 
disagree with this. It has to be done. If not now, someday). Removed Debug. 
Reordered variable declarations in main so as not diff existing. re-added 
spurious \ in GPL block. DEBUG removed.

Original comment by pete4...@gmail.com on 17 Aug 2014 at 1:49

Attachments:

GoogleCodeExporter commented 8 years ago
Hey, that's better but there's still multiple items in there that should be
split into multiple commits.  Are you using git for your development?  git 
rebase makes editing patch sets so easy.

I've got a 'multifile' branch going on my laptop which I'll push to google code 
after I get done with some family stuff.  It moves the code you're enclosing in 
a loop to a separate function and hopefully will show you what I mean about 
breaking up commits.  Maybe if you can rebase your work on top of that we'll 
end up with a concise set of changes that will be easier to scrutinize.

I don't like the -D handling in here and I would advocate for deferring that 
for a later patch.  Will push changes in a couple hours

Original comment by garlick....@gmail.com on 17 Aug 2014 at 5:46

GoogleCodeExporter commented 8 years ago
scrub.c was modified only to handle multiple files. There is nothing to split 
up. You can't patch one section without the others. I do not want to move the 
loop code to another function since variable would have to either be declaired 
outside main or passed by address adding risk and additional test time. I 
pulled git but did not do anything other than patch the code. I did not want my 
own branch, nor did I feel necessary since this patch was quite specific. -D 
handling has always been deficient, but should be left out of this batch of 
code. The parts for it I inserted were necessary to avoid name collisions when 
files are not deleted.

Original comment by pete4...@gmail.com on 17 Aug 2014 at 7:09

GoogleCodeExporter commented 8 years ago
Peter, rather than make you redo your patch when you seem reluctant, I went 
ahead and reworked it myself and the results are on the branch called 
'multifile'.
Could you review my changes and see if they look ok to you?

The change set on that branch cover the scrub mainline and the man page.
Unit tests still need to be integrated into the 'make check' infrastructure.
I'll leave that to you if you're up for it.

I did factor out the inner part of that loop, and I try to find any problem
with the list of files before beginning any scrubbing.  The patches are split
into the sort of incremental steps I was asking for.

Original comment by garlick....@gmail.com on 17 Aug 2014 at 10:35

GoogleCodeExporter commented 8 years ago
The git runes to have a look would be

git fetch origin
git checkout multifile
git log # list commits
git show xyzq  # show commit as patch (xyzq are first 4 chars of SHA1)
git format-patch 022f   # creates patch series for this change set (as follows)

0001-issue-28-handle-incorrect-octal-escapes.patch
0002-issue-27-dont-truncate-custom-pattern-in-display.patch
0003-multifile-factor-scrub_object-out-of-main.patch
0004-multifile-allow-more-than-one-file-argument.patch
0005-multifile-put-command-opts-in-a-struct.patch
0006-multifile-sanity-check-targets-before-scrubbing.patch
0007-multifile-update-man-page.patch

Original comment by garlick....@gmail.com on 17 Aug 2014 at 10:44

GoogleCodeExporter commented 8 years ago
Jim. You reverted to hard exits in the scrub_object function. This negates the 
benefits of multi-file. Errors are not counted and I believe -D can and should 
be used with more than one file. I agree with the structure setup, but the rest 
I am having a hard time with. A lot of my sanity checking was removed. I need 
much time to study this more and test.

Original comment by pete4...@gmail.com on 18 Aug 2014 at 12:59

GoogleCodeExporter commented 8 years ago
Would a --error-continue flag or something like that address your worries about 
hard exit?

This would only apply to I/O errors and such that occur during scrubbing.
Note the two passes - first pass scans for "sanity" type errors and reports 
them all.
Second pass is only executed if first pass finds no errors.  This is where 
scrubbing occurs.

Original comment by garlick....@gmail.com on 18 Aug 2014 at 2:08

GoogleCodeExporter commented 8 years ago
Propose two changes.
1: change short option for --dry-run from -d to -n for patch compatibility
2: emit custom string at program start

I could not push.

Original comment by pete4...@gmail.com on 19 Aug 2014 at 2:36

Attachments:

GoogleCodeExporter commented 8 years ago
Stupid omission in OPTIONS list and main switch block

Original comment by pete4...@gmail.com on 19 Aug 2014 at 3:17

Attachments:

GoogleCodeExporter commented 8 years ago
Sounds good, adding that.

Original comment by garlick....@gmail.com on 19 Aug 2014 at 4:57

GoogleCodeExporter commented 8 years ago
Update to manpage to account for Octal handling >\377.

Original comment by pete4...@gmail.com on 19 Aug 2014 at 6:10

Attachments:

GoogleCodeExporter commented 8 years ago
Seems like a slippery slope, what about \#ffff or \#gh?
I did change the custom long description to:
  custom="str" 16 chr max, C esc like \r, \xFF, \377, \\
and (per your other change) we now show the whole custom seq in the output.

Original comment by garlick....@gmail.com on 19 Aug 2014 at 6:19

GoogleCodeExporter commented 8 years ago
\# is not handled or parsed. If you recall way back, I devised a different 
custom method. -a for ascii and -x for hex. This would restrict input based on 
either ascii (which could include escaped chars) or hex only. Adding Octal was 
cool, but unneeded since it is converted to hex anyway.

Come to think of it, the amount of time we're spending on it probably augurs 
for its removal.

Original comment by pete4...@gmail.com on 19 Aug 2014 at 6:25

GoogleCodeExporter commented 8 years ago
small suggestions on freespace and single file scrubbing.
-X and -D should not be fatal when used together.
eliminate single file scrub handling block.

Original comment by pete4...@gmail.com on 20 Aug 2014 at 2:01

Attachments:

GoogleCodeExporter commented 8 years ago
I think it's preferable to make invalid arguments fatal.
If we go scrubbing, we may be doing something the user didn't intend
because they got the command line wrong.  There's a high potential
for unintended consequences with this utility so fatal seems safer to me.

Regarding the other bit, I think we've already been over it.
Your change makes the code a tiny bit neater but has us checking
for errors twice for no gain.  Let's leave it as is.

Original comment by garlick....@gmail.com on 20 Aug 2014 at 11:07

GoogleCodeExporter commented 8 years ago
Has us checking for errors twice in the single target case I meant to say

Original comment by garlick....@gmail.com on 20 Aug 2014 at 11:07

GoogleCodeExporter commented 8 years ago
feature implemented in 2.6.0

Original comment by garlick....@gmail.com on 22 Aug 2014 at 4:08