tjscience / RoboSharp

RoboSharp is a .NET wrapper for the awesome Robocopy windows application.
MIT License
216 stars 67 forks source link

Progress Estimator logic needs fixing for handling of switches such as /IS #148

Closed PCAssistSoftware closed 2 years ago

PCAssistSoftware commented 2 years ago

Initially discussed here - https://github.com/tjscience/RoboSharp/pull/147#issuecomment-1030865157

Key points of discussion copied below - will start working on PR this week.


I think changes / improvements may be needed to log parsing for estimator as just been reading some RoboCopy documentation and doing some tests with switches such as /IS (include same)

Log shows same (all in lower case) normally, but if using /IS then the S is capitalized (Same) - we only handle it correctly if lowercase (according to what I see in RoboSharpConfiguration.cs) - so this affects values for copied / skipped.

Uppercase S means copied (e.g. using /IS), lowercase s means skipped (default behaviour)

N.B. same logic possibly applies to other switches such as /XO (exclude older) and /XN (exclude newer) as we only handle Newer / Older and not newer / older.

Still looking at /IT (include tweaked), /XC (exclude changed) as these can return Tweaked / tweaked or Changed / changed but not sure how (or if) we handle these currently - but certainly dependent on switches and upper/lower case first letter would make difference to results being correctly counted as copied or skipped.

Really handy reference to all the tags here - https://theether.net/download/Microsoft/Utilities/robocopy.pdf - is for older version of RoboCopy but just tested and all still relevant to current version - see page 30.

Also shows lots of tags I never even knew about which again we don't handle but could affect results shown e.g. large, small, too new, too old, attrib when using switches such as /MIN, /MAX, /MINAGE, /MAXAGE, /XA

Each line in the output log begins with a brief text tag, which is formatted according to the following rules:

· All capital letters indicate an anomaly that should be investigated. · Initial capital letters indicate a file that was selected for copying. · All lowercase letters indicate a file that was skipped (displayed only if the /V switch is used)

PCAssistSoftware commented 2 years ago

Response from @RFBomb

This should only Affect ListOnly operations, as OnCopyProgressChanged lets PE know if a file has started/finished copying regardless of what it identified as for XO/XN/Same/etc.

I've provided the base logic for it, if you or someone else wants to expand on that pattern to include all the possible tags (which I think is overkill unless we also include the robocopy language files to identify the tags), the. That's fine. But expanding past what I've currently provided for that is something I'm not interested in doing, since it will primarily affect only ListOnly operations.

Plus, it may already be handled:

if (currentFile.FileClass.Equals(Config.LogParsing_NewerFile, StringComparison.CurrentCultureIgnoreCase))
                    {
                        if (command.SelectionOptions.ExcludeNewer)
                            PerformByteCalc(currentFile, WhereToAdd.Skipped);
                        else
                            PerformByteCalc(currentFile, WhereToAdd.Copied);

Response from @PCAssistSoftware

From the tests I have done it doesn't just affect ListOnly - see figures below comparing with /IS and without /IS - results are fine, just progress estimator which is wrong

image

Also a more immediate issue when using /IS for some reason in the OnCopyProgressChanged event e.CurrentFile.Name is nothing despite file obviously being copied as shows CurrentFileProgress

image

Response from @RFBomb

Ok then update PE AddFile method to make it work.

Response from @PCAssistSoftware

Okay will look this week and see what I can do, may as well try and cover all bases, no point offering the switches in GUI if it then breaks parts the estimator.

Any comment on second issue in last post as that seems a more serious problem potentially?

Response from @RFBomb

That's because in PE it's being set to null because the check for 'same' doesn't follow the same pattern as XN and XO. If you look at the file, the issue should be pretty easy to see and update to a similar pattern

(Fixing one should fix both - same root cause)

Response from @PCAssistSoftware

That's because in PE it's being set to null because the check for 'same' doesn't follow the same pattern as XN and XO. If you look at the file, the issue should be pretty easy to see and update to a similar pattern

Okay will spend some time on it this week, may come back for some advice if I get stuck, but more than happy to have a go

(Fixing one should fix both - same root cause)

That's what I thought - good

RFBomb commented 2 years ago

I believe all these checks should fall under the if (!CopyOperation) logic.

If not a 'new file', it should assume it will skip the file. OnCopyProgressChanged will call either CopyStarted or AddFileCopied, if the file starts/finishes copying, correcting the 'skip assumption' before the count is calculated.

This means that all these upper/lower case flags should only really evaluated when doing a ListOnly operation, since otherwise the logic noted above regarding the assumption should catch everything as far as skip/copied is concerned.

Currently, it evaluates the tokens by ignoring case sensitivity, then doing a secondary check on the actual setting to determine copied or not.

PCAssistSoftware commented 2 years ago

@RFBomb

Initially working on Same / IS - as need to do further testing for Newer / Older etc because as you say that may already be covered.

Changes made in #149 seem to work for CopyOperation and also solves the issue with e.CurrentFile.Name being nothing but then breaks progess estimator figures for !CopyOperation (ListOnly) so I am obviously going wrong

P.S. Never done a PR myself so bear with me here, but really want to learn how to fix this myself :) with some gentle nudges along the way as and when you have the time

PCAssistSoftware commented 2 years ago

Tried various alternatives to current PR but still no joy, it works for copy and fixes e.CurrentFile.Name being nothing, but ListOnly figures then 1 file out in its calculations and I cannot see why

I think Same is confusing more than any other of the potential changes becuase it is currently handled separately (//Identical Files) and also I assume needs to be in else if (!CopyOperation) section - even tried removing first entry and it works but is then as I say 1 file out in calculations and not obvious why as log output looks the same

Will continue to play, but any tips appreciated (but only when you have time)

EDIT: Removed all changes relating to IncludeSame (/IS) for now and concentrated on below changes

PCAssistSoftware commented 2 years ago

I can confirm SelectionOptions.ExcludeNewer (/XN) and SelectionOptions.ExcludeOlder (/XO) are both already handled correctly - all results in estimator tested and are correct

PCAssistSoftware commented 2 years ago

SelectionOptions.ExcludeAttributes (/XA) and SelectionOptions.IncludeAttributes (/IA) appear to be fixed with below:-

https://github.com/tjscience/RoboSharp/blob/a389038921326434884e0d97d396b73eeeab9e16/RoboSharp/Results/ProgressEstimator.cs#L310-L313

https://github.com/tjscience/RoboSharp/blob/a389038921326434884e0d97d396b73eeeab9e16/RoboSharp/RoboSharpConfiguration.cs#L192-L200

PCAssistSoftware commented 2 years ago

SelectionOptions.MaxFileSize (/MAX), SelectionOptions.MinFileSize (/MIN), SelectionOptions.MaxFileAge (/MAXAGE), SelectionOptions.MinFileAge (/MINAGE), SelectionOptions.MaxLastAccessDate (/MAXLAD), SelectionOptions.MinLastAccessDate (/MINLAD) appear to be fixed with below:-

https://github.com/tjscience/RoboSharp/blob/a267017b524afd7e6a848c595b6ae19bcb56aee8/RoboSharp/Results/ProgressEstimator.cs#L314-L329

https://github.com/tjscience/RoboSharp/blob/a267017b524afd7e6a848c595b6ae19bcb56aee8/RoboSharp/RoboSharpConfiguration.cs#L202-L240

For info - this utility proved really handy in manipulating file created, modified and accessed dates/times for testing - https://www.nirsoft.net/utils/bulk_file_changer.html

PCAssistSoftware commented 2 years ago

ExcludeChanged (/XC) appear to be fixed with below:-

https://github.com/tjscience/RoboSharp/blob/c04c00d9fb5be6486a68cdfc9c1373c781fb1dde/RoboSharp/Results/ProgressEstimator.cs#L330-L336

https://github.com/tjscience/RoboSharp/blob/c04c00d9fb5be6486a68cdfc9c1373c781fb1dde/RoboSharp/RoboSharpConfiguration.cs#L242-L250

PCAssistSoftware commented 2 years ago

Still to do:-

/IS (include same) /IT (include tweaked)

PCAssistSoftware commented 2 years ago

@RFBomb - before I go too much further I would appreciate your looking over and commenting on changes made to see if seems okay so far?

PCAssistSoftware commented 2 years ago

Thanks @RFBomb - changes made and works fine for listonly, but same issue I was getting when playing last night and this morning where 1 file out for copy - showing 1 file as skipped and total = 4 rather than 5

image

You should have permissions now for commit - hopefully if I clicked right buttons....

RFBomb commented 2 years ago

I bet the first 0kb file is the one that ended up as 'skipped', and since OnCopyProgressChanged never occured for the second one, it was not counted.

Just made a commit

PCAssistSoftware commented 2 years ago

I bet the first 0kb file is the one that ended up as 'skipped', and since OnCopyProgressChanged never occured for the second one, it was not counted.

Just made a commit

Trust it to be my annoying 0kb test files!!!! ;)

Unfortunately still showing wrong - see below - getting 1 skipped from somewhere even though none were skipped

image

PCAssistSoftware commented 2 years ago

What is the easiest way for me to see which file is it picking up as skipped?

I have attached my source files I am using to do this test so at least we are using same files to solve it

same.zip

RFBomb commented 2 years ago

breakpoint Line 400 of PerformByteCalc()

In this case though, its easy because its your 7-byte file countes as both copied and skipped

PCAssistSoftware commented 2 years ago

breakpoint Line 400 of PerformByteCalc()

In this case though, its easy because its your 7-byte file

That was spooky - I was literally then adding a console.writeline to show me the name to line 400 as this email came in!!

It is as you said the 7-byte file called 8.rtf

RFBomb commented 2 years ago

Just did another commit, maybe its fixed (hopefully doesn't break anything!)

PCAssistSoftware commented 2 years ago

Just did another commit, maybe its fixed (hopefully doesn't break anything!)

That fixed it - but odd behaviour if I remove SelectionOptions.IncludeSame = True

See below with IncludeSame not being used, the difference between ListOnly and Run for same job - bytes showing different.

image

PCAssistSoftware commented 2 years ago

N.B. it wasn't the last change that broke the bytes as just tried reverting it and it is same, so not sure

Sorry to be a pain

RFBomb commented 2 years ago

Try that. I moved //IdenticalFiles back outside the 'ListOnly' operation

PCAssistSoftware commented 2 years ago

IncludeTweaked (/IT) appears to be fixed with below:-

https://github.com/tjscience/RoboSharp/blob/a013dcb09b9faee4caefe268766e0b46e256af3a/RoboSharp/Results/ProgressEstimator.cs#L339-L345

https://github.com/tjscience/RoboSharp/blob/a013dcb09b9faee4caefe268766e0b46e256af3a/RoboSharp/RoboSharpConfiguration.cs#L252-L260

PCAssistSoftware commented 2 years ago

Try that. I moved //IdenticalFiles back outside the 'ListOnly' operation

That seems to have fixed it :)

Just added IncludeTweaked as well which was the last one I had on my list to add I think, so progress estimator should now work correctly for:-

ExcludeAttributes /XA IncludeAttributes /IA MaxFileSize /MAX MinFileSize /MIN MaxFileAge /MAXAGE MinFileAge /MINAGE MaxLastAccessDate /MAXLAD MinLastAccessDate /MINLAD ExcludeChanged /XC IncludeTweaked /IT IncludeSame /IS

PCAssistSoftware commented 2 years ago

And I believe now covers all possible file and directory tags as per page 30 onwards of https://theether.net/download/Microsoft/Utilities/robocopy.pdf

Apart from lonely - but couldn't see the point for adding that??!?!....

The file exists in the source but not in the destination (a Lonely file).The file is skipped; to copy this file, omit /XL.

RFBomb commented 2 years ago

So, that would be use if you have say a source and destination with 20 files each that they share.

For arguments sake, lets say they share 1-18. SOURCE has 19.txt & 20.txt, while Dest has 21.txt and 22.txt.

21.txt and 22.txt would be 'EXTRA' since they exist in the destination, but not in source. 19.txt and 20.txt would be 'LONELY' since they exist in source, but not in destination.

Normally, 1-20 would be copied into the destination. But with the /XL switch, 19 and 20 are ignored, and instead only 1-18 are copied over, since those are the files that are shared between source and destination.

I believe that would mean they are added to the 'extra' count? I'm not sure. or maybe the 'mismatch' count

PCAssistSoftware commented 2 years ago

That make sense.

I will do some more testing tonight or tomorrow on all the new switches supported just to ensure running as it should be, and unless you can think of anything else I will then merge?

PCAssistSoftware commented 2 years ago

Update:

I have been through each and every combination of the switches above that is now supported in progress estimator, I have run with switches true, false and set to various sizes / dates etc and all results showing in RoboSharp are the same as what I am getting on the command line running RoboCopy!! And to make sure I have gone through each test multiple times as I really don't want to break any of your hard work!......

Out of interest I have also been monitoring memory / CPU usage during all these tests and that is also behaving very well.

I then went back and repeated all the original tests we were doing a few days ago just to make sure the changes hadn't broken anything and I do have one little issue though.

Using the files attached as a source - just running one single source/destination with RoboQueue and the figures for bytes are wrong during copy when specifying files to exclude. Same job without using excluding the files is fine. And figures doing list only is fine.

Job settings:

job2 = New RoboCommand
job2.CopyOptions.Source = "C:\Users\darre\Desktop\1c"
job2.CopyOptions.Destination = "C:\Users\darre\Desktop\2"
job2.CopyOptions.CopySubdirectoriesIncludingEmpty = True
job2.LoggingOptions.VerboseOutput = True
job2.Name = "JOB2"

job2.SelectionOptions.ExcludedFiles.Add("myfile.txt")
job2.SelectionOptions.ExcludedFiles.Add("my file.txt")

image

source.zip

RFBomb commented 2 years ago

that looks like '5.pdf' got passed into PerformByteCalc instead of one of the excluded files during the COPY operation.

That, or else if (SkippingFile && CurrentFile != null) on line 169 was TRUE.

Can you run your test again later, breakpoint 166 and 171 and see if those are hit. (They shouldn't be hit if PerformByteCalc was already performed against the file)

PCAssistSoftware commented 2 years ago

Both below were FALSE

image

RFBomb commented 2 years ago

The 'copy' test shows that the two 3b files were ignored for the calculation, and the 7kb file plus the '5.pdf' file were used for the math instead.

Which is really curious.

RFBomb commented 2 years ago

I cannot duplicate. I can.

EventLogs.zip

PCAssistSoftware commented 2 years ago

It is the results showing in the estimator which are wrong for copy NOT the results showing in the list-only results / run results section

image

image

PCAssistSoftware commented 2 years ago

Just noticed you had edited your last response just as I was posting this!...

RFBomb commented 2 years ago

Yea, I was looking at results, not estimator lol Fixed the typo that caused it to look at incorrect variable. CurrentFile != currentFile

PCAssistSoftware commented 2 years ago

Good spot - so easy to miss - confirm it is now working correctly for me :)

Good idea to add checkbox to enable debugging files

If you are completely happy with the changes made in this PR then I will merge it shortly?

RFBomb commented 2 years ago

As I don't really intend to use progress estimator ( I just wanted it to get something from cancelled jobs since I have some very large file operations over slow connections to perform that users are likely to cancel), and I don't intend to use the more advanced switches, I trust your testing on them.

I definitely do not expect people to come in and define all those tags themselves for the other languages (if they even change for other languages). But if they do change, and support is requested, the language files can always be looked at, or they can use the 'DefaultConfigurations' setup I've mocked in for setting up those alternate language defaults.

As such, I'm happy with the changes. I'm also glad that we were able to do more testing and really tweak in the performance side of things, since I know my first iterations on it were somewhat taxing (but I didn't know how to do it better, as it was my first time really dealing with something threaded that heavily).

The DateTime stuff you requested (and I agreed with) is also ready for you to test and take a crack at. It wound up being a bit more intensive changes than I had originally anticipated to do it in a fashion I thought was 'correct' and 'normalized'. I wrote it all up, but have done 0 testing against it

PCAssistSoftware commented 2 years ago

As I don't really intend to use progress estimator ( I just wanted it to get something from cancelled jobs since I have some very large file operations over slow connections to perform that users are likely to cancel), and I don't intend to use the more advanced switches, I trust your testing on them.

That's fair enough. Yes I have tested and tested again all the advanced switches I have added support for and all results tally exactly with what I see from doing same job in command line with RoboCopy, so think I am happy with that.

I definitely do not expect people to come in and define all those tags themselves for the other languages (if they even change for other languages). But if they do change, and support is requested, the language files can always be looked at, or they can use the 'DefaultConfigurations' setup I've mocked in for setting up those alternate language defaults.

Agreed

As such, I'm happy with the changes. I'm also glad that we were able to do more testing and really tweak in the performance side of things, since I know my first iterations on it were somewhat taxing (but I didn't know how to do it better, as it was my first time really dealing with something threaded that heavily).

Definitely worth it as runs so much better now :)

The DateTime stuff you requested (and I agreed with) is also ready for you to test and take a crack at. It wound up being a bit more intensive changes than I had originally anticipated to do it in a fashion I thought was 'correct' and 'normalized'. I wrote it all up, but have done 0 testing against it

Okay will merge this one and then look at testing the DateTime PR