Closed ross-spencer closed 8 years ago
Thanks for reporting Ross.
There's an unchecked error there which I can tidy up to avoid the panic. But the underlying error may be difficult to address, it looks to be this: https://github.com/golang/go/issues/3358
Feedback from the error will help others and prevent them from having to chase it down. It seems we'll just have to work around the other issue outside of SF.
(Unless there was a way to simulate jumping into each directory one by one, referencing the next relatively instead of absolutely to ask isDir() so that we're never asking the call to look at a directory path as long? - I would have to understand what the SDK is doing better - I imagine the Go code base would already have considered that if possible)
there should be a workaround ... the docker devs ran into this and made this pkg: https://github.com/docker/docker/tree/master/pkg/longpath Can probably do something similar
fixed https://github.com/richardlehane/siegfried/commit/06f73f3b6df59131e81042ac11c9a3fd20e42f0b
(on os errors: try again with longpath code from docker repo)
Tested, and the it works:
filename : 'folder-test\cdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuv
wxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopq\test.txt'
I'm wondering what the impact is on that tip in your SF recipes for piping a list of files to STD out. Without the drive letter does it reduce the number of potential uses like that in Windows?
hmmm... this should be fine so long as you don't change directories between commands I think? I.e. if it is a relative path and you are still in the right directory then that relative path will still resolve.
I was thinking about this the other day: is it better to return the filenames as passed to sf (relative or absolute) or should filenames all get turned into absolute paths (with drive letter etc.).
At the moment users get control over what kind of filenames get returned.
E.g. say I am in my desktop path I can either do:
sf cool_file.txt
and "filename: cool_file.txt" comes back Or:
sf c:\Users\richardl\Desktop\cool_file.txt
and "filename: c:\Users\richardl\Desktop\cool_file.txt" comes back Or even:
sf %cd%\cool_file.txt
which also gives "filename: c:\Users\richardl\Desktop\cool_file.txt"
The advantage of the way it presently is is that there might be legitimate reasons not to show the file system context your files are in (e.g. because you are processing a consignment on a lab PC and the meaningful context is the relationships between the files in a consignment folder and not the position of the consignment folder on the lab PC).
Disadvantage is that many users may need absolute paths for further processing work but not want to jump through hoops like typing full paths or using that %cd% trick to get there.
Options:
OK ... I combined options 1) and 4).
Have added a tip and trick (https://github.com/richardlehane/siegfried/wiki/Tips-and-Tricks)
Also changed the logging output so the logging features always return absolute paths https://github.com/richardlehane/siegfried/commit/7eaa3d011fc841d755d0ed2496c7be44ceafe280
Thanks Richard. I wouldn't have been able to answer straight off. I think I will need to work through it with SF a little more and understand where the impacts are/how to work with it. I'm sure a combination of one and four will work well.
I'm seeing that long path syntax in output now, e.g.
\\?\E:\digital_archeology\sf\...\...\...
Should (can?) SF maintain a record of the original path before the IsFolder() check?
Hi Ross Where you see the long path in output ... are you only seeing it for long file paths, or everywhere? This should relate to a second fix I applied to recover from long directory paths (the first fix only worked for file paths). It may be a little tricky to restore original paths post this fix but I will take a look
In this case it seems to be in the final report (DROID output from SF). As I haven't been able to test this more rigorously this week, I am happy to rebuild SF at the beginning of next week and provide you with more information. It doesn't seem to be a long path, e.g. \?\E:\digital_archeology\sf###\ARCHIVES\REVIEWS##################92.DOC
On Thu, Dec 10, 2015 at 2:10 PM, Richard Lehane notifications@github.com wrote:
Hi Ross Where you see the long path in output ... are you only seeing it for long file paths, or everywhere? This should relate to a second fix I applied to recover from long directory paths (the first fix only worked for file paths). It may be a little tricky to restore original paths post this fix but I will take a look
— Reply to this email directly or view it on GitHub https://github.com/richardlehane/siegfried/issues/58#issuecomment-163454677 .
You might get this for a short path simply because sf will try a long path as a second attempt when it encounters any error. If the original error was something for which a simple retry might work (e.g. bandwidth related), then you may get long paths in your results. This would especially show up where you are throttling - and a longer throttle may eliminate it.
A fix might be to try to identify the exact error codes for path length errors and only try the longpath fix for those. In your case, this would mean you'd get a fatal error here instead (because the retry would never have happened). Or could keep the retry with long paths for any error, since it seems to have an unplanned nice side effect of recovering from other kinds of failures!, but add a new function to restore short paths.
Hi Ross I pushed a change on develop branch to try to restore original paths if the long path fix is applied. I may work on it further as it is all a bit of a hairy mess now but would be interested to hear if it resolves the latest issues you've been encountering.
Hi Richard.
I haven't been able to re-create this reliably since my return. I have been able to run the develop branch fix on the same source files and the files-paths with \?\ in them look correct, that is they no longer have the \?.
I am running the same set against the master branch properly overnight and should have the results tomorrow.
Just to keep you informed.
It does look like you will be able to close this issue out.
Thanks,
Ross
thx Ross - happy to reopen it if it recurs
Given a structure; /e/DC_Spencer/folder-test/ab/cdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopq
With a file in it: abcd.txt
SF will:
panic: runtime error: invalid memory address or nil pointer dereference [signal 0xc0000005 code=0x0 addr=0x14 pc=0x41087e]
goroutine 1 [running]: main.multiIdentifyS.func1(0x12938c30, 0xeb, 0x0, 0x0, 0x330602e0, 0x1292c0a0, 0x0, 0x0) C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:123 +0x1be path/filepath.walk(0x129384b0, 0xe2, 0xca4558, 0x12d541e0, 0x12921c38, 0x0, 0x0) c:/go/src/path/filepath/path.go:370 +0x33d path/filepath.walk(0x128c21a0, 0x2, 0xca4558, 0x12d540f0, 0x12921c38, 0x0, 0x0) c:/go/src/path/filepath/path.go:374 +0x3f9 path/filepath.Walk(0x128c21a0, 0x2, 0x12921c38, 0x0, 0x0) c:/go/src/path/filepath/path.go:396 +0xb7 main.multiIdentifyS(0x330601c8, 0x12d76008, 0x12d54050, 0x128c21a0, 0x2, 0x0, 0x0, 0x0) C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:135 +0x87 main.main() C:/Source/git/go/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:344 +0x1e1e
If you shorten that path by removing the 'ab\' directory then it works. Length 258 characters vs. 260.
Max path length specified here: https://msdn.microsoft.com/en-nz/library/windows/desktop/aa365247(v=vs.85).aspx
We have a real example I can't provide as its structure describes the content of the un-accessioned files we're working with.
Directory lengths like this can be created by transferring from Linux using Rsync or in Windows by using Cygwin to mkdir.