xenon-middleware / xenon

A middleware abstraction library that provides a simple programming interface to various compute and storage resources.
http://xenon-middleware.github.io/xenon/
Apache License 2.0

Local adaptor fails on windows #185

Closed jmaassen closed 11 years ago

jmaassen commented 11 years ago

Just before creating a 1.0-rc1 release I remembered to have a look at the failing Windows test running on Jenkins (why wasn't there an issue about that? ;-)

I expected it to be just a simple "wrong slash" problem, but it is actually a more fundamental API problem that I would like some feedback on....

Currently we have the following classes:

FileSystem represents a file system. It has an entry path which shows you where you've entered the file system. Usually this location is special in some way (like "/home/jason").

Pathname represents a sequence of path elements and a separator, for example ("dir", "file" + "/"). It is a convenience class allowing you to manipulate paths. You can turn the Pathname into a string containing the relative path "dir/file" or absolute path "/dir/file".

Path is a FileSystem+Pathname that really points to some bits on some storage somewhere.

What happens inside octopus is that the FileSystem typically contains the information needed to connect to some storage ("local:/" or "sftp:/machine") and the Pathname is expected to contain an absolute path on that storage. So FileSystem("sftp", "machine") and Pathname("dir", "file" + "/") turn into "sftp://machine/dir/file".
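The composition described above can be sketched roughly as follows. The class and method names here are hypothetical stand-ins, not the actual octopus API; the point is only to show where the "/" root sneaks in.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the real octopus classes.
final class ImplicitRootSketch {

    // Joins the path elements with the separator and then prefixes the separator.
    // That prefix is the step that silently treats the separator as the root.
    static String absolutePath(List<String> elements, String separator) {
        return separator + String.join(separator, elements);
    }

    // Combines the FileSystem part (scheme and host) with the absolute path.
    static String toUriString(String scheme, String host, List<String> elements) {
        return scheme + "://" + host + absolutePath(elements, "/");
    }

    public static void main(String[] args) {
        // prints sftp://machine/dir/file
        System.out.println(toUriString("sftp", "machine", Arrays.asList("dir", "file")));
    }
}
```

Nothing in this sketch has a place to put "C:" or "\\server\volume", which is the gap discussed next.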

Now this works fine for Linux, SSH, SFTP, etc., but on Windows it all goes pear-shaped.

The problem is that in Linux the "root" of the file system is the same as the "separator" of the pathnames, namely "/". Therefore, we can create an absolute path by just sticking the separator in front of whatever a pathname contains. And since "/" is the only root there is in Linux this always works.

Simply put, we got away with making "/" the implicit root in FileSystem.

On Windows, however, you can really have multiple roots, like the C: we all know and love, but also things like \\server\volume, \\?\E:, \\?\server\volume, and even \\?\UNC\server\volume.

Now the problem is that this is hard to put into a Pathname because this would require some "special root element" besides the regular path elements and separator. In addition you must have this root element to turn a pathname into an absolute path. Also, the semantics become unclear when you resolve a pathname with a root against a filesystem that has a different root....

What makes more sense (a lot of sense actually) is to add the concept of a root to FileSystem. This would allow you to have a FileSystem(/) on Linux, and a FileSystem(C:) and/or FileSystem(//server/volume) on Windows. This also nicely corresponds to the idea of having remote file systems like FileSystem(sftp://machine).

A Pathname would then always contain a relative path that needs to be resolved against a FileSystem in order to be combined with its root to create a Path that actually points to some bytes on some storage somewhere...
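A minimal sketch of that idea, assuming a hypothetical RootedFileSystem class that owns the root and resolves relative path elements against it (none of these names exist in octopus):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the root lives in the file system, the pathname is always relative.
final class RootedFileSystemSketch {

    static final class RootedFileSystem {
        final String root; // for example "/", "C:" or "//server/volume"

        RootedFileSystem(String root) {
            this.root = root;
        }

        // Combines this file system's root with a relative sequence of path elements.
        String resolve(List<String> relativeElements) {
            String relative = String.join("/", relativeElements);
            return root.endsWith("/") ? root + relative : root + "/" + relative;
        }
    }

    public static void main(String[] args) {
        List<String> relative = Arrays.asList("dir", "file");
        System.out.println(new RootedFileSystem("/").resolve(relative));               // /dir/file
        System.out.println(new RootedFileSystem("C:").resolve(relative));              // C:/dir/file
        System.out.println(new RootedFileSystem("//server/volume").resolve(relative)); // //server/volume/dir/file
    }
}
```

The same relative elements end up in three different places depending on which file system they are resolved against.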

This seems to me to be a nice clean solution that is also conceptually "correct".

Any comments on this?

ps: This solution does imply that RelativePath is a better name than Pathname... Don't you just love it when you come full circle ;-)

jmaassen commented 11 years ago

Hmmm.. I've been fiddling with this for a while, but it seems things get quite complicated quite quickly.

Please ignore my question for now until I come up with something better ;-)

nielsdrost commented 11 years ago

It seems that under Windows the "drive letter" is encoded in the path part of a URI:

http://blogs.msdn.com/b/ie/archive/2006/12/06/file-uris-in-windows.aspx

So that would be an option. I suggest we let the drive letter be part of the FileSystem location, and use the path part of the URI to determine the drive letter (if specified).

The result would be:


// probably on C:, but you never know
FileSystem localhome = files.getLocalHomeFileSystem();

FileSystem usbFs = files.newFileSystem(new URI("file:///D:/"));

FileSystem cdromFs = files.newFileSystem(new URI("file:///E:/"));

Pathname cdromPathname = new Pathname("autorun.ini");

Path cdromFile = files.newPath(cdromFs, cdromPathname);

Not sure if this is exactly what you meant with your suggestion though ;-)
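For what it's worth, pulling the drive letter out of the path part of such a URI could look roughly like this. It uses only java.net.URI; the class and the driveRoot helper are made up for illustration and are not part of octopus.

```java
import java.net.URI;

// Hypothetical helper: reads the drive letter from the path part of a file URI.
final class DriveLetterFromUriSketch {

    // Returns a drive root such as "D:", or null when the path does not start with one.
    static String driveRoot(URI location) {
        String path = location.getPath(); // "/D:/" for file:///D:/
        if (path != null && path.length() >= 3
                && path.charAt(0) == '/'
                && isAsciiLetter(path.charAt(1))
                && path.charAt(2) == ':') {
            return path.substring(1, 3);
        }
        return null;
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    public static void main(String[] args) {
        System.out.println(driveRoot(URI.create("file:///D:/")));        // D:
        System.out.println(driveRoot(URI.create("file:///home/jason"))); // null
    }
}
```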

jmaassen commented 11 years ago

Well... the beauty and the problem of this approach is that the drive letter becomes "tied" to a specific FileSystem instance. This implies that the Pathname is always a "relative" thing that you need to resolve against a FileSystem to really point somewhere.

This all sounds great so far...

The problem is now what happens when you do things like:

FileSystem fs = files.newFileSystem(new URI("file:///C:/"), ...);
Path path = files.newPath(fs, new Pathname("test"));
Pathname name = path.getPathname();
String x = name.getPath();

What would you expect x to be? Probably "C:/test", but it will give you "test" instead, since the Pathname is only the relative part. The "C:" is part of the FileSystem, not the Pathname...

As an alternative we can include the drive letter in the Pathname instead of the FileSystem. This is similar to what is done in java.nio where a FileSystem can just point to the local fs, and allow you to create paths like "c:\test" or "e:\test" without having to create special filesystems for this.

This does imply that Pathname understands the difference between relative and absolute paths to some extent, and that it is aware that "C:" is special somehow. Otherwise the path manipulations could get very messy.
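To make that concrete, here is a rough sketch of a pathname that keeps an optional root next to its elements. It is a hypothetical class, not the code from the actual fix, and it only recognises "/" and plain drive letters; the UNC and \\?\ forms mentioned earlier are left out, and a drive-relative string like "C:test" is simply treated as rooted at "C:".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a pathname that knows about an optional root.
final class RootAwarePathnameSketch {
    final String root;           // "" when relative, otherwise "/" or a drive such as "C:"
    final List<String> elements; // path elements without separators

    RootAwarePathnameSketch(String root, List<String> elements) {
        this.root = root;
        this.elements = elements;
    }

    // Splits a string into an optional root and its path elements.
    static RootAwarePathnameSketch parse(String text) {
        String normalized = text.replace('\\', '/');
        String root = "";
        String rest = normalized;
        if (normalized.length() >= 2 && isAsciiLetter(normalized.charAt(0))
                && normalized.charAt(1) == ':') {
            root = normalized.substring(0, 2);
            rest = normalized.substring(2);
        } else if (normalized.startsWith("/")) {
            root = "/";
        }
        List<String> elements = new ArrayList<>();
        for (String part : rest.split("/")) {
            if (!part.isEmpty()) {
                elements.add(part);
            }
        }
        return new RootAwarePathnameSketch(root, elements);
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    boolean isAbsolute() {
        return !root.isEmpty();
    }

    // Renders the pathname with "/" as separator, including the root when there is one.
    String getPath() {
        String joined = String.join("/", elements);
        if (root.isEmpty()) {
            return joined;
        }
        return root.equals("/") ? "/" + joined : root + "/" + joined;
    }

    public static void main(String[] args) {
        System.out.println(parse("C:\\test").getPath()); // C:/test
        System.out.println(parse("test").isAbsolute());  // false
    }
}
```

With something like this, the getPath() call in the example above would return "C:/test" for a rooted pathname and "test" for a relative one.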

I've implemented this to see if this works and I'm trying to run all the tests under windows. Unfortunately, many of the tests sort of implicitly assume that the root is "/" instead of "C:" or "E:"...
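One possible way for tests to avoid hard-coding "/" is to ask the platform for its roots through java.nio. This is only a sketch of the idea, not what the tests actually do; the class and the firstRoot helper are invented for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;

// Hypothetical test helper: looks up a root instead of assuming "/".
final class PlatformRootSketch {

    // Returns the first root directory reported by the default file system.
    static String firstRoot() {
        for (Path root : FileSystems.getDefault().getRootDirectories()) {
            return root.toString();
        }
        throw new IllegalStateException("no root directories reported");
    }

    public static void main(String[] args) {
        // The value depends on the platform the test runs on.
        System.out.println("root under test: " + firstRoot());
    }
}
```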

jmaassen commented 11 years ago

Fixed in ccedc6a49f95cc611ae214c9306ea609c5acdff1