Closed regkrock closed 3 years ago
After Dale's comments on making sure that this will work for other platforms, I looked a bit at Windows file API. The number of possible options for file opening and creating in Windows is quite extensive, and rather different than those in Unix. Trying to put these into a hierarchy of all possible legal combinations would be large and inflexible. The available options change slowly, but they do change.
So I think it would be better to have #open:options: (and FileReference>>openWithOptions:) where the options argument is an object that composes options from the available options. Each option (and we probably have very few options until we replace GsFile with something more flexible) could be a class that is an immediate subclass of FileOpenOption or something like that. Regards, -Martin
When I created the hierarchy I had some concerns.
Do you have a reference for the Windows combinations.
Also, should we be including the gzip options as well or is that not being handled via the FileSystem?
Regards,
Reg
If you're curious, the Windows documentation for file opening is https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-createfilew
For Linux, which should be very similar to options for other Unix systems, including MacOS: https://linux.die.net/man/2/open
But I don't think we need to implement any of these right now. Few (none?) are currently supported by GsFile, so we will need to wait until we have a more capable lower layer to build on. What I'd like to see now is a framework that is flexible enough to add file opening attributes later.
Compression such as gzip (and we want to add LZ4 compression before too long, but not in GsFile) should not be an attribute of an open file. Compression is an attribute of what streams you stack on top of the file. All files are binary, but you can interpret them with an encoding stream. All files are uncompressed, but you can read or write them with a compressing stream.
EDIT: It has been pointed out to me that the options O_RDONLY, O_WRONLY, or O_RDWR exist in Unix. So the following statement is false. (Fake news: In Unix, all files are opened read-write) -- (This is true, however -- notice that in Linux Pharo you can open a file with openWritable: false and still write to the file). Regards, -Martin
Based on the comments above and looking at the links this is an attempt to capture the essence of the longer term needs and current state of affairs I will put forward some ideas:
Create a class which holds more information than a FileMode, lets call it FileSpec
Mode - knows the following:
Knowledge
Classes and related hierarchies FileSpec
FileMode +-- FileAppendMode +--FileReadMode +--FiileReadWriteMode +--FileWriteMode
FileType +--TextFileType +--BinaryFileType
CompressionType +--NoCompression +--GZip
Regards,
Reg
Hi Reg,
I don't think it would be the best design to put all that knowledge in one object. A more flexible design (and one more like other Smalltalks are moving toward) is to have objects that the user assembles together to provide the desired functionality. Each object has a narrow responsibility. In the case of file usage, there's the object that represents an open file (FileHandle) and a stack of streams on top of that, each stream providing a specific value add such as buffering, encoding, or compression.
Dale and I talked to Allen yesterday, and it looks like we may actually have a replacement for GsFile for GemStone 3.5! If not that soon, it seems likely to be pretty soon. This API will have a much more modern design, and be much simpler and low-level. Given that, we don't want to use GsFile's design as a model for the design of GsFileSystem.
After doing a bunch of reading of code and of the Pharo mailing list, I think there should be two paths for opening a file.
1) The default path is to send #readStream, #writeStream, #binaryReadStream, or #binaryWriteStream to a FileLocator or FileReference. This opens the file with default attributes and gives you the requested type of stream.
2) If you want to open the file with non-default options, send a FileLocator or FileReference #openWithAttributes:, where the argument is an object that composes some number of the possible attributes. The answer is a FileHandle. You can then get a stream from a FileHandle with the #readStream, #writeStream, #binaryReadStream, or #binaryWriteStream messages.
The #readStream and #writeStream messages should use default buffering, and platform-dependent default encoding -- UTF-8 for Unix, UTF-16 for Windows. This is also the encoding for filenames on those platforms.
So a FileHandle represents a file opened with some set of attributes. Things that should be attributes of a FileHandle:
Other things, such as Unix O_APPEND and O_TRUNC file status flags, can be added later, once we have replaced GsFile. In the meantime, these are easily simulated in common situations with file actions after opening.
Things that are not part of opening a file, and therefore should not be part of a FileHandle's options or state:
Text vs Binary: This is dependent on what streams you use to read or write the file, not on the FileHandle itself. Also, in the modern world there is no one "text" mode, in either Unix or Windows. There are various possible encodings, which are implemented with encoding/decoding streams.
Compression: This is done by compressing/decompressing streams, not by the file itself. There are also an open-ended number of ways to compress a file (gzip, lz4, lzo...).
For the file position, the FileHandle does not need to keep track of this, since the underlying operating system file knows its position. There do need to be messages that can be sent to a FileHandle to retrieve the position, and to set it. Normally, these messages would only be sent by the stream that is interacting directly with the FileHandle.
With this kind of design, the classes needed would be:
FileHandleAttributes (a composition of FileAttribute classes)
FileAttribute (a single attribute) FileRead FileWrite (and more FileAttribute classes later, but those will do for now)
And whatever Stream classes are needed. For now, the basic Zn streams will do. We'll be adding Xtreams support, which will have a full set of buffering, encoding, compressing, and encrypting streams.
I hope that all this is more clarifying than confusing. I suspect I haven't yet made my thoughts entirely clear. Please continue the discussion as necessary.
Regards, -Martin
Hi Martin, The following is based on what I garnered from other Smalltalks and the sites you referenced. Let me know what you think.
The Default (Linux) column shows the default numbers for each option. The Windows columns shows the windows value.
Class Posix Name Default (Linux) Windows
======================== ============ ================ =======
FileOption
FilePositioning
FileSeekCurrentPosition SEEK_CUR 1
FileSeekSetPosition SEEK_SET 0
FileSeekEndPosition SEEK_END 2
FileLockOption
FileWriteLockOption F_WRLCK 2
FileReadLockOption F_RDLCK 1
FileMandatoryLockOption F_MDLCK 0
FileOpenMode
FileOpenReadWriteMode O_RDWR 2
FileOpenReadOnlyMode O_RDONLY 0
FileOpenWriteOnlyMode O_WRONLY 1
FileCreationOption
FileExclusiveOption O_EXCL 128 262144
FileDirectoryOption O_DIRECTORY not supported
FileDirectOption O_DIRECT not supported
FileDoNotUpdateAccessTimeOption O_NOATIME not supported
FileCloseOnExecOption O_CLOEXEC not supported
FileKeepSymbolicLinksOption O_NOFOLLOW not supported
FileCreateOption O_CREAT 64 131072
FileNotControllingTerminalOption O_NOCTTY not supported
FileTruncateOption O_TRUNC 512 524288
FileStatusOption
FileAsyncOption O_ASYNC not supported
FileRsyncOption O_RSYNC not supported
FileDsyncOption O_DSYNC not supported
FileNonBlockOption O_NONBLOCK 128
FileSynchronousOption O_SYNC not supported
FileAppendOption O_APPEND 1024 65536
FileShareOption
FileDenyReadWriteOption O_DENYRDWR 2 16
FileDenyNoneOption O_DENYNONE 0 64
FileDenyWriteOption O_DENYWR 3 32
FileDenyReadOption O_DENYRD 1 48
FileControlOption
FileCloseOnExecutionOption F_DCLOEXEC 1
FileTypeOption
FileGzipTypeOption O_GZIP 0
FileTextTypeOption O_TEXT 0
FileBinaryTypeOption O_BINARY 0 1
I have also created a FileOptions class to manage the various file options. It has the following instance variables:
The FileOption class has also had a number of instance creation convenience methods added:
In GsFile, the following method has been implemented to support opening a file from FileSystemGs.
open: aPathName withOptions: aFileOptions
"This is the current way of openning a file"
^self open: aPathName mode: aFileOptions modeString onClient: aFileOptions isClientFileSystem withCompression: aFileOptions isGzipped.
Using FileOptions to open a file in Windows and Unix
Below I describe what I have implemented to support for opening files. The 'FileHande' classes described below are not implemented. They are just psuedo-code to highlight how the FIleOptions class works.
1) Windows For windows compatibility withe the Posix file options, the following methods have been added to WIndowsStore on the class side.
Access Rule | Value |
---|---|
readOnlyAccessRule | 2147483648 |
readWriteAccessRule | self readOnlyAccessRule bitOr: self writeOnlyAccessRule |
writeAccessRule | 1073741824 |
Create Rule | |
createNewRule | 1 |
openAlwaysRule | 4 |
openExistingRule | 3 |
truncateExistingRule | 5 |
Share Rule | |
denyNoneShareRule | 3 |
denyReadShareRule | 2 |
denyReadWriteShareRule | 0 |
denyWriteShareRule | 1 |
So the low level call for opening a file in Windows (32 bit) would be similar to:
....
(securityAttributes := CfsOSSecurityAttributes new)
lpSecurityDescriptor: nil;
bInheritHandle: true;
nLength: CfsOSSecurityAttributes fixedSize.
....
result := self
platformFunctionCreateFile: filename
fdwAccess: aFileOptions accessRule
fdwShareMode: aFileOptions shareRule
lpsa: securityAttributes
fdwCreate: aFileOptions createRule
fdwAttrsAndFlags: WindowsStore fileAttributeNormal "This is 128"
hTemplateFile: nil).
result asInteger = Windows invalidHandleValue.
ifFalse: [
aFileOptions hasTruncateOption
ifTrue: [
result := self
platformFunctionCreateFile: filename
fdwAccess: aFileOptions accessRule
fdwShareMode: aFileOptions shareRule
lpsa: nil
fdwCreate: aFileOptions truncateExistingRule
fdwAttrsAndFlags: WindowsStore fileAttributeNormal "This is 128"
hTemplateFile: nil].
WindowsFileHandle>>platformFunctionCreateFile: truncFileName fdwAccess: accessRule fdwShareMode: shareRule lpsa: lpsa fdwCreate: createRule fdwAttrsAndFlags: fileAttribute hTemplateFile: hTemplateFile
<c: pointer 'kernel32.dll':CreateFileA pointer uint32 uint32 pointer uint32 uint32 pointer>
^self primitiveFailed
2) Unix/Linux Open The Unix/Linux option is very straight forward as the code below shows. Note: Share is not used in Linux based on the code I was looking at.
...
self
platformFunctionOpen64: path nullTerminated
oflag: aFileOptions fileOpenAndOptionsValue
mode: aFileOptions accessPermissions
UnixFileHandle>>platformFunctionOpen64: path oflag: oflag mode: mode
<c: int32 Open64 safepointer int32 uint32>
^self primitiveFailed
The following are the Posix errors that I defined:
I am not certain if the WindowsValue makes sense or not.
PosixName UnixValue WindowsValue
FilePosixError
FileSynchronizedIONotSupportedError EINTR not supported
FileBadFileDescriptorError EBADF 10 9
FileNoEntryError ENOENT 8 2
FileBusyError EBUSY 12 16
FilePermissionDeniedError EPERM 7 1
FileIOError EIO 9 5
FileImproperLinkError 14 18
FileNoSpaceError ENOSPC 18 28
FileInvalidOptionError EINVAL 16 22
FileMaxFilesOpenError ENFILE 17 23
FileIsDirectoryError EISDIR 15 21
FileReadOnlyFileSystemError EROFS 19 30
FileTooManySymbolicLinksError ELOOP not supported
FileNameToLongError ENAMETOOLONG not supported
FileAccessError EACCES 11 13
FileExistsError EEXIST 13 17
FileDeviceNotSameError EXDEV 14 18
On 1/7/19 7:49 PM, Reg Krock wrote:
Hi Martin, The following is based on what I garnered from other Smalltalks and the sites you referenced. Let me know what you think.
Thanks, Reg.
I have some questions and comments.
It's not clear to me exactly how you intend to use these classes. You mention instances of these classes -- what state do they have? Is one instance of, say, FileCreateOption different from another instance of FileCreateOption? If not, it might be better to use the class as a named singleton and not instantiate these things.
When you say "not supported" in the Linux column, does that mean that you are proposing that we not support these? Some of them seem useful and worth supporting eventually.
Other detailed comments interspersed below. I'm primarily looking at Linux, not MacOS (similar but not identical), or Windows (very different) in these comments.
Regards,
-Martin
The Default (Linux) column shows the default numbers for each option. The Windows columns shows the windows value.
|Class Posix Name Default (Linux) Windows ======================== ============ ================ ======= FileOption FilePositioning FileSeekCurrentPosition SEEK_CUR 1 FileSeekSetPosition SEEK_SET 0 FileSeekEndPosition SEEK_END 2 | These FilePositioning options are not attributes of a file, but of an individual seek operation. Therefore, they should not be included in the FileOption. Probably better to not have the classes at all, but to offer them as options in whatever #seek API we end up with. |FileLockOption FileWriteLockOption F_WRLCK 2 FileReadLockOption F_RDLCK 1 FileMandatoryLockOption F_MDLCK 0 |
|File locking is similar -- it's not an attribute of a file, but an operation one can do on a file through fcntl or flock. Also, I can't find F_MDLCK anywhere except as a proposed (but apparently never implemented) option in Cygwin. There are also other options (F_UNLCK, LOCK_SH, LOCK_EX, LOCK_UN) that would be implemented by options to a locking API. We might want a "lock options" object as an argument for a locking API, but I'd want to see a design for that API to be sure.|
||
|FileOpenMode FileOpenReadWriteMode O_RDWR 2 FileOpenReadOnlyMode O_RDONLY 0 FileOpenWriteOnlyMode O_WRONLY 1|
|For these, I'd rather see separate composable "readable" and "writeable" attributes, since the combination of "neither readable nor writeable" is actually useful in Linux.|
|Also, please avoid use of "mode" in names. These names communicate well even without "mode," and "mode" has some a specific meanings in UNIX. A mode of read-only means that the file's permissions are read-only, not that the file is opened read-only. So it can be confusing to use that word in this context.|
FileCreationOption FileExclusiveOption O_EXCL 128 262144 FileDirectoryOption O_DIRECTORY not supported FileDirectOption O_DIRECT not supported FileDoNotUpdateAccessTimeOption O_NOATIME not supported O_DIRECT and O_NOATIME should be FileStatusOptions, not FileCreationOptions. Linux open(2) defines O_CLOEXEC as a file creation option, but I'm not sure why, since it appears to be fully functional on extant files. FileCloseOnExecOption O_CLOEXEC not supported FileKeepSymbolicLinksOption O_NOFOLLOW not supported FileCreateOption O_CREAT 64 131072 FileNotControllingTerminalOption O_NOCTTY not supported FileTruncateOption O_TRUNC 512 524288 There is also O_TMPFILE to create temporary files, but that one is so special-purpose that it seems like a separate API for temp file creation is justified, rather than putting it in the FileOptions. FileStatusOption FileAsyncOption O_ASYNC not supported FileRsyncOption O_RSYNC not supported O_RSYNC is not supported in Linux. Might be in MacOS, though. FileDsyncOption O_DSYNC not supported FileNonBlockOption O_NONBLOCK 128 FileSynchronousOption O_SYNC not supported FileAppendOption O_APPEND 1024 65536 There's also O_PATH, which might be a good thing to include, or at least think about including. FileShareOption FileDenyReadWriteOption O_DENYRDWR 2 16 FileDenyNoneOption O_DENYNONE 0 64 FileDenyWriteOption O_DENYWR 3 32 FileDenyReadOption O_DENYRD 1 48 Are the FileShareOptions Windows-specific? If so, I'd leave them out for now, though we'd want the design of the API to allow their addition for platforms that support Windows. FileControlOption FileCloseOnExecutionOption F_DCLOEXEC 1 I think you mean FD_CLOEXEC? Do you have an API in mind for doing file control operations? That could use this option, plus the ones that can be specified when a file is opened and then changed after the file is open. FileTypeOption FileGzipTypeOption O_GZIP 0 FileTextTypeOption O_TEXT 0 FileBinaryTypeOption O_BINARY 0 1 The FileTypeOptions are not attributes of a file in either UNIX or Windows, and should not be included in any form. I have also created a FileOptions class to manage the various file options. It has the following instance variables:
- mode - this contains an instance of a FileOpenMode subclass.
- options - this contains supported instance(s) of the subclass of FileCreationOption and/or an instance of FileTruncateOption.
- fileType - this indicates what the file type is. The Gzip type has been added here for compatibility with the current compression implmenetation.
We do not want compatibility with the current implementation. GsFile will continue to exist as is for existing users. FileSystemGs will exist alongside it, but no compatibility between the two is needed. Compression as a property of a file is one of the things that is wrong with the GsFile design, and should not be part of FileSystemGs.
- share - this indicates how the file is shared
- permissionOptions - these have not yet been implemented yet.
I think this might be more instance variables than needed. fileType can go away. mode can be folded into options. I'm not sure why share and permissionOptions aren't just more options either.
The FileOption class has also had a number of instance creation convenience methods added:
append, #read, #write, #readWrite, #appendAndRead, #readBinary,
writeWithHighCompression, #writeWithLowCompression, etc.
I'd rather not have any convenience messages, at least not yet. I expect this API to only be used directly by power users, and the FileSystemGs code itself. Perhaps two convenience methods, corresponding to what would be used by FileReference>>readStream and FileReference>>writeStream. We can always add more later if any other combinations end up being very frequently used, but let's try to keep the API lean.
In GsFile, the following method has been implemented to support opening a file from FileSystemGs.
|open: aPathName withOptions: aFileOptions "This is the current way of openning a file" ^self open: aPathName mode: aFileOptions modeString onClient: aFileOptions isClientFileSystem withCompression: aFileOptions isGzipped.|
|I'd rather not change GsFile in any way. It's the legacy world. For right now we have to build on top of it, but it won't be long before the two file worlds are completely separate, so using aFileOptions with GsFile seems to muddy the waters too much.|
Added the proof of concept of FileMode with subclasses to be passed instead of #write and #read. The classes which have been introduced are:
These classes return the appropriate mode string, (ex: ‘w+’) and know if the file is writable. Example of change - the text in blue shows the change.
Existing format for opening a stream over a file:
The #readWriteTruncatedMode method would return an instance of FileReadWriteTruncatedMode
If a form of this proposal were to be adopted, then the appropriate methods should be renamed to have mode. As an example,
#open:writeable:
would become#open:mode: