Closed hasufell closed 2 years ago
Thanks for the PR! Can I ask you to split it into two commits? One for the refactoring for the extraction to the internal functions and one for the new functions. They can still be in the same PR, I can just more easily filter away the refactoring changes.
I'm away till Friday so will do a more indepth review then, on first glance it looks great though!
Also, WindowsString
== CWString
and WindowsFilePath
==ByteString
if I'm reading this correctly?
Also for your create flags here https://github.com/hasufell/file-io/blob/master/System/File/AbstractFilePath/Windows.hs#L20 you'll want to add FILE_FLAG_OVERLAPPED
for WinIO otherwise it'll force the handle to behave synchronously (and so the I/O manager will be forced sequentially while this handle is in use).
You can also conditionally add the flag depending on which I/O manager is in use, see https://gitlab.haskell.org/ghc/ghc/-/wikis/migration/9.0#io-manager-winio-related-changes
Also which format will fp
be here? https://github.com/hasufell/file-io/blob/master/System/File/AbstractFilePath/Windows.hs#L15 I assume a device path? i.e.\\?\
?
Also there's a small bug with WriteMode
https://github.com/hasufell/file-io/blob/master/System/File/AbstractFilePath/Windows.hs#L27
WriteMode
expects the pointer to be at the start of the file, so opening a file in write mode expects the file to be overwritten (where-as append mode appends to the end).
So you have to reset the file size after opening it https://github.com/ghc/ghc/blob/378c0bba7d132a89dd9c35374b7b4bb5a4730bf7/libraries/base/GHC/IO/Windows/Handle.hsc#L856 otherwise the file won't be truncated on write.
for AppendMode
you also probably want GENERIC_WRITE
mode so the user can move the pointer around.
Also,
WindowsString
==CWString
andWindowsFilePath
==ByteString
if I'm reading this correctly?
newtype WindowsString = WS { unWFP :: BS.ShortByteString }
It's a ShortByteString
(unpinned bytearray), which is expected to be in UTF-16 format. That's why there is a specialized module System.AbstractFilePath.Data.ByteString.Short.Word16
for operations such as:
unfoldr :: (a -> Maybe (Word16, a)) -> a -> ShortByteString
...so we don't do any conversion whatsoever.
WindowsFilePath
is just a type synonym to WindowsString
.
We then use the specialized functions:
useAsCWString :: ShortByteString -> (Ptr Word16 -> IO a) -> IO a
The endgoal then is for the user to use AbstractFilePath
(as demonstrated in the file-io
package), where:
#if defined(mingw32_HOST_OS) || defined(__MINGW32__)
type PlatformString = WindowsString
#else
type PlatformString = PosixString
#endif
newtype AbstractFilePath = AbstractFilePath PlatformString
which has this nice trick that you can't unpack AbstractFilePath
to the wrong platform, because WindowsString
and PosixString
have different constructors. And still allows you to write platform specific code via those.
Thanks for the PR! Can I ask you to split it into two commits? One for the refactoring for the extraction to the internal functions and one for the new functions. They can still be in the same PR, I can just more easily filter away the refactoring changes.
I also want to note that another way to do this PR is heavy use of CPP and then include
with the appropriate functions imported and maybe ifdefs in function bodies. filepath
relies on this heavily, but I wouldn't claim that it makes it more maintainable. Up to you.
Also which format will
fp
be here? https://github.com/hasufell/file-io/blob/master/System/File/AbstractFilePath/Windows.hs#L15 I assume a device path? i.e.\\?\
?
The format is not specified. Does createFile
expect a device path? My tests seem to indicate that regular shorthand paths work as well.
Also which format will
fp
be here? https://github.com/hasufell/file-io/blob/master/System/File/AbstractFilePath/Windows.hs#L15 I assume a device path? i.e.\\?\
?The format is not specified. Does
createFile
expect a device path? My tests seem to indicate that regular shorthand paths work as well.
It does not expect it, but it accepts it. Without using a device path you'll regress the toolchain's long path support. GHC has internally mapped to device paths when the API supports it for many releases now.
Also which format will
fp
be here? https://github.com/hasufell/file-io/blob/master/System/File/AbstractFilePath/Windows.hs#L15 I assume a device path? i.e.\\?\
?The format is not specified. Does
createFile
expect a device path? My tests seem to indicate that regular shorthand paths work as well.It does not expect it, but it accepts it. Without using a device path you'll regress the toolchain's long path support. GHC has internally mapped to device paths when the API supports it for many releases now.
Doesn't that cause problems, because we're disabling filepath interpretation? E.g. C:/foo/bar
now means something different. So I'm not really sure what is expected here. My gut feeling is that I don't want to mess with user filepaths at all. That was one of the reasons for AFPP, although more about leaving encoding alone (which is not really a problem on windows, because we know the encoding).
Or put another way: is there safe way to convert arbitrary paths to \\?\
paths, emulating the entire windows filepath interpretation layer?
Edit: \\?\C:foo.txt
is also different from C:foo.txt
. Resolving it to an absolute path will not preserve semantics?
Also which format will
fp
be here? https://github.com/hasufell/file-io/blob/master/System/File/AbstractFilePath/Windows.hs#L15 I assume a device path? i.e.\\?\
?The format is not specified. Does
createFile
expect a device path? My tests seem to indicate that regular shorthand paths work as well.It does not expect it, but it accepts it. Without using a device path you'll regress the toolchain's long path support. GHC has internally mapped to device paths when the API supports it for many releases now.
Doesn't that cause problems, because we're disabling filepath interpretation? E.g.
C:/foo/bar
now means something different. So I'm not really sure what is expected here.
No, because when users use C:/foo/bar
they expect it to mean the same thing as C:\foo\bar
and we take care of that during the conversion. and on the win32 API layer /
and \
mean the exact same thing. Should the user want /
to not mean folder then they have to explicitly disable path processing in their input by using \\?\
in which case we don't touch the PATH.
My gut feeling is that I don't want to mess with user filepaths at all. That was one of the reasons for AFPP, although more about leaving encoding alone (which is not really a problem on windows, because we know the encoding).
The reason we do this is so that users can give ANY path to the compiler and have it work. The user shouldn't have to do \\?\
when they want a long path because 99% of users don't know what this is. Neither do scripts or other programs that can produce paths GHC consumes.
We in GHC decided to make this work out of the box because of things like cabal new-build that produce very long normalc:\
paths.
This isn't about AFPP or whatever. You're overriding openFile in an external module and not adhering to what the expectations are. If the user has to input \\?\
to get a long path it also means every tool down the pipeline must understand and interpret this correctly. Configure scripts and friends don't for instance. So we made an implicit decision to convert paths internally but not expose them externally unless the user has explicitly used a \\?\
path.
I don't particularly care what the library does or does not do with it's PATHs. I care about the fact that you're affecting the behavior of the I/O system in an external module that presumably will be part of a boot library.
Or put another way: is there safe way to convert arbitrary paths to
\\?\
paths, emulating the entire windows filepath interpretation layer?Edit:
\\?\C:foo.txt
is also different fromC:foo.txt
. Resolving it to an absolute path will not preserve semantics.
Correct. This path is almost non-existent in the real world, Very few people know what it does. Supporting this very arcane path does not warrant regressing the compiler's long path support for the vast majority of users. I'm pretty sure the conversion would fail for that path anyway so we'd use it as is.
Can you point me to the code where the conversion is done?
Can you point me to the code where the conversion is done?
https://github.com/ghc/ghc/blob/0619fb0fb14a98f04aac5f031f6566419fd27495/utils/fs/fs.c#L48
Also there's a small bug with
WriteMode
https://github.com/hasufell/file-io/blob/master/System/File/AbstractFilePath/Windows.hs#L27
WriteMode
expects the pointer to be at the start of the file, so opening a file in write mode expects the file to be overwritten (where-as append mode appends to the end).So you have to reset the file size after opening it https://github.com/ghc/ghc/blob/378c0bba7d132a89dd9c35374b7b4bb5a4730bf7/libraries/base/GHC/IO/Windows/Handle.hsc#L856 otherwise the file won't be truncated on write.
I think this is not needed, because we use Win32.cREATE_ALWAYS
instead of Win32.oPEN_ALWAYS
for WriteMode
.
The documentation says:
Creates a new file, always. If the specified file exists and is writable, the function overwrites the file, the function succeeds, and last-error code is set to ERROR_ALREADY_EXISTS (183).
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilea
From my tests, truncation seems to work.
for
AppendMode
you also probably wantGENERIC_WRITE
mode so the user can move the pointer around.
I tried this and it seems to lead to the file pointer moving back to the start of the file, so I don't get append behavior anymore.
Also there's a small bug with
WriteMode
https://github.com/hasufell/file-io/blob/master/System/File/AbstractFilePath/Windows.hs#L27WriteMode
expects the pointer to be at the start of the file, so opening a file in write mode expects the file to be overwritten (where-as append mode appends to the end). So you have to reset the file size after opening it https://github.com/ghc/ghc/blob/378c0bba7d132a89dd9c35374b7b4bb5a4730bf7/libraries/base/GHC/IO/Windows/Handle.hsc#L856 otherwise the file won't be truncated on write.I think this is not needed, because we use
Win32.cREATE_ALWAYS
instead ofWin32.oPEN_ALWAYS
forWriteMode
.The documentation says:
Creates a new file, always. If the specified file exists and is writable, the function overwrites the file, the function succeeds, and last-error code is set to ERROR_ALREADY_EXISTS (183).
Yes but this means that errno is different in your implementation than what Base returns. It may not matter for most use cases but It's not the expected result value for GHC programs. Anything checking GetLastError ()
will diverge with your implementation.
I also now notice an additional problem... with mio
you get the in-process locking code because hANDLEtoHandle
eventually calls mkFD
which registers the locks, but for winio
the helper was added for console handles not files.
Since you're relying on the very latest win32 I can just add file support to hANDLEtoHandle
which should work..
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilea
From my tests, truncation seems to work.
for
AppendMode
you also probably wantGENERIC_WRITE
mode so the user can move the pointer around.I tried this and it seems to lead to the file pointer moving back to the start of the file, so I don't get append behavior anymore.
Hmmm.. are you sure you tested the right thing? access attributes don't influence the disposition. See [2]. The problem here is that the differing attributes change what happens during subsequent calls to CreateFile
.
For AppendMode you use FILE_APPEND_DATA, while GENERIC_WRITE is a combination of ALL write attributes including FILE_WRITE_ATTRIBUTES and FILE_WRITE_DATA.
[1] https://docs.microsoft.com/en-us/windows/win32/fileio/file-access-rights-constants [2] https://docs.microsoft.com/en-us/windows/win32/fileio/creating-and-opening-files
Since you're relying on the very latest win32 I can just add file support to hANDLEtoHandle which should work..
Sounds good.
Hmmm.. are you sure you tested the right thing? access attributes don't influence the disposition. See [2]. The problem here is that the differing attributes change what happens during subsequent calls to CreateFile.
@@ -21,5 +21,5 @@
accessMode = case iomode of
ReadMode -> Win32.gENERIC_READ
WriteMode -> Win32.gENERIC_WRITE
- AppendMode -> Win32.fILE_APPEND_DATA
+ AppendMode -> Win32.gENERIC_WRITE .|. Win32.fILE_APPEND_DATA
ReadWriteMode -> Win32.gENERIC_READ .|. Win32.gENERIC_WRITE
What it now does is write to the file from the start, without truncating. If I revert the patch, appending works again.
*System.File.AbstractFilePath> readFile "lol.txt"
"tzaaakk"
*System.File.AbstractFilePath> appendFile "lol.txt" "123"
*System.File.AbstractFilePath> readFile "lol.txt"
"123aakk"
So, I think Win32.gENERIC_READ .|. Win32.fILE_APPEND_DATA
works and https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setfilepointer indicates that gENERIC_READ
might be enough to set the file pointer.
The alternative is:
--- a/System/File/AbstractFilePath/Windows.hs
+++ b/System/File/AbstractFilePath/Windows.hs
@@ -8,6 +8,7 @@ import System.AbstractFilePath.Windows ( WindowsFilePath )
import qualified System.Win32 as Win32
import qualified System.Win32.WindowsString.File as WS
+import Control.Monad (when)
#if defined(__IO_MANAGER_WINIO__)
import GHC.IO.SubSystem
#endif
@@ -31,12 +32,18 @@ openFile fp iomode = bracketOnError
#endif
Nothing)
Win32.closeHandle
- Win32.hANDLEToHandle
+ toHandle
where
+ toHandle h = do
+ when (iomode == AppendMode ) $ do
+ r <- Win32.setFilePointerEx h 0 Win32.fILE_END
+ err <- Win32.getLastError
+ when (r == 4294967295 && err /= 0) $ Win32.failWith "openFile" err
+ Win32.hANDLEToHandle h
accessMode = case iomode of
ReadMode -> Win32.gENERIC_READ
WriteMode -> Win32.gENERIC_WRITE
- AppendMode -> Win32.fILE_APPEND_DATA
+ AppendMode -> Win32.gENERIC_WRITE .|. Win32.fILE_APPEND_DATA
ReadWriteMode -> Win32.gENERIC_READ .|. Win32.gENERIC_WRITE
createMode = case iomode of
ping
Ah sorry,
The alternative is:
I prefer the alternative as that's what we currently do, I put up https://github.com/haskell/win32/pull/202 to fix the API so you can clean up the code. setFilePointerEx
like the other Win32 calls throws an exception, so you should just catch it and close the handle.
I will merge https://github.com/haskell/win32/pull/202 after work today.
+ toHandle h = do + when (iomode == AppendMode ) $ do + r <- Win32.setFilePointerEx h 0 Win32.fILE_END + err <- Win32.getLastError + when (r == 4294967295 && err /= 0) $ Win32.failWith "openFile" err + Win32.hANDLEToHandle h
Just realized that the API was already correct, you don't need this, on failure the API raises an exception. You just need a finally
here to always close the handle.
Cheers, did the dependent changes on this get merged get so the CI passes? If so I can merge this one.
Cheers, did the dependent changes on this get merged get so the CI passes? If so I can merge this one.
I'll add one more test suite to filepath that proves that old and new versions are equivalent. Maybe it's overkill, but I like to sleep well. Then this can be merged.
I don't know which GHC version will be the first to depend on filepath-2.0 (maybe 9.6 or later), so you maybe want to maintain two branches for Win32?
Merged and published as a package candidate: https://hackage.haskell.org/package/filepath-2.0.0.0/candidate
I added the candidate .tgz manually to the cabal.project
file and made sure that appveyor.yml
doesn't overwrite it. But I can't see the CI logs.
Oh missed the last message, will sort out the CI failure and merge this weekend if you're happy with the changes @hasufell ?
Oh missed the last message, will sort out the CI failure and merge this weekend if you're happy with the changes @hasufell ?
Yeah... but note that I haven't released the new filepath version yet. There's some minor things I'm still sorting out, so no rush. API won't really change though.
cool just let me know :)
@Mistuke just a heads-up... I want to release the new filepath in 1-2 weeks. If you're planning a release, maybe we should synchronize. I think the idea was that this makes it into GHC-9.6.
OK, CI should be fixed now, next commit should trigger the internal and external one.
System\Win32\SymbolicLink\Internal.hsc:13:1: error: [-Wunused-imports, -Werror=unused-imports]
The import of `Data.Bits' is redundant
except perhaps to import instances from `Data.Bits'
To import instances alone, use: import Data.Bits()
|
13 | import Data.Bits ((.|.))
| ^^^^^^^^^^^^^^^^^^^^^^^^
System\Win32\SymbolicLink\Internal.hsc:15:1: error: [-Wunused-imports, -Werror=unused-imports]
The import of `System.Win32.File' is redundant
except perhaps to import instances from `System.Win32.File'
To import instances alone, use: import System.Win32.File()
|
15 | import System.Win32.File ( failIfFalseWithRetry_ )
It's showing some unused imports, could you check the build with -Werror
enabled?
trying to retrigger CI
alright, the external CI now shows the same error. (see Appveyor)
The CI fails, because both filepath and bytestring have a lower bound on base >=4.9
and that's unlikely to change. ghc-8.0.2 is really the oldest we care about. I can't even install ghc-7.6.3 on linux.
The CI fails, because both filepath and bytestring have a lower bound on
base >=4.9
and that's unlikely to change. ghc-8.0.2 is really the oldest we care about. I can't even install ghc-7.6.3 on linux.
Well, Win32 doesn't need to run on Linux :) Please put the exports for the FPP stuff behind an
if impl(ghc >= 8.0)
as I don't want to lose compatibility with older GHCs, we've worked very hard to maintain it.
The CI fails here, because cabal build
also pulls filepath into the resolution, even if it's not needed, because it's listed in cabal.project as source-repository package.
There are two known workarounds:
ghc --version
or soI'm not sure how to fix the private CI.
We already had a dependency on filepath before, could we just add the constrain on the version of filepath needee under the FFp path?
We already had a dependency on filepath before, could we just add the constrain on the version of filepath needee under the FFp path?
This is not about Win32.cabal
. It's about cabal.project
. The cabal ticket is here https://github.com/haskell/cabal/issues/5444
Whether you use source-repository-package
or packages
results in the same issue.
Ahh ok. I Suppose we only need that because filepath hasn't been released yet?
So should it go green once it is?
Sent from my Mobile
On Thu, Jul 14, 2022, 10:42 Julian Ospald @.***> wrote:
We already had a dependency on filepath before, could we just add the constrain on the version of filepath needee under the FFp path?
This is not about Win32.cabal. It's about cabal.project. The cabal ticket is here haskell/cabal#5444 https://github.com/haskell/cabal/issues/5444
Whether you use source-repository-package or packages results in the same issue.
— Reply to this email directly, view it on GitHub https://github.com/haskell/win32/pull/198#issuecomment-1184228270, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAI7OKIWLMYS4IAPUQU2AI3VT7OHTANCNFSM5RR7C4KA . You are receiving this because you modified the open/close state.Message ID: @.***>
So should it go green once it is?
That depends on whether this is the only issue with this PR. I can't tell, because I don't know how to install these ancient GHCs.
Can't you add --allow-newer
to allow cabal to ignore the upper bounds to
test on the CI?
Sent from my Mobile
On Thu, Jul 14, 2022, 10:55 Julian Ospald @.***> wrote:
So should it go green once it is?
That depends on whether this is the only issue with this PR. I can't tell, because I don't know how to install these ancient GHCs.
— Reply to this email directly, view it on GitHub https://github.com/haskell/win32/pull/198#issuecomment-1184241786, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAI7OKOL36YIJL37GGQ3MVDVT7PY7ANCNFSM5RR7C4KA . You are receiving this because you modified the open/close state.Message ID: @.***>
Can't you add
--allow-newer
to allow cabal to ignore the upper bounds to test on the CI? …
Doubt so, because base
is not reinstallable.
@Mistuke the private CI fails and I think this is not a sustainable workflow for contributors. I can't see the logs and don't wanna invest more time in blind-folded CI fixing. The public CI succeeds, feel free to adjust this PR.
@Mistuke is this making a hackage release?
@Mistuke is this making a hackage release?
Yes https://hackage.haskell.org/package/Win32 I'll fix the docs later when I feel a bit better.
Thanks. Did you catch something in the amazonas?
Thanks. Did you catch something in the amazonas?
haha, luckily no, just exhausted.
Depends on: https://github.com/haskell/filepath/pull/103
For testing file operations, you can use https://github.com/hasufell/file-io (also please review the windows module)