Closed jkbonfield closed 1 year ago
I think this is treating the symptom rather than the cause. The MinGW-w64 lseek64()
seems to be a simple wrapper around _lseeki64()
. According to the Microsoft _lseeki64
documentation, "On devices incapable of seeking (such as terminals and printers), the return value is undefined." So it's not clear how this ever worked...
I think the full solution might be an fstat()
on the file descriptor in hopen_fd()
and hdopen()
to check if it's a regular file. The result can be added to the bitfield in hFILE_fd
. Then fd_seek()
could simply check this on Windows, and refuse to seek on non-regular files. This would catch all attempts to seek, and not just the BGZF EOF check.
Undefined doesn't mean it never worked. It just means they're free to change their mind on implementation. The official posix definition of lseek though is clear, and MinGW has attempted to implement POSIX semantics even when the underlying microsoft interface doesn't. Obviously it can't catch all scenarios though.
My fix was the minimal and most simple solution to get bcftools CI working once more, so I think it's still worth doing for now.
A more invasive change to htslib will inevitably mean a longer development and testing time, especially given other commitments this week. Would you consider doing both?
Edit: also see https://github.com/samtools/bcftools/issues/1901#issuecomment-1503090263 last paragraph. Firstly, this worked with gcc 11 but not with 12. Upgrading gcc also upgrades all sorts of other associated stuff, so it's unclear if it's compiler or the libc which has the change, but regardless it demonstrates my comment about about mingw normally working around windows posix breakages, which is why we use it. Your alternative is also basically what I outlined too, although I hadn't thought of using fstat
.
My fix was the minimal and most simple solution to get bcftools CI working once more, so I think it's still worth doing for now.
It's not minimal though: having bgzf_check_EOF_common()
always return 2 means that you have to hack _testbgzf.c too. (And it appears it changes functionality on Windows, suppressing the EOF check for ordinary files too.)
There's an example of the sort of _WIN32
-specific code needed next door in fd_write()
; wouldn't it suffice to add similar in fd_seek()
:
static off_t fd_seek(hFILE *fpv, off_t offset, int whence)
{
hFILE_fd *fp = (hFILE_fd *) fpv;
+#ifdef _WIN32
+ if (GetFileType((HANDLE)_get_osfhandle(fp->fd)) == FILE_TYPE_PIPE) { errno = ESPIPE; return -1; }
+#endif
return lseek(fp->fd, offset, whence);
}
That's equivalent to Rob's suggestion, except on Windows it does the check on every seek instead of doing it just once on Windows when opening the file and recording the answer.
Yes, that would be a simpler fix albeit less efficient as the check really only needs to be done once. I'd prefer to look for GetFileType((HANDLE)_get_osfhandle(fp->fd)) != FILE_TYPE_DISK
as it would also catch character devices, which also can't seek.
I don't think MinGW has ever claimed to implement POSIX semantics. It's lseek()
is just a wrapper around the Microsoft C runtime function, so what happens depends on what that does and not on anything in MinGW.
I expect cram_check_EOF()
on a pipe would be affected as well as bgzf_check_EOF_common()
. Fixing hseek()
would sort that one too.
I meant minimal in code. Obviously it had the side effect of not checking, but that was the whole point - the check no longer works due to mingw changes.
I wasn't aware of how to do this using windows native API, but that code snippet shows the way. Thanks.
I wasn't aware of how to do this using windows native API, but that code snippet shows the way. Thanks.
The irony of this statement is, those lines of code were written by myself! Haha. Apparently I once did know how to do this, or managed somewhat better on search terms in google and/or stackoverflow to set the ball rolling. :)
Anyway I've confirmed it's fixing (again) my local mingw build of bcftools (glacially), so will work on getting this PR revised.
I've managed to reproduce the bug on Windows server 2019 using MinGW-w64 in both MINGW64 and UCRT64 modes, and can confirm that the revised patch fixes both.
An easy way to reproduce the problem was this:
cat test/ce#large_seq.tmp.bam | ./test/test_view.exe -p /dev/null -
test/ce#large_seq.tmp.bam
is made by running make test
and at 1.7M is plenty big enough to trigger the issue.
At some point we may want to look into making file access on Windows use the more native APIs. The POSIX-like ones seem to have some quirks, e.g. msvcrt _open()
returns error "No such file or directory" when trying to open a named pipe. The MSYS2 wc
command can do that, so it must be possible somehow...
Meanwhile I'll merge this so that bcftools' tests will become more reliable again.
Unfortunately this test is proving unreliable now following a change to lseek so that it no longer returns -1 on pipes.
Fixes samtools/bcftools#1901