scott-griffiths / bitstring

A Python module to help you manage your bits
https://bitstring.readthedocs.io/en/stable/index.html
MIT License
401 stars 67 forks source link

Opening a file using filename keyword seems to lock the file as if it's in write mode #308

Closed MichaelDeutschCoding closed 4 months ago

MichaelDeutschCoding commented 7 months ago

The documentation says then when creating a ConstBitStream using the filename keyword in the constructor "This will open the file in binary read-only mode." I would therefore expect that to give the same experience as opening the same filename using Python's builtin open(filepath, mode='rb').

But I've noticed that other processes fail when attempting to save to that file path so long as the ConstBitStream is in scope. However the regular open() does not lock the file (if open in read-only mode) even if I don't use a context manager.

Repro:

import traceback
from bitstring import ConstBitStream

path = 'path/to/some_file.txt'

left_open = open(path, mode='rb')

with open(path, mode='w') as f:
    f.write('This write works fine')

left_open.close()

bitstream = ConstBitStream(filename=path)

try:
    with open(path, mode='w') as f:
        f.write('Will this work?? (spoiler: no)')
except Exception as e:
    print(traceback.format_exc())

del bitstream

with open(path, mode='w') as f:
    f.write('Final write works after deleting the bitstream.')

VERSION: 4.1.4

scott-griffiths commented 7 months ago

Hi. Thanks for the bug report. Could you provide some additional platform details on where you're running this as I'm unable to reproduce the issue? Also the contents of the traceback would be handy.

I've run your test on my Windows 11 machine (under WSL) and no exception is thrown and all the writes work for me.

Thanks.

MichaelDeutschCoding commented 6 months ago

Hey Scott, That's interesting. I wonder what I may have different about my set up?

I'm running it on Windows 11 directly (not under WSL) and this happens whether I run it interactively, in IDLE, from a terminal, or elevated terminal.

The stack trace isn't particularly helpful imo, but here ya go:

PS C:\Users\MikeD\Desktop> py .\bitstream_test.py
Traceback (most recent call last):
  File "C:\Users\MikeD\Desktop\bitstream_test.py", line 16, in <module>
    with open(path, mode='w') as f:
         ^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 22] Invalid argument: 'some_file.txt'

And do keep in mind that previous and subsequent reads/writes within that same progam all work . When I open the file afterwards, it has the contents Final write works after deleting the bitstream.

scott-griffiths commented 6 months ago

Thanks. I have previously had file related problems that only occurred when unit tests were run on Windows. The error is strange though - my first thought would be that the path contains backslashes or a space that's being misinterpreted but it looks OK.

I'll try it later when I'm back in Windows-land.

MichaelDeutschCoding commented 5 months ago

Hey @scott-griffiths , just curious -- any updates?

scott-griffiths commented 5 months ago

Hi. Yes I've reproduced it locally but I don't have a fix yet. It's Windows specific and I don't have a dev environment set up to easily debug it.

I'll mark it as a bug to be fixed in 4.2, so I'll get around to it eventually! The work-around is just to read the file into memory, assuming that it's not too big, by using a BitStream instead of ConstBitStream.

scott-griffiths commented 4 months ago

I'm almost certain the issue is due to the different way that Windows handles memory mapped files.

For efficiency reasons the file is accessed via a read-only mmap, which works fine on all platforms. However Windows' file system locks the file that's been mapped against writing, while Unix-like systems don't.

There's not a huge amount I can do about the underlying problem, as it's a 'feature' of the OS. I could:

  1. Opt not to use the mmap when on Windows. This isn't really a good option as it would instead break it in terms of expected performance on that OS.
  2. Document the problem. Hardly a great 'solution' (who reads the documentation?) but I'm struggling to find a better option.

I can't even make the exception more useful as it's not from my code. The OSError: Invalid argument exception is just misleading as it's not the argument that's the problem.

scott-griffiths commented 4 months ago

This is now documented and so officially a 'feature' rather than a bug. Not very helpful I know but it's due to the OS, and has a poor error message from the core Python.