reactphp / filesystem

Evented filesystem access.
MIT License

Race condition while writing to file #75

Closed Saso closed 4 years ago

Saso commented 4 years ago
    // Open the log file in append mode; the promise resolves with a writable stream.
    $this->file->open('a')->then(
        function (\React\Stream\WritableStreamInterface $stream) use ($msg) {
            // Write one line, then close the stream again.
            $stream->end("{$msg}\n");
        },
        function (\Exception $e) use ($msg) {
            echo "error in log: {$e->getMessage()}\n";
            if ($e->getPrevious()) {
                echo "\t" . $e->getPrevious()->getMessage() . PHP_EOL;
            }
            echo ".. while writing msg: {$msg}\n";
        }
    );

This works as expected, appending line after line, except when many objects use the same code to write to the same file at the same time. The errors shown are "timeout after 15 sec" or "file already open". I implemented singleton locking, but it had no effect because of ReactPHP's asynchronous execution.

A possible solution would be per-file locking inside file->open(), or adding some flags to that call. Maybe even a file->lock() / unlock() / isLocked() implementation, or a file->isOpen() to wait until the file has been closed by the other async task.

If my understanding is wrong, please correct me: how do I avoid race conditions? I need frequent open+write+close cycles for my use case.

ghost commented 4 years ago

Adding a lock mechanism is not possible. It's not supported by all adapters.

Even mandatory locking is not guaranteed on a Linux kernel and has race conditions.

You'll need to create some sort of stream that collects all your data and open the file only once. Use that file descriptor (the File object returns a stream) to write all the data from your stream.
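
A minimal sketch of that approach, building on the open('a') call from the snippet above. The Logger class and the way the file node is injected are illustrative names, not part of this library:

    use React\Stream\WritableStreamInterface;

    final class Logger
    {
        /** @var \React\Promise\PromiseInterface|null memoized result of open() */
        private $streamPromise;

        /** @var object a file node from reactphp/filesystem */
        private $file;

        public function __construct($file)
        {
            $this->file = $file;
        }

        public function log($msg)
        {
            // Open the file once; later calls attach to the same pending or
            // already-resolved promise, so only one open() is ever in flight.
            if ($this->streamPromise === null) {
                $this->streamPromise = $this->file->open('a');
            }

            $this->streamPromise->then(
                function (WritableStreamInterface $stream) use ($msg) {
                    // write() instead of end(), so the stream stays open
                    // for every future log line.
                    $stream->write("{$msg}\n");
                },
                function (\Exception $e) use ($msg) {
                    echo "error in log: {$e->getMessage()} while writing: {$msg}\n";
                }
            );
        }
    }

Because a promise resolves only once and every then() receives the same value, all callers end up writing to the same underlying stream, which avoids the "file already open" errors.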

WyriHaximus commented 4 years ago

Have you considered sharing the $stream stream and only opening the file once, @Saso? Race conditions are kind of implied when dealing with non-blocking systems, as you are never guaranteed in what order parts execute. So when you have multiple components dealing with one data source, whether that's Redis, a file, or PostgreSQL, use only one entry point for it and share connections as much as you can.

Saso commented 4 years ago

Thank you for your answers. This solution would solve the race between file openings, but not between the actual writes (please correct me if I am wrong). As I understand it, writes could still land in the file one over the other, corrupting it in the process. Since I am writing financial data into these files, that is not an option for me.

clue commented 4 years ago

This solution would solve the race between file openings, but not between the actual writes (please correct me if I am wrong).

If you open a file in append mode, it should not overwrite the existing contents but rather append to them. Depending on your system, there are certain guarantees attached to this behavior, so it may or may not be sufficient for your particular use case: https://stackoverflow.com/questions/1154446/is-file-append-atomic-in-unix

On top of this, you can also use similar logic to serialize your writes (i.e. bring them into a defined order) and make sure all writes happen on a single file instance.
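
One way to sketch that serialization, assuming the single memoized open('a') promise from the earlier example (SerializedWriter is an illustrative name):

    use React\Promise\PromiseInterface;
    use React\Stream\WritableStreamInterface;

    final class SerializedWriter
    {
        /** @var PromiseInterface promise chain that fixes the write order */
        private $queue;

        public function __construct(PromiseInterface $streamPromise)
        {
            // Seed the chain with the single shared open('a') promise.
            $this->queue = $streamPromise;
        }

        public function write($line)
        {
            // Chain every write onto the previous one: lines reach the
            // stream in exactly the order write() was called.
            $this->queue = $this->queue->then(
                function (WritableStreamInterface $stream) use ($line) {
                    $stream->write($line . "\n");
                    return $stream; // hand the stream to the next link
                }
            );
        }
    }

On a single React stream, write() already buffers data in call order, so the explicit chain mainly matters when each step is itself asynchronous (e.g. a write that has to wait for a previous flush or reopen).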

Since I am writing financial data into these files, that is not an option for me.

This again depends on your use case, but if you're writing financial data and need better guarantees, direct filesystem access may not be the best choice. It's fairly common to go for a more transactional system in this case (different databases provide different atomic consistency guarantees).

I believe this has been answered, so I'm closing this for now. Please come back with more details if this problem persists and we can reopen this :+1: