multitheftauto / mtasa-blue

Multi Theft Auto is a game engine that incorporates an extendable network play element into a proprietary commercial single-player game.
https://multitheftauto.com
GNU General Public License v3.0

Async File IO #1815

Open LosFaul opened 3 years ago

LosFaul commented 3 years ago

Is your feature request related to a problem? Please describe.

It always bothered me that file operations can block the main thread and aren't async. It would be great if file functions could be extended with async options.

Describe the solution you'd like

Async options for reading and writing:

fileRead([callback[, callbackArgsTbl], ] fileHandle, ...)
fileWrite([callback[, callbackArgsTbl], ] fileHandle, ...)

function onReadCallback(fileHandle, strRead, arg1, arg2, ...)
    -- runs once the read has finished; strRead is the data that was read
end

fileRead(onReadCallback, { arg1, arg2, ... }, fileHandle, ...)

Describe alternatives you've considered

Using server modules is an option, but it would be great to have this as a built-in solution in MTA.

Pirulax commented 3 years ago

Let's do it! I'm in 👍 Just remind me. I won't be adding that callback args thing, though; I feel like you can always make a wrapper for yourself if you really want to. For setTimer it kind of makes sense, but here it just doesn't. You can do this instead:

function onReadCallback(strRead, arg1, arg2, ...)
    -- handle the data here
end

-- the closure captures arg1, arg2, ... so no args table is needed
fileRead(function(strRead) onReadCallback(strRead, arg1, arg2, ...) end, fileHandle, ...)

Maybe not as neat, but I think you should prefer this one anyway:

fileRead(
    function(strRead)
        -- onReadCallback body here
    end,
    fileHandle, ...)

(Allocating a Lua table itself isn't horribly slow, but copying it into our specialized CLuaArgument crap is, especially if you have a lot of nested tables.)
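The "make a wrapper for yourself" idea can be sketched in plain Lua. This is only an illustration: `bindArgs` is a hypothetical helper, and `fakeAsyncRead` is a stand-in for the proposed async `fileRead`, which does not exist yet. The bound arguments live in a Lua-side table captured by the closure, so they should never need to be copied through the engine's argument conversion.

```lua
local unpack = table.unpack or unpack -- works on Lua 5.1 and on 5.2+

-- Hypothetical helper: returns a closure that calls `callback` with the
-- caller's arguments first, followed by the arguments bound here.
local function bindArgs(callback, ...)
    local boundCount = select("#", ...)
    local bound = { ... }
    return function(...)
        local callCount = select("#", ...)
        local args = { ... }
        for i = 1, boundCount do
            args[callCount + i] = bound[i]
        end
        return callback(unpack(args, 1, callCount + boundCount))
    end
end

-- Stand-in for the proposed async fileRead (assumption, not a real MTA API):
-- it "reads" `count` bytes and hands them straight to the callback.
local function fakeAsyncRead(callback, fileHandle, count)
    callback(("x"):rep(count))
end

local function onReadCallback(strRead, arg1, arg2)
    print(strRead, arg1, arg2)
end

fakeAsyncRead(bindArgs(onReadCallback, "player1", 42), nil, 3)
```

If the extra closure call per read turns out to matter, writing the closure inline as shown above avoids the helper entirely.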

TheNormalnij commented 3 years ago

We can use different names for the async functions, fileReadA for example.

Pirulax commented 3 years ago

Or perhaps just use AsyncTaskManager with the regular functions; it's easier than writing two separate implementations (for Linux and Windows).

Pirulax commented 3 years ago

Okay, so I've looked into it. It raises a lot of questions:

4O4 commented 3 years ago

Very good questions. I'll try to share my thoughts.

Keep in mind that the underlying file can get closed any time due to element deletion / garbage collection, should we make the File* ref counted?

Yes, it should be guaranteed that if the callback is executed, then the file handle is valid for the whole duration of that callback execution. If for some reason anything bad happens with that handle and we can't do anything about that (don't know if this is a valid concern), a Lua-level error should be raised to stop further processing of that callback with a meaningful panic / error message.
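To make the guarantee concrete, here is a small Lua model of the ref-counting idea. It is a sketch of the intended semantics only, not engine code: `openFile`, `retain`, `release` and `beginRead` are hypothetical names, and the real implementation would live on the C++ side around the File* object.

```lua
-- Hypothetical model of a ref-counted file handle (names are illustrative).
local File = {}
File.__index = File

local function openFile(path)
    -- the script that opened the file holds the first reference
    return setmetatable({ path = path, refs = 1, closed = false }, File)
end

function File:retain()
    assert(not self.closed, "file is already closed")
    self.refs = self.refs + 1
end

function File:release()
    self.refs = self.refs - 1
    if self.refs == 0 then
        self.closed = true -- the real handle would be closed here
    end
end

-- Start a pretend async read: the pending operation keeps its own reference.
-- Returns a function that "completes" the read and runs the callback.
local function beginRead(file, callback)
    file:retain()
    return function()
        local ok, err = pcall(callback, file)
        file:release() -- dropped even if the callback raised an error
        if not ok then error(err, 0) end
    end
end

local file = openFile("data.txt")
local complete = beginRead(file, function(f)
    assert(not f.closed) -- still valid for the whole callback
end)
file:release() -- script-side close or element deletion
complete()     -- callback runs, then the last reference goes away
```

In this model a close request from the script only drops the script's reference, so the handle stays usable until every pending operation has finished.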

What should the output be if the user writes to the file while we are reading it in a secondary thread?

I don't know how the async scheduler works now, but I think it should be guaranteed that the worker thread starts processing the async io request in the next tick or later, not immediately in the current one. So this would be a "deferred" call. This probably solves one class of read/write clashes between sync and async calls on the same files.

As for clashes between multiple async calls on the same file, I think all of these operations should be queued and executed in the order they were requested. So each file would need to have its own queue once any async IO operations are requested for it. dbQuery and fetchRemote calls are handled in a similar way with queues (but custom names for these queues can be specified, and in the case of async file reads/writes custom queue names probably shouldn't be allowed, as it would be just an internal mutex-like mechanism).
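The deferred, per-file ordering described above can be modelled in a few lines of Lua. This is a single-threaded sketch under stated assumptions: `requestAsync` and `tick` are hypothetical names, and a real implementation would hand each file's queue to a worker thread rather than run it inline.

```lua
-- Hypothetical scheduler model (single-threaded, names are illustrative).
local queues = {} -- file handle -> FIFO list of pending operations
local order = {}  -- files with pending work, in first-request order

-- Queue an operation for a file; nothing runs in the current tick.
local function requestAsync(file, operation)
    local queue = queues[file]
    if not queue then
        queue = {}
        queues[file] = queue
        order[#order + 1] = file
    end
    queue[#queue + 1] = operation
end

-- Called once per tick: runs everything that was requested before this tick.
-- Operations requested while it runs land in fresh queues for the next tick.
local function tick()
    local files, pending = order, queues
    order, queues = {}, {}
    for _, file in ipairs(files) do
        for _, operation in ipairs(pending[file]) do
            operation()
        end
    end
end

local log = {}
local fileA, fileB = {}, {}
requestAsync(fileA, function() log[#log + 1] = "A:write" end)
requestAsync(fileB, function() log[#log + 1] = "B:read" end)
requestAsync(fileA, function() log[#log + 1] = "A:read" end)
tick()
print(table.concat(log, ",")) -- A:write,A:read,B:read
```

Operations on the same file keep their request order, while different files are independent, which is what would let a real scheduler process them in parallel.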

Waiting for some more comments and ideas if anyone has any. Also, for more inspiration and solutions to potential problems, we could look into the Node.js implementation of async IO operations. They did it very well.

4O4 commented 3 years ago

After giving it some more thought, my suggestion is to make more file APIs asynchronous: not only the read / write functions, but also open, close, delete and others. Basically all the functions that do any actual work on the OS level, so probably everything except fileGetPath, fileGetPos/fileSetPos and fileGetSize (but that's just a blind guess, I haven't looked at the code). Combined with #1857 it would make a really good, truly asynchronous filesystem API comparable to the fs Promises API in Node.js: https://nodejs.org/api/fs.html#fs_fs_promises_api
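For a feel of what a promise-like style could look like from a script, here is a coroutine-based sketch in plain Lua. It makes no assumption about what #1857 actually provides: `await`, `fakeAsyncRead` and `runPending` are hypothetical, and the fake read simply defers its callback until `runPending` is called, standing in for a later tick.

```lua
-- Stand-in for a callback-style async API (assumption, not a real MTA API).
local pending = {}

local function fakeAsyncRead(callback, data)
    pending[#pending + 1] = function() callback(data) end
end

-- Runs the callbacks queued so far; new requests wait for the next call.
local function runPending()
    local jobs = pending
    pending = {}
    for _, job in ipairs(jobs) do
        job()
    end
end

-- Hypothetical helper: suspends the current coroutine until the callback
-- fires, then returns the callback's arguments as ordinary return values.
-- It relies on the callback firing later, never inside asyncFn itself.
local function await(asyncFn, ...)
    local co, isMain = coroutine.running()
    assert(co and not isMain, "await must be called inside a coroutine")
    asyncFn(function(...) coroutine.resume(co, ...) end, ...)
    return coroutine.yield()
end

local result
local task = coroutine.wrap(function()
    local first = await(fakeAsyncRead, "hello")
    local second = await(fakeAsyncRead, " world")
    result = first .. second
end)

task()       -- runs until the first await
runPending() -- delivers "hello", the task then requests the second read
runPending() -- delivers " world", the task finishes
print(result)
```

The task body reads top to bottom like synchronous code, yet nothing blocks while the reads are outstanding.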

Semi-related off-topic about DB:

In a similar way it might be a good idea to have an async version of `dbConnect`, but this would probably need a separate issue for discussion. The reasoning is that, as someone pointed out recently on the #development channel, connecting to a database can take a considerable amount of time when the database is remote. On fully local deployments this is a non-issue, as the latency is near zero. I believe it's not uncommon for gamemodes that use databases extensively to keep multiple DB connections for better parallelism, and to create new connections on demand when using a connection pool. As I said, this needs its own issue, but for now I want to see how the discussion on the current topic goes.