nesrak1 / AssetsTools.NET

Read and write unity assets/bundle files, based on https://github.com/SeriousCache/UABE
MIT License

[Suggestion] Thread safety #124

Open Albeoris opened 1 year ago

Albeoris commented 1 year ago

Currently, classes such as AssetsManager and AssetsFileInstance are not thread safe. The most interesting methods are LoadAssetsFile, GetBaseField and GetExtAsset.

Because of the non-thread-safe caches inside, work on a large number of asset files cannot be parallelized even on a fast SSD. A thread-safe wrapper over AssetsManager provides no benefit because of the blocking IO operations inside it.
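For illustration, a minimal sketch of why such a wrapper does not help. LockedAssetsManager and the delegate standing in for LoadAssetsFile are hypothetical, not the real at.net API; the point is only that the slow disk read happens inside the lock, so all callers queue up:

```csharp
using System;

// Illustration only: a naive lock wrapper serializes every call,
// including the blocking IO, so parallel callers gain nothing.
public sealed class LockedAssetsManager
{
    private readonly object _gate = new();
    private readonly Func<string, byte[]> _loadFromDisk; // stands in for LoadAssetsFile

    public LockedAssetsManager(Func<string, byte[]> loadFromDisk)
        => _loadFromDisk = loadFromDisk;

    public byte[] LoadAssetsFile(string path)
    {
        lock (_gate) // blocking IO runs inside the lock: threads wait in line
        {
            return _loadFromDisk(path);
        }
    }
}
```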

Note: if anyone wants to implement this suggestion, it is probably best to write a thread-safe AsyncAssetManager from scratch, since it is very easy to get things wrong due to the large number of collections and non-thread-safe operations inside AssetsManager.

nesrak1 commented 1 year ago

#86, #87. It's highly requested but I still don't know the best way to implement this. I will need it at some point for UABEANext which is currently manually handling multithreading.

Albeoris commented 1 year ago

I still don't know the best way to implement this

Beware, Long Read. I don’t know how useful this will be, but maybe it will give you some ideas.

It seems to me that command queues for accessing shared resources are the best fit here.

For example, to access the file system we could write an interface like this:

internal interface IFileAccessor
{
    void Read(Int64 offset, Memory<Byte> buffer);
}

And a manager to store the list of accessors:

internal sealed class FileManager
{
    private readonly ConcurrentDictionary<String, IFileAccessor> _files = new(StringComparer.InvariantCultureIgnoreCase);

    public IFileAccessor Access(String filePath)
    {
        return _files.GetOrAdd(Path.GetFullPath(filePath), path => new FileAccessQueue(path));
    }
}

When we need to read part of a file, we can do it via IFileAccessor:

Byte[] Read(String assetPath, Int64 offset, Int32 size)
{
    Byte[] buff = new Byte[size];
    IFileAccessor accessor = this.FileManager.Access(assetPath);
    accessor.Read(offset, buff); // blocking magic here
    return buff;
}

And a very simple implementation of the command queue:

internal class FileAccessQueue : IFileAccessor
{
    private readonly FileStream _fileStream;

    public FileAccessQueue(string path)
    {
        _fileStream = File.OpenRead(path);
    }

    public void Read(Int64 offset, Memory<Byte> buffer)
    {
        lock (_fileStream) // we can replace it with a ConcurrentQueue + a background thread that processes it, to avoid locks
        {
            if (_fileStream.Position != offset)
                _fileStream.Seek(offset, SeekOrigin.Begin);

            // Stream.Read may return fewer bytes than requested, so read in a loop.
            Span<Byte> remaining = buffer.Span;
            while (!remaining.IsEmpty)
            {
                Int32 read = _fileStream.Read(remaining);
                if (read == 0)
                    throw new EndOfStreamException();
                remaining = remaining[read..];
            }
        }
    }
}

The main idea is to isolate the data access that requires synchronization from the business logic, minimizing contention for the same data between threads. Also, avoid passing callbacks into such methods, since that would increase the time locks are held.

In some cases, such as type resolving, it makes sense to delegate the work to a single dedicated thread. This ensures that different threads processing different asset files do not start analyzing the same types from different ends and end up in a deadlock:

internal sealed class TypeResolver
{
    private readonly ConcurrentDictionary<Object, Object> _resolved = new();
    private readonly ConcurrentQueue<ResolveTypeCommand> _queue = new();
    private readonly AutoResetEvent _queued = new(initialState: false);
    private readonly Thread _dispatchingThread;

    public TypeResolver()
    {
        _dispatchingThread = new Thread(DispatchQueue) { IsBackground = true };
        _dispatchingThread.Start();
    }

    private void DispatchQueue()
    {
        while (true)
        {
            _queued.WaitOne();

            while (_queue.TryDequeue(out ResolveTypeCommand? command))
                command.ResolveNoThrow();
        }
    }

    public Object ResolveTypeInSingleThread(Object input)
    {
        if (_resolved.TryGetValue(input, out Object? typeDescriptor))
            return typeDescriptor;

        ResolveTypeCommand command = new(input);
        _queue.Enqueue(command);
        _queued.Set();

        command.Completed.WaitOne();
        if (command.Error is null)
            return command.Result;

        ExceptionDispatchInfo.Throw(command.Error); // rethrow, keeping the original stack trace
        throw command.Error; // unreachable; satisfies the compiler's definite-return analysis
    }

    private sealed class ResolveTypeCommand
    {
        private readonly Object _input;

        public ResolveTypeCommand(Object input)
        {
            _input = input;
        }

        public void ResolveNoThrow()
        {
            try
            {
                // do something
                Result = ...;
            }
            catch (Exception ex)
            {
                Error = ex;
            }
            finally
            {
                Completed.Set();
            }
        }

        public ManualResetEvent Completed { get; } = new ManualResetEvent(initialState: false);
        public Object Result { get; private set; }
        public Exception Error { get; private set; }
    }
}

This way, the only owner of the lock on type resolving is this background thread, which guarantees we do not run into deadlocks. At the same time, this thread can still read data from files, since IFileAccessor does not take callbacks and its lock hold time is extremely short.


And in the final stage, with asynchronous code and a bunch of wrappers, it might look something like this:

public async Task ReadObjectAsync(AssetFile assetFile, ObjectInfo objectInfo, CancellationToken ct)
{
    var fileAccessor = FileManager.Access(assetFile);

    var header = await fileAccessor.ReadHeaderAsync(objectInfo, ct);
    var type = await TypeResolver.ResolveAsync(header.Type, ct);

    foreach (var field in type.Fields)
    {
        // ...
    }
}
nesrak1 commented 12 months ago

After reading this over a bit, I think I now understand your suggestion. This sounds somewhat similar to what I was kind of suggesting but with async rather than just locking lists and dictionaries and stuff.

I think the thing that takes the biggest chunk of time (ignoring disk speed which obviously we can't fix) is asset deserialization. The deserialization code reads individual ints and floats and strings and whatnot from the stream which is very big bad. My first suggestion was that instead of having the deserialization also read the file at the same time, we should just read the entire asset into memory and then we would have the full asset in memory that can be deserialized at the same time as other assets are being deserialized without conflicting. We could do the same with reading the metadata (file listing, typetree, etc.) for AssetsFiles, but usually these read pretty fast, so I don't know if the header needs it.
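As a rough sketch of the "read the entire asset into memory first" idea (all names here are illustrative, not the actual at.net API): only the raw byte copy out of the shared stream is serialized, and deserialization then runs on a private buffer with no contention between threads.

```csharp
using System;
using System.IO;

// Hypothetical helper: copy one asset's raw bytes under a short lock,
// then let the caller deserialize from the private buffer concurrently.
public static class AssetSnapshot
{
    public static byte[] ReadAssetBytes(FileStream stream, long offset, int size)
    {
        var buffer = new byte[size];
        lock (stream) // only the raw copy is serialized; parsing is not
        {
            stream.Seek(offset, SeekOrigin.Begin);

            // Stream.Read may return fewer bytes than requested, so loop.
            int total = 0;
            while (total < size)
            {
                int n = stream.Read(buffer, total, size - total);
                if (n == 0)
                    throw new EndOfStreamException();
                total += n;
            }
        }
        return buffer;
    }
}
```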

Besides dll loading for the mono.cecil temp generator (which could just be locked), I think generating AssetTypeTemplateFields is quick and thread safe already. The tpk doesn't change once it's loaded and neither does the type tree.

Arguably the worst part will be the caching parts which I think will just have to be a lot of locking (unless you know of a better way.)
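For the caching parts, one common alternative to coarse locking is ConcurrentDictionary combined with Lazy&lt;T&gt;. GetOrAdd alone can invoke the value factory more than once under a race; wrapping the value in a Lazy guarantees the factory runs exactly once per key. A generic sketch (the class name is made up):

```csharp
using System;
using System.Collections.Concurrent;

// ConcurrentDictionary + Lazy<T>: lock-free lookups, and each key's value
// is computed exactly once even when many threads race on the same key.
public sealed class TemplateFieldCache<TKey, TValue> where TKey : notnull
{
    private readonly ConcurrentDictionary<TKey, Lazy<TValue>> _cache = new();

    public TValue GetOrCreate(TKey key, Func<TKey, TValue> factory)
    {
        // Lazy's default ExecutionAndPublication mode does the per-key locking.
        return _cache.GetOrAdd(key,
            k => new Lazy<TValue>(() => factory(k))).Value;
    }
}
```

A thread then blocks only on the Lazy for the key it actually needs, instead of on one global lock over the whole cache.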

I think the async version of AssetsManager could be a good option, but I still think we can make the original AssetsManager thread safe as well.

But, IANAM (I am not a multithreading expert)

Albeoris commented 11 months ago

we should just read the entire asset into memory

Of course, if it's small enough to fit in memory, that would be great. As far as I understand, the main problem is that reading dependencies can load the entire asset tree, and if we are talking about caching the entire .asset file, then this can become a problem for users with < 128 GB of RAM.

If you are talking about loading only the piece inside the asset that we are currently reading, then this does not really matter IF you read the file sequentially and do not use stream.Seek. When reading ints, FileStream reads not 4 bytes but a whole buffer (4096 bytes by default), so sequentially reading a file fragment 4 bytes at a time is as fast as reading it 4096 bytes at a time. But every call to stream.Seek resets this buffer, so you should avoid stream.Seek and try to read sequentially.
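Not a benchmark, just a small demonstration of the access pattern being described: after the first small read fills FileStream's internal buffer, the following sequential 4-byte reads are served from that buffer rather than from the OS.

```csharp
using System;
using System.IO;

// Sequential small reads: FileStream fills its internal buffer
// (4096 bytes by default) on the first read, so each subsequent
// ReadInt32 is a cheap in-memory copy, not a syscall. A Seek would
// discard that buffer and force a fresh OS read.
static class BufferedReadDemo
{
    public static int SumSequential(string path, int count)
    {
        using var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
            FileShare.Read, bufferSize: 4096); // the default buffer size
        using var reader = new BinaryReader(fs);

        int sum = 0;
        for (int i = 0; i < count; i++)
            sum += reader.ReadInt32(); // served from the buffer after the first fill
        return sum;
    }
}
```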

nesrak1 commented 11 months ago

Of course, if it's small enough to fit in memory, that would be great. As far as I understand, the main problem is that reading dependencies can load the entire asset tree, and if we are talking about caching the entire .asset file, then this can become a problem for users with < 128 GB of RAM.

So in this case I'm talking about each asset, not an .assets file (I'll just call an .assets file a serialized file from now on to prevent confusion.) "Loading" a serialized file in at.net means loading its header and metadata (including the file table) but not loading any assets. Assets themselves are usually pretty small, typically hundreds of bytes. Although yes, occasionally we have a 50 MB shader or mesh, so we might have to do partial asset loading in those cases.

Loading a serialized file could take a significant amount of time for large files, but I'd need to add an option to skip the serialized file's Read() function from reading the metadata and let the user (or AssetsManager) call Metadata.Read() themselves.

then this does not really matter IF you read the file sequentially and do not use stream.Seek

This isn't possible since we can load two assets at one time and both would require us to jump to different start locations.
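For what it's worth, one way around the shared stream position is the positional-read API added in .NET 6, System.IO.RandomAccess: it takes an explicit file offset and never touches a shared Position, so two threads can read different assets from the same SafeFileHandle concurrently without seeking. A sketch, assuming .NET 6+:

```csharp
using System;
using System.IO;
using Microsoft.Win32.SafeHandles;

// RandomAccess.Read takes an explicit offset, so concurrent reads on one
// handle don't fight over a shared stream position.
static class PositionalReads
{
    public static byte[] ReadAt(SafeFileHandle handle, long offset, int size)
    {
        var buffer = new byte[size];
        int total = 0;
        while (total < size) // RandomAccess.Read may return fewer bytes than asked
        {
            int n = RandomAccess.Read(handle, buffer.AsSpan(total), offset + total);
            if (n == 0)
                throw new EndOfStreamException();
            total += n;
        }
        return buffer;
    }
}
```

The tradeoff is that positional reads bypass FileStream's read-ahead buffer, so this suits random per-asset jumps better than long sequential scans.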