Mitkee closed 7 months ago
As a workaround:
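using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Parquet.Serialization;

public static class ParquetCachePopulator
{
    // Round-trips one default instance of T so that the per-type
    // serializer state is built and cached before real traffic arrives.
    public static async Task PopulateParquetTypeCache<T>() where T : new()
    {
        var row = new List<T> { new T() };
        using var stream = new MemoryStream();
        await ParquetSerializer.SerializeAsync(row, stream);  // exercise the write path
        stream.Position = 0;                                  // rewind before reading back
        await ParquetSerializer.DeserializeAsync<T>(stream);  // exercise the read path
    }
}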
...and then running
// Pre-load Parquet types, one at a time:
await ParquetCachePopulator.PopulateParquetTypeCache<StoredContract>();
await ParquetCachePopulator.PopulateParquetTypeCache<StoredPublicTrade>();
await ParquetCachePopulator.PopulateParquetTypeCache<FullOrderEventParquetModel>();
await ParquetCachePopulator.PopulateParquetTypeCache<OrderEventParquetModelForFtpExport>();
await ParquetCachePopulator.PopulateParquetTypeCache<PrivateTradeEventToOrderMap>();
await ParquetCachePopulator.PopulateParquetTypeCache<TradeEventParquetModel>();
pre-startup might very well do the trick. I'll give that a spin and report back in a few days on whether it stays stable. I'm not against pre-loading known types; could the library even make that mandatory to avoid the issue? 🤔 Alternatively, fix the concurrency problem in the type cache itself, which will most likely come with a minor performance hit.
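If the list of types grows, the repeated calls above could be driven from a single list. The following is only a sketch: `WarmUp`, `PopulateAsync` and `PopulateAllAsync` are hypothetical names, and `PopulateAsync<T>` is a stand-in for `ParquetCachePopulator.PopulateParquetTypeCache<T>()` so that the example compiles without the Parquet package. The important detail is that each call is awaited before the next one starts, so the warm-up itself never populates the cache in parallel.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Threading.Tasks;

public static class WarmUp
{
    // Records which types were warmed; for demonstration only.
    public static readonly List<Type> Warmed = new List<Type>();

    // Hypothetical stand-in for ParquetCachePopulator.PopulateParquetTypeCache<T>().
    public static Task PopulateAsync<T>() where T : new()
    {
        Warmed.Add(typeof(T));
        return Task.CompletedTask;
    }

    // Calls PopulateAsync<T> for every given type, strictly one after another.
    public static async Task PopulateAllAsync(params Type[] types)
    {
        MethodInfo open = typeof(WarmUp).GetMethod(nameof(PopulateAsync));
        if (open == null) throw new InvalidOperationException("PopulateAsync not found");

        foreach (Type type in types)
        {
            // Close the generic method over the runtime type and invoke it.
            var task = (Task)open.MakeGenericMethod(type).Invoke(null, null);
            await task; // sequential on purpose: no concurrent cache population
        }
    }
}
```

With the real populator swapped in, startup would then be a single call such as `await WarmUp.PopulateAllAsync(typeof(StoredContract), typeof(StoredPublicTrade));`. Every type passed in needs a public parameterless constructor, otherwise `MakeGenericMethod` throws.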
Library Version
4.16.3
OS
Windows
OS Architecture
64 bit
How to reproduce?
Unfortunately there are no clear steps to reproduce, but I have seen this happen occasionally in production.
The _typeToAssembler cache is not a concurrent collection, so the failure occurs when multiple operations that populate this cache run in parallel.
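For illustration, here is a minimal sketch of one common way to make a per-type cache safe under parallel population. It is not the library's actual code: `AssemblerCache`, `GetOrCreate` and `BuildCount` are hypothetical names, and the cached `object` stands in for whatever the real cache stores. It combines `ConcurrentDictionary` with `Lazy<T>` so that the expensive build runs at most once per type, even when many threads ask for the same type at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a per-type assembler cache.
public static class AssemblerCache
{
    // GetOrAdd may create several Lazy wrappers under contention, but only one
    // is stored, and only the stored wrapper's factory is ever executed.
    private static readonly ConcurrentDictionary<Type, Lazy<object>> Cache =
        new ConcurrentDictionary<Type, Lazy<object>>();

    // Counts how often the expensive build actually ran; for demonstration only.
    public static int BuildCount;

    public static object GetOrCreate(Type type)
    {
        return Cache.GetOrAdd(type, t => new Lazy<object>(() =>
        {
            Interlocked.Increment(ref BuildCount);
            return new object(); // placeholder for the expensive per-type build
        })).Value;
    }
}

public static class Program
{
    public static void Main()
    {
        // Request the same key from many threads at once.
        Parallel.For(0, 1000, _ => AssemblerCache.GetOrCreate(typeof(string)));
        Console.WriteLine(AssemblerCache.BuildCount); // prints 1
    }
}
```

The cost of this approach is a little extra work on every lookup compared with a plain `Dictionary`, which is the kind of minor performance hit mentioned in the workaround comment.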
Failing test