janusmiracle / silver

A command-line tool and library for reading extensive information from a wide range of file formats.

RF64 Performance #5

Open janusmiracle opened 6 days ago

janusmiracle commented 6 days ago

The library has a very noticeable delay when dealing with RF64 files.

janusmiracle commented 4 days ago

The delay came from the very large chunks in RF64 files, particularly the data chunk. The previous implementation called stream.read(chunk_size) for every chunk, which read the entire payload into memory and was the main cost. The fix was straightforward: since the chunk_data for the data chunk is not needed, I now return an empty string for it instead of reading it. The JUNK, FLLR, and PAD chunk data are skipped as well. I also added an option to ignore ProTools chunks, as I haven't found any documentation on them. This greatly improves performance, especially for RF64 files.
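For illustration, here is a minimal sketch of the skipping approach described above. It is not the library's actual code: `iter_chunks` and `SKIP_IDS` are hypothetical names, the `b"PAD "` identifier (with a trailing space) is an assumption, and the sketch uses `seek` to step over skipped payloads. It also assumes plain 32-bit chunk sizes; in real RF64 files an oversized chunk stores 0xFFFFFFFF in that field and its true size lives in the ds64 chunk, which is not handled here.

```python
import io
import struct

# Chunk identifiers whose payload is skipped rather than read.
# Assumed set, based on the chunks named in this thread.
SKIP_IDS = {b"data", b"JUNK", b"FLLR", b"PAD "}


def iter_chunks(stream, skip_ids=SKIP_IDS):
    """Yield (chunk_id, chunk_size, chunk_data) from a RIFF-style stream.

    The stream must be positioned at the first chunk header, i.e. just
    after the 12-byte RIFF/WAVE header.
    """
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break  # end of file or truncated header
        chunk_id, chunk_size = struct.unpack("<4sI", header)
        # Chunks are word-aligned: an odd size is followed by one pad byte.
        padded_size = chunk_size + (chunk_size & 1)
        if chunk_id in skip_ids:
            # Move the file position instead of copying the payload.
            stream.seek(padded_size, io.SEEK_CUR)
            chunk_data = b""
        else:
            chunk_data = stream.read(chunk_size)
            # Step over the pad byte, if there is one.
            stream.seek(padded_size - chunk_size, io.SEEK_CUR)
        yield chunk_id, chunk_size, chunk_data
```

The chunk identifier and size are still reported for skipped chunks, so the chunk listing stays complete while the cost of reading a multi-gigabyte payload is avoided.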

When benchmarked on a 6.26GB ProTools RF64 file, the results were as follows:

['bext', 'fmt ', 'minf', 'elm1', 'data', 'FLLR', 'regn', 'umid', 'DGDA']
0.101348876953125 seconds with ProTools chunk data.
['bext', 'fmt ', 'minf', 'elm1', 'data', 'FLLR', 'regn', 'umid', 'DGDA']
0.002869129180908203 seconds with ProTools chunk data ignored.

For comparison, the previous benchmark for the same file without these performance enhancements was:

['bext', 'fmt ', 'minf', 'elm1', 'data', 'FLLR', 'regn', 'umid', 'DGDA']
2.1404500007629395 seconds.

Instead of a boolean ignore parameter, it might be more effective to use an integer setting (or a separate option) that ignores all chunk data except for chunks deemed important (e.g. fmt). Alternatively, users could provide a list of chunks to ignore or retain.
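One way the list-based alternative could look, as a rough sketch only: `should_read`, `ignore`, and `retain` are hypothetical names, not part of the library, and the precedence rule (a retain list overrides an ignore list) is an assumption.

```python
def should_read(chunk_id, ignore=None, retain=None):
    """Return True if the payload of chunk_id should be read.

    retain, when given, acts as an allow-list and takes precedence.
    Otherwise ignore acts as a deny-list. With neither, everything is read.
    """
    if retain is not None:
        return chunk_id in retain
    if ignore is not None:
        return chunk_id not in ignore
    return True


# Keep only the format chunk and skip every other payload.
print(should_read("fmt ", retain=["fmt "]))
print(should_read("data", retain=["fmt "]))
# Skip the filler chunks but read the rest.
print(should_read("bext", ignore=["JUNK", "FLLR"]))
```

A single list option like this would cover both the ProTools case and the large-payload case without adding a separate flag for each.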

janusmiracle commented 4 days ago

Add an option to output all chunk_data, including that of data, JUNK, FLLR, etc. It should default to False.